Hello all, this is my first post and I feel a little bad about asking for help (instead of offering help), but I've reached writer's block.
If I formatted this post incorrectly, I apologise...

Okay here's what I need to get/have/do...

I need to write a C++ program that will read in a text file and determine the frequently used words (words that account for at least 1% of all the words in the file). So basically, it must read the file and print out the frequently used words with their number of occurrences.

I've been told to use two arrays (one for each word read in and the other for keeping track of how many times each word appears).
Here's what I think/know should appear.

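for (pos = 0; pos < len; pos++)
    if (word[pos] >= 65 && word[pos] <= 90)   // 65..90 is 'A'..'Z' in ASCII
        word[pos] += 32;                      // "word[pos] + 32;" would discard the result


if ((word[len-1] < 97 || word[len-1] > 122) && (word[len-1] < 48 || word[len-1] > 57))
    word[len-1] = '\0';                       // last character is neither a-z nor 0-9


for (p = 0; p <= wordarray c+; p++)
    if (A2[p] >= one percent)
        cout << setw(4) << A2[p] << " " << A1[p] << endl;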

The first bit is for changing all of the letters to lower case,
the second bit is for getting rid of a punctuation mark at the end of each read-in word,
and the third is for printing out the final result.

As you can see, those parts aren't perfect (the third one needs correcting: the "wordarray c+" and "one percent" parts are only placeholders, and I'm not sure how I'd write those).

Also, A1 and A2 should be the arrays.

I'm not sure how I'd make the loops to read in all of the words to each array nor do I even know how to create/declare the arrays.

Now that you (hopefully) have an idea of what I'm going to need to do, here are a few areas where I'm unsure...

*I'm not sure how I'd set up the array to read in and assign each word to a slot (especially when I don't know exactly how many words are in the text file).

*How to make sure that, as a word is read in multiple times, that count is recorded in the second array, while also keeping track of the total number of words found in the document...

I'll post more questions as they arise. Please note that I am not wanting someone to simply write the code that would be the answer for all my problems, but rather give an example and/or explain things to me or how I should approach things...

many many thanks,
66

I ended up finding a way to do this without using arrays. I got the program to run and do what was needed, but I'd still like to see your input on how I'd approach this using two arrays...

thanks again

If you have learned about linked lists, then using a linked list to hold the words and their counts would be a better fit, since it grows as words are added. But if you haven't learned about them yet, you can just use a simple array with a generous fixed size -- say, an array big enough to hold 255 words. It doesn't matter if the array is too big, but it cannot be too small, so if the file has more distinct words than that you will have to increase the array size -- maybe up to 512.

char *array[255] = {0};
int counts[255] = {0};

The above allocates an array of 255 pointers and a matching array of 255 counts, and initializes them all to 0. Now, when the program reads a word from the file, search the array to see if the word is already there. If it is, increment the matching entry in counts. If it is not, add it in the next available slot and set its count to 1. When you add the word, don't simply copy the pointer; allocate memory with either malloc() (C programs) or new (C++ programs) and copy the string into it.
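
To make those steps concrete, here is a minimal sketch of the search-then-add idea with two parallel arrays. Treat it as one possible shape rather than the required solution: the names MAX_WORDS, words (playing the role of array above), used, total, add_word, count_words and print_frequent are all made up for illustration, and it assumes each word has already been lower-cased and stripped of punctuation.

```cpp
#include <cstring>
#include <iomanip>
#include <iostream>
#include <string>

const int MAX_WORDS = 255;       // capacity suggested above; raise it if the file needs more

char *words[MAX_WORDS] = {0};    // one heap-allocated copy of each distinct word
int counts[MAX_WORDS] = {0};     // counts[i] is how many times words[i] was seen
int used = 0;                    // number of slots filled so far
int total = 0;                   // total number of words read, repeats included

// Record one occurrence of word. Returns false if the word is new
// and there is no free slot left for it.
bool add_word(const char *word)
{
    total++;
    for (int i = 0; i < used; i++) {
        if (std::strcmp(words[i], word) == 0) {
            counts[i]++;         // already stored: just bump its count
            return true;
        }
    }
    if (used == MAX_WORDS)
        return false;
    words[used] = new char[std::strlen(word) + 1];  // allocate, then copy the string
    std::strcpy(words[used], word);
    counts[used] = 1;
    used++;
    return true;
}

// Read whitespace-separated words from in and record each one.
void count_words(std::istream &in)
{
    std::string word;
    while (in >> word)
        add_word(word.c_str());
}

// Print every word that accounts for at least 1% of all words read.
void print_frequent()
{
    for (int i = 0; i < used; i++)
        if (counts[i] * 100 >= total)
            std::cout << std::setw(4) << counts[i] << " " << words[i] << std::endl;
}
```

With something like this in place, main would open the file with std::ifstream, pass the stream to count_words, and then call print_frequent. The test counts[i] * 100 >= total is just "at least 1%" written without floating point.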

Yet another solution is similar to the above, but makes the size of the array dynamic so that it grows each time a new word is added.

char **array = 0;
int array_size = 0;
int *counts = 0;

When you add a word, you also have to expand the arrays. array_size contains the current number of elements in the array.

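#include <cstdlib>   // realloc

char **array = 0;
int array_size = 0;
int *counts = 0;

// add a word: grow both arrays by one element
// (realloc takes a size in bytes, so multiply by the element size)
array_size++;
array = (char **)realloc(array, array_size * sizeof(char *));
counts = (int *)realloc(counts, array_size * sizeof(int));
counts[array_size - 1] = 0;   // realloc does not zero the new slot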
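
Wrapped up as a function, the growing step might look like the sketch below. The function name append_word is hypothetical, and words stands in for array from the fragment above. The sketch also checks the result of realloc before overwriting the old pointers, which the fragment above leaves out.

```cpp
#include <cstdlib>
#include <cstring>

char **words = 0;      // plays the role of "array" above
int *counts = 0;
int array_size = 0;    // current number of elements in both arrays

// Append a new word with a count of 1. Returns false if memory runs out,
// in which case array_size and the stored words are left as they were.
bool append_word(const char *word)
{
    char *copy = (char *)std::malloc(std::strlen(word) + 1);
    if (!copy)
        return false;
    std::strcpy(copy, word);

    // realloc takes a size in bytes, so multiply by the element size
    char **new_words = (char **)std::realloc(words, (array_size + 1) * sizeof(char *));
    if (!new_words) {
        std::free(copy);
        return false;
    }
    words = new_words;

    int *new_counts = (int *)std::realloc(counts, (array_size + 1) * sizeof(int));
    if (!new_counts) {
        std::free(copy);
        return false;
    }
    counts = new_counts;

    words[array_size] = copy;
    counts[array_size] = 1;
    array_size++;
    return true;
}
```

Growing by one element on every new word keeps the code short. Growing in larger steps would mean fewer realloc calls, at the cost of tracking capacity separately from array_size.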

Pick:

>I need to write a c++ program....I 've been told to use two arrays

The two arrays could be represented by two std::strings?
