Determining the number of unique words in a .txt file
>I tried a different Approach to achieve the same

It's not only a "different" approach: it's a wrong approach.

The C Standard (7.4.1.10):

>The standard white-space characters are the following: space (' '), form feed ('\f'), new-line ('\n'), carriage return ('\r'), horizontal tab ('\t'), and vertical tab ('\v'). In the "C" locale, isspace returns true only for the standard white-space characters.

Therefore the code above can't select the words in a "123four5six..." string. It also has obviously incorrect lines, for example:

```cpp
while (ptrFirst) { // probably, must be *ptrFirst
```

A more subtle defect of the code above is that the isXXX family of functions does not work for negative arguments. If a character in a text file has the bit pattern '1xxxxxxx' and the implementation's char type is signed, then the `*ptrIter` expression yields a negative integer and the result of `isalpha(*ptrIter)` is undefined. That's why the Uchar typedef was defined in my code.

In actual fact, it's an inaccurate implementation of the same (scanner-like) approach.

Apropos, if we have a text file with whitespace separators only, there is no need for scanner-like methods at all. The simplest code works fine:

```cpp
string word;
while (file >> word) {
    // process word
}
```
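As an illustration of what "process word" could look like for this thread's task, here is a minimal sketch that counts distinct whitespace-separated words. The function names and the use of `std::set` are assumptions made for the example; they are not part of the code discussed above, and the assignment's restrictions would rule this version out.

```cpp
#include <cstddef>
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Counts distinct whitespace-separated words read from `in`.
// Comparison is exact: "Word" and "word" count as two different words.
std::size_t countUniqueWords(std::istream& in)
{
    std::set<std::string> seen;   // a set stores each distinct word once
    std::string word;
    while (in >> word) {          // operator>> skips leading white space
        seen.insert(word);        // inserting a duplicate changes nothing
    }
    return seen.size();
}

// Hypothetical convenience wrapper for trying the function on a string.
std::size_t countUniqueWordsInText(const std::string& text)
{
    std::istringstream in(text);
    return countUniqueWords(in);
}
```

Passing an opened `std::ifstream` to `countUniqueWords` works the same way, since it is also a `std::istream`.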
Last edited by ArkM; Dec 4th, 2008 at 9:14 am.
Join Date: Jun 2006
Posts: 147
Reputation:
Solved Threads: 20
My bad with the line

```cpp
while (ptrFirst) {
```

Agreed about the C99 standard, but the requirement doesn't say anything about words separated by digits; that's why I implemented the code this way. Again, your fstream approach is the simplest, but the thing is that I tried to use char* instead of the functions provided by streams.
Thanks.
By the way, I didn't compile it.
Strictly speaking, there is one word per line in those strange requirements (linear search only, don't use string, can't use a for loop, etc.; it's enough to make you weep). If so, there is no need for word extraction code at all.
Well, if you don't like the "stream-based approach", don't use C++ fstream to get lines from a file. Use fgets or something else from the C library. Furthermore, it's so easy to adapt the code for a C-string scan: change f.get(c) to code that extracts the next char with a null byte test. Oh, sorry, I forgot: don't use istringstream! Don't use C++ at all...
Are these the basics of learning programming: don't use for loops, don't use this, don't use that... and so on? It's a profanation.
Better to download and read B. Stroustrup's well-known article "Learning Standard C++ as a New Language":
http://www.research.att.com/~bs/new_learning.pdf
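For the C-string scan mentioned above, a minimal sketch might look like the following. It is not the scanner code from earlier in the thread: the function name is hypothetical, and it assumes a word is a maximal run of alphabetic characters. Each char goes through `unsigned char` before reaching `isalpha`, which is the same idea as the Uchar typedef.

```cpp
#include <cctype>
#include <cstddef>

// Counts maximal runs of alphabetic characters in a null-terminated string.
std::size_t countAlphaWords(const char* text)
{
    std::size_t words = 0;
    bool inWord = false;
    while (*text != '\0') {   // the null byte test ends the scan
        // Convert through unsigned char so isalpha never sees a negative value.
        const bool alpha =
            std::isalpha(static_cast<unsigned char>(*text)) != 0;
        if (alpha && !inWord)
            ++words;          // a new run of letters starts here
        inWord = alpha;
        ++text;
    }
    return words;
}
```

With this definition, digits and punctuation act as separators, so "123four5six..." contains two words.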
Join Date: Jun 2006
Posts: 147
Reputation:
Solved Threads: 20
Thanks ArkM, I've gone through this article of Bjarne's and I have no disagreement with it at all. But as an experienced programmer, what do you think of requirements? Practically speaking:
"One day your boss comes to your desk and says: I've bought a library written in C and I want you to use it for blah blah?"
Or what if your boss asks you to develop a C library itself?
I am not telling you that C is superior to C++, but the thing that matters is the requirements. If someone asks for C code, teach them C, but also provide them with the C++ implementation and the differences between the two. I think this is a better learning approach.
Join Date: Nov 2008
Posts: 13
Reputation:
Solved Threads: 0
Thanks for all the input. I was able to store the lines with strcpy(), but now I'm trying to use strncmp to find out the number of unique words (or lines) in the text file. I tried a couple of things, but none seemed to work. I'm given these guidelines:
You must use the linear search algorithm to determine if a word is in the array. Remember that the array is an array of structures and that the key is a string (char array) so the string comparison must be used. The search task should be a separate function.
The search must be a separate function that returns an integer value. Do not use a for loop and the function must have only one return statement.
Here's the instructor's linear search:

```cpp
int search(int list[], int size, int key)
{
    int pos = 0;
    while (pos < size && list[pos] != key)
        pos++;
    if (pos == size)
        pos = -1;
    return pos;
}
```

I'm really having trouble with this; any help would be appreciated.
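One way to read the guidelines is to keep the instructor's loop and change only the comparison. The sketch below assumes a hypothetical `WordEntry` struct with a 21-char word buffer; the real struct layout is whatever the assignment specifies.

```cpp
#include <cstring>

const int WORD_LENGTH = 21;   // assumed maximum word length, including '\0'

struct WordEntry {            // hypothetical record: a word and its tally
    char word[WORD_LENGTH];
    int  count;
};

// Linear search over an array of structures keyed by a C string.
// Returns the index of `key` in list[0..size-1], or -1 if it is absent.
// Uses a while loop and a single return statement, as the guidelines ask.
int search(const WordEntry list[], int size, const char key[])
{
    int pos = 0;
    // strcmp returns 0 when the two strings are equal
    while (pos < size && std::strcmp(list[pos].word, key) != 0)
        pos++;
    if (pos == size)
        pos = -1;
    return pos;
}
```

The only changes from the integer version are the parameter types and the use of `strcmp` in place of `!=`, since comparing two char arrays with `!=` would compare addresses rather than contents.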
Join Date: Jul 2005
Posts: 1,681
Reputation:
Solved Threads: 264
```cpp
#include <cstring>

bool search(char** words, int numWords, char* currentWord)
{
    bool found = false;
    int i = 0;
    // compare current word to each word already in array
    while (!found && i < numWords) {
        // if current word is found, change flag to end loop
        if (std::strcmp(currentWord, words[i]) == 0)
            found = true;
        ++i; // advance to the next word; without this the loop never ends
    }
    return found;
}
```
Last edited by Lerner; Dec 4th, 2008 at 7:23 pm.
Klatu Barada Nikto
>"One day your boss comes to...for blah blah?" or what if your boss asks you to develop a C library itself?

But I never teach the young members of my team with a "don't use for loops in C" methodology. There is nothing C-specific in those absurd requirements. Do you really think that bad_programming == C and good_programming == C++? Now let's remember: this is the C++ language thread and we are talking about C++ here.
Join Date: Nov 2008
Posts: 13
Reputation:
Solved Threads: 0
OK, here's my revised code:

```cpp
#include <iostream>
#include <fstream>
#include <cstdlib>
#include <cstring>
#include <string>
#include <iomanip>
using namespace std;

int const wordLength = 21;
int const Num = 100;
int const fileSize = 255;

struct words
{
    char word[wordLength];
    int count;
};

int storeFile(char [], words []);
void wordSearchSetup(char [], int, words []);
int wordSearch(char [], int, words []);

int main()
{
    int count;
    char fileName[fileSize];
    cout << "Please enter the name of the file you wish to open: " << endl;
    cin.getline(fileName, fileSize);
    words array[Num];
    count = storeFile(fileName, array);
    cin.ignore();
}

int storeFile(char fileName[], words array[])
{
    int count = 0;
    int i = 0;
    ifstream inFile;
    char line[Num];
    inFile.open(fileName);
    while (inFile.getline(line, Num))
    {
        strncpy(array[i].word, line, wordLength);
        count++;
        i++;
    }
    inFile.close();
    wordSearchSetup(fileName, count, array);
    return count;
}

void wordSearchSetup(char fileName[], int count, words array[])
{
    char line[Num];
    int i = 0;
    int size;
    ifstream inFile;
    inFile.open(fileName);
    while (inFile.getline(line, Num))
        size = wordSearch(line, count, array);
    cout << size << endl;
}

int wordSearch(char line[], int count, words array[])
{
    int i = count;
    while (i)
    {
        if (strcmp(line, array[i-1].word) == 0)
        {
            array[i-1].count++;
            return count;
        }
        i--;
    }
    strcpy(array[count].word, line);
    array[count].count = 1;
    return count + 1;
}
```

My search functions still aren't giving me the results I want; the program just returns the number of words, not the number of unique words.
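A possible explanation for the result: storeFile copies every line into the array before any search happens, so when wordSearchSetup reads the file a second time, wordSearch always finds the line and returns count unchanged. A single pass that searches before storing avoids this. The sketch below is only one way to arrange it; the names `Entry`, `findWord`, `loadUnique` and `countUniqueLines` are hypothetical, it assumes one word per line, and the search is repeated here only to keep the example self-contained.

```cpp
#include <cstring>
#include <istream>
#include <sstream>

const int kWordLength = 21;   // assumed buffer size, same as wordLength
const int kMaxWords   = 100;  // assumed capacity, same as Num

struct Entry {                // hypothetical stand-in for the words struct
    char word[kWordLength];
    int  count;
};

// Linear search: index of `key` in list[0..size-1], or -1 if absent.
int findWord(const Entry list[], int size, const char key[])
{
    int pos = 0;
    while (pos < size && std::strcmp(list[pos].word, key) != 0)
        pos++;
    if (pos == size)
        pos = -1;
    return pos;
}

// Reads one word per line and stores each distinct word once.
// Returns the number of unique words stored in `list`.
int loadUnique(std::istream& in, Entry list[], int capacity)
{
    char line[kWordLength];
    int unique = 0;
    // getline fails, ending the loop, at end of input or on an over-long line
    while (in.getline(line, kWordLength)) {
        int pos = findWord(list, unique, line);   // search only filled slots
        if (pos >= 0) {
            list[pos].count++;                    // seen before: bump tally
        } else if (unique < capacity) {
            std::strcpy(list[unique].word, line); // fits: line is shorter than the buffer
            list[unique].count = 1;
            unique++;
        }
    }
    return unique;
}

// Convenience wrapper for trying the function on in-memory text.
int countUniqueLines(const char* text)
{
    std::istringstream in(text);
    Entry list[kMaxWords];
    return loadUnique(in, list, kMaxWords);
}
```

In the real program the stream would be the opened ifstream, and the value returned by `loadUnique` is the number to print.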