ArkM

Re: Determining the number of unique words in a .txt file

 
Dec 3rd, 2008
You have two different problems:
1. Tokenize the input stream (extract words from the stream).
2. Build a word dictionary.
The second one has a very simple solution: use the std::map data structure, as Laiq Ahmed mentioned above. However, you still need code that processes every word against the map-based word dictionary:
  if the next word is found in the map then
    do nothing, or increment this word's counter
  else
    insert the new word into the map
  endif
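The dictionary step above might be sketched in C++ roughly like this. The function name add_word is only an illustrative choice, and the sketch assumes that new words start with a counter of 1:

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper (the name is an assumption): record one word
// in a map-based dictionary with counters.
void add_word(std::map<std::string, int>& dict, const std::string& word) {
    auto it = dict.find(word);
    if (it != dict.end()) {
        ++it->second;                          // word already known: bump its counter
    } else {
        dict.insert(std::make_pair(word, 1));  // new word: insert with counter 1
    }
}
```

The whole body could also be written as ++dict[word]; because operator[] inserts a zero counter for a missing key before the increment. The explicit form is shown only because it mirrors the pseudocode.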
There are many ways to solve the first problem. For example:
  open file stream
  create an empty map
  loop // until eof
    skip non-letters
    clear word buffer // use std::string::clear()
    append letters to the word buffer
    process the word with the map // see problem 2 above
  endloop
  process a possible last word
  traverse map (file word dictionary)
Summary:
- use std::ifstream
- use std::string for word buffer
- use std::map<std::string,int> for dictionary with counters
With a proper functional decomposition of these pseudocode snippets, you have a good chance of writing simple and clear code ...