Display only the 20 most used words[Word Frequency]

Thread Solved

markrezak
#1
Sep 23rd, 2009
OK, I've got it now: I copy the map into a vector and sort it. Here is the code:

  4. #include <iostream>
  5. #include <fstream>
  6. #include <algorithm>
  7. #include <string>
  8. #include <map>
  9. #include <vector>
  10.  
  11. using namespace std;
  12.  
  13. typedef map<string,int> word_count_list;
  14.  
  15. struct val_lessthan : binary_function < pair<string,int>, pair<string,int>, bool >
  16. {
  17. bool operator() (const pair<string,int>& x, const pair<string,int>& y) const
  18. {return x.second<y.second;}
  19. }val_lt;
  20.  
  21. int main()
  22. {
  23. word_count_list word_count;
  24. string filename;
  25.  
  26. // Get the filename.
  27. cout << "Enter the file you wish to have searched:\n";
  28. cin >> filename;
  29.  
  30. // Open file.
  31. ifstream file(filename.c_str());
  32.  
  33. // Read in all the words.
  34. string word;
  35.  
  36. while (file >> word){
  37. // Remove punctuation.
  38. int index;
  39. while ((index = word.find_first_of(".,!?\\;-*+")) != string::npos)
  40. word.erase(index, 1);
  41.  
  42. ++word_count[word];
  43. }
  44.  
  45. //copy pairs to vector
  46. vector<pair<string,int> > wordvector;
  47. copy(word_count.begin(), word_count.end(), back_inserter(wordvector));
  48.  
  49. //sort the vector by second (value) instead of key
  50. sort(wordvector.begin(), wordvector.end(), val_lt);
  51.  
  52. for(int i=0; i<wordvector.size(); ++i)
  53. cout << wordvector[i].first << " = " << wordvector[i].second << endl;
  54.  
  55. return 0;
  56. }

To finish the program, I need to output the 20 most common words.

I tried changing
for(int i=0; i<wordvector.size(); ++i)
to
for(int i=0,i<=20;i++)

but nothing happens.

markrezak
#2
Sep 24th, 2009
I can't get it right, hmm.

Ancient Dragon
#3
Sep 24th, 2009
This should do it:

vector<pair<string,int> >::iterator it;

for(it = wordvector.begin(); it != wordvector.end(); it++)
    cout << (*it).first << " = " << (*it).second << "\n";
Don't PM me with questions -- you might get a nasty PM in response. If you have a question then post it in one of the forums.

markrezak
#4
Sep 24th, 2009
I tried using

vector<pair<string,int> >::iterator it;

for(it = wordvector.begin(); it != wordvector.end(); it++)
    cout << (*it).first << " = " << (*it).second << "\n";

but this does not limit the output to 20.

Am I missing something?

Please teach me. This is the last problem I need to solve to complete the program: just display the 20 most common words.

Damn, why can't I use the CODE button?
Last edited by markrezak; Sep 24th, 2009 at 10:29 am.

Ancient Dragon
#5
Sep 24th, 2009
>> Am I missing something?
Just add a counter and stop the loop when it reaches 20. Also, pay attention to the counts that are displayed: the vector is sorted in ascending order, so the smallest counts come first, not last. The first 20 items in that vector will therefore be the least common words, not the most common. To fix that, change the sort comparison function on line 18 of your original post to use the > operator instead of <.

>> Damn, why can't I use the CODE button?
I don't use a button; I just add code tags manually:

[code]
// put your code here
[/code]

NEW DANIWEB FEATURE: Look at the top of the Quick Edit window, next to the other buttons, and you will see a [code] button. Just click it and paste your code between the two tags as I showed above.
Last edited by Ancient Dragon; Sep 24th, 2009 at 11:01 am.

markrezak
#6
Sep 24th, 2009
Sir Ancient Dragon,

I came up with this:

#include <algorithm>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

using namespace std;

typedef map<string,int> word_count_list;

// Despite its name, this now orders pairs by count with the largest first.
struct val_lessthan
{
    bool operator() (const pair<string,int>& x, const pair<string,int>& y) const
    { return y.second < x.second; }
} val_lt;

int main()
{
    word_count_list word_count;
    string filename;

    // Get the filename.
    cout << "Enter the file you wish to have searched:\n";
    cin >> filename;

    // Open file.
    ifstream file(filename.c_str());

    // Read in all the words.
    string word;

    while (file >> word) {
        // Remove punctuation. find_first_of returns string::size_type, not int.
        string::size_type index;
        while ((index = word.find_first_of(".,!?\\;-*+")) != string::npos)
            word.erase(index, 1);

        ++word_count[word];
    }

    // Copy pairs to vector.
    vector<pair<string,int> > wordvector;
    vector<pair<string,int> >::iterator it;
    copy(word_count.begin(), word_count.end(), back_inserter(wordvector));

    // Sort the vector by second (value) instead of key, largest count first.
    sort(wordvector.begin(), wordvector.end(), val_lt);

    // Print at most 20 entries. Both conditions are joined with &&; the comma
    // operator would discard the end() check and could run past the vector.
    int i;
    for (it = wordvector.begin(), i = 1; it != wordvector.end() && i < 21; ++it, ++i)
        cout << i << ". " << (*it).first << " = " << (*it).second << "\n";

    // Keep the console window open (Windows-specific).
    system("pause");
    return 0;
}
Last edited by markrezak; Sep 24th, 2009 at 1:56 pm.
Tags
looping, map, vector, word, wordfrequency

©2003 - 2009 DaniWeb® LLC