I am writing a simple program that reads in a plain text file (ignoring numbers/symbols) and stores each unique word and the frequency it appears in the text file. This information is stored in a multimap<string, int>. I want to sort the elements in the multimap so as to facilitate efficient searching, meaning I want the words that have the highest associated frequency to be the first element in the multimap.

I'm not sure how exactly to go about doing this. Should it be done while the program is parsing in the text file? Or what would the best way to manipulate the structure. I'm thinking that maybe copying the elements one by one into another map might be the best way to go this, but I'm not sure.

Any comments are appreciated

Recommended Answers

All 5 Replies

> stores each unique word and the frequency it appears in the text file.
> This information is stored in a multimap<string, int>
it would be simpler to use a std::map<std::string, int> (instead of a std::multimap)

std::map<std::string, int> word_count ;
std::string word ;
while( read_next_word( file, word ) ) ++word_count[word] ;

> I want to sort the elements in the multimap so as to facilitate efficient searching,
> meaning I want the words that have the highest associated frequency to be the first
> element in the multimap.
insert the key-value pairs from the map into a
std::multimap< int, std::string, std::greater<int> >

std::multimap< int, std::string, std::greater<int> > mmap ;
  typedef std::map< std::string, int >::iterator iterator ;
  for( iterator iter = word_count.begin() ; 
           iter != word_count.end() ; ++iter )
    mmap.insert( std::make_pair( iter->second, iter->first ) ) ;

Thanks for the help. I did what you suggested and converted my code to initially parse as a map<string, int> and when sorted it puts everything into multimap<int, string, greater<int> >. This all works great, but the only problem, is that I'm not exactly sure what makes the second piece of code work? what does the greater<int> key in the multi map do? And what makes it important to include? Stupid questions, I know, but if I don't understand the code it doesnt help me.

Thanks again

EDIT: I just noticed that this code also sorts in alphabetical order (secondary to frequency). If you could explain how it does this as well I would greatly appreciate it.

> what does the greater<int> key in the multi map do?
the third template parameter of a multimap<> is the type that performs comparisons between keys. a (function) object of this kind takes two arguments of the key type and returns a bool. The expression comp(a,b), where comp is an object of this comparison class and a and b are key values, shall return true if a is to be placed at an earlier position than b in a strict weak ordering operation.

> And what makes it important to include?
only because you said 'I want the words that have the highest associated frequency to be the first element in the multimap'. the default for this parameter is std::less<key_type>; this would place smaller keys at the beginning of the sequence. what you wanted is to have the larger keys at the beginning of the sequence

> I just noticed that this code also sorts in alphabetical order (secondary to frequency).
the reason is that the map<std::string,int> uses the default std::less<std::string> as the key comparison predicate; this would place the keys in the map in alphabetical order. the implementation of the multimap that you are using uses the order of insertion to place elements if the keys compare equal.

all this works iff the map/multimap uses some kind a search tree and iterations perform an in-order traversal of the tree. this is the usual implementation.

Thank you for the info, I have one more question and I think I might have this understood. In my program I implemented a function to search for a word and see if it is both in it the text file that is read in and, if it is, to show how many times it shows up in the text document. I did this, and the searching part works great.

Now I want to put the items that were searched for into the front of the multimap. my code currently takes in a map of the terms that were searched for and the multimap<int, string, greater<int> > that has the all the words in the text document. I have tried to simply create a new multimap<int, string, greater<int> and first insert the searched items, and then afterwords inserting the contents of the original multimap into the new multimap, but the greater comparison parameter still places the searched words into order of frequency.

That probably sounds confusing but I want the list to display in the order:
'searchedword1'
'searchedword2'
'highestfreqword'
and so on.

Again, thanks for your time.

> Now I want to put the items that were searched for into the front of the multimap.
this is not possible (unless you perform tricks with the key). maps and multimaps are associative containers; they associate keys and values. fast lookup on the key what they are designed for.
you could use a simple sequence container like std::list< std::pair< int, std::string > > if you want to determine the order in the sequence arbitrarily.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.