Hi, I have written a program in C++ and I feel that it is a bit slow for very large datasets (larger than 1 GB). I am not that confident with char* since I have just started coding in C++. Can anybody help me make it more efficient, and show with example snippets the changes I need to make?

Also, in this code I am fixing the number of fields in my tab-separated file in advance. My actual file has 14 fields and I am hard-coding that. If a user supplies a file with any number of fields, how can I make it work in that case?

Here is my program

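#include <fstream>
#include <iostream>
#include <map>
#include <string>

using namespace std;

int main(int argc, char* argv[]) {
     if (argc < 2) {
          cerr << "usage: " << argv[0] << " <tab-separated file>" << endl;
          return 1;
     }

     const int NUM_FIELDS = 15;          // hard-coded number of fields per line
     multimap<string, string> mm;
     string str[NUM_FIELDS];
     string combined;

     ifstream myfile(argv[1]);

     // Test the reads themselves: eof() only becomes true after a read has failed.
     while (true) {
          // Every field but the last ends at a tab; the last one ends at the newline.
          for (int i = 0; i < NUM_FIELDS - 1; i++)
               getline(myfile, str[i], '\t');
          if (!getline(myfile, str[NUM_FIELDS - 1]))
               break;

          // Remove the first occurrence of "words" from field 6.
          string::size_type pos1 = str[6].find("words");
          if (pos1 != string::npos)
               str[6].replace(pos1, 5, "");

          combined = str[2] + "\t" + str[3] + "\t" + str[4] + "\t" + str[6];

          mm.insert(pair<string, string>(str[5], combined));
     }

     for (multimap<string, string>::iterator it = mm.begin(); it != mm.end(); ++it)
          cout << (*it).second << "\t" << (*it).first << endl;

     return 0;
}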


Discussion span: 8 years. Last post by dkalita.

Regarding increasing efficiency:

You invoke getline() every time you read a field from the file.
If the file has, say, 1,000,000 words, you are executing getline() that many times. Each getline() call goes through the stream's read path, which has a cost, so that many calls add up and decrease efficiency.

What you can do instead is read a block of data into a string (of, say, 1,000 characters), do whatever processing you need on the words in it, and when you are finished with that block, read the next one.
You should see an improvement in efficiency this way.
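
A rough sketch of the block-reading idea, in case it helps. The names split_tabs, read_rows_in_blocks, Row and the block_size parameter are made up for illustration and are not from the original program. It splits each line on tabs into a vector, so the number of fields is not hard-coded. Keep in mind that ifstream already buffers reads internally, so it is worth timing this against the getline() version before assuming it is faster.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this on in-memory data
#include <string>
#include <vector>

// One line of the file; it holds however many tab-separated fields the line has.
typedef std::vector<std::string> Row;

// Hypothetical helper: split a single line on tab characters.
Row split_tabs(const std::string& line) {
    Row fields;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type tab = line.find('\t', start);
        if (tab == std::string::npos) {
            fields.push_back(line.substr(start));   // last field on the line
            return fields;
        }
        fields.push_back(line.substr(start, tab - start));
        start = tab + 1;
    }
}

// Hypothetical helper: read the stream block_size bytes at a time and
// return every complete line as a Row.
std::vector<Row> read_rows_in_blocks(std::istream& in, std::size_t block_size) {
    assert(block_size > 0);
    std::vector<Row> rows;
    std::vector<char> block(block_size);
    std::string pending;   // text read so far that has not yet ended in '\n'

    // read() fails on the final short block, so gcount() is checked as well.
    while (in.read(&block[0], block.size()) || in.gcount() > 0) {
        pending.append(&block[0], static_cast<std::size_t>(in.gcount()));

        // Hand every complete line in the buffer to split_tabs.
        std::string::size_type start = 0;
        std::string::size_type nl;
        while ((nl = pending.find('\n', start)) != std::string::npos) {
            rows.push_back(split_tabs(pending.substr(start, nl - start)));
            start = nl + 1;
        }
        pending.erase(0, start);   // keep only the unfinished last line
    }

    if (!pending.empty())          // file did not end with a newline
        rows.push_back(split_tabs(pending));
    return rows;
}
```

To use it on a real file, open it with std::ifstream in(argv[1], std::ios::binary) and call read_rows_in_blocks(in, 1 << 16), then pick the columns you need from each Row by index. For a file over 1 GB you would probably process each row as soon as it is split rather than collect all of them in a vector; returning a vector here just keeps the sketch short.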


Thanks, you have been quite helpful :)

And please mark the thread as solved if you feel your question has been answered, so that others don't waste their time on a solved thread. :)
