Hi, I have written a program in C++ and I feel that it is a bit slow for very large datasets (> 1 GB). I am not that confident with char* since I have just started coding in C++. Can anybody help me make it more efficient and suggest, with example snippets, the changes I need to make?

Also, in this code I am fixing the number of fields in my tab-separated file in advance. The actual file has 14 fields and I am hard-coding that. If a user supplies a file with any number of fields, how can I make it work in that case?

Here is my program

#include<iostream>
#include<fstream>
#include<string>
#include<sstream>
#include<map>
using namespace std;

int main(int argc, char* argv[]) {

     multimap<string, string> mm;
     string str[15];
     string str1;
     string str2;
     string combined;

     ................

     ifstream myfile(argv[1]);

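     // Loop on the result of the first read instead of eof(): eof() only
     // becomes true after a read has already failed, so testing it first
     // processes one extra, empty record at the end of the file.
     while(getline(myfile, str[0], '\t')){
     for(int i =1; i < 15; i++)
     if (i<14)getline(myfile, str[i], '\t');
     else getline(myfile, str[i]);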
     str1.assign(str[1]);
     str2.assign(str[2]);
     str3.assign(str[3]);
     str4.assign(str[4]);
     .............................

       int pos1 = str6.find( "words" );
     if ( pos1 != string::npos )
      str6.replace( pos1, 5, "" );
      pos1 = str6.find( "words", pos1 + 1 );

     combined = str2+"\t"+str3+"\t"+str4+"\t"+str6;

          if((str9.compare(argv[2])==0)){
               mm.insert(pair<string, string>(str5, combined));
          }
     }


       for (multimap<string, string>::iterator it = mm.begin();it != mm.end();++it)
       {
            cout << (*it).second << "\t" << (*it).first << endl;
       }

     mm.clear();
     return 0;
}

Thanks

Regarding increasing efficiency:

Every time you call getline() you read something from the file.
If the file has, say, 1,000,000 words, you are executing getline() that many times. Each getline() call is a read on the file stream, which is relatively expensive, so making that many calls slows the program down.

What you can do instead is read a block of data into a string (say 1,000 characters), process the words in it, and then read the next block when you are finished with that one.
You should see an improvement in efficiency this way.

Thanks, you have been quite helpful :)

And please mark the thread as solved if you feel you have been answered, so that others don't waste their time on a solved thread. :)
