
Improve performance of the following I/O code

Can anyone offer performance tips to improve the running time?
This function opens a CSV file (7000 rows by 30 columns) and stores each element in a matrix, data. The current running time is 4 seconds per file, and I desperately need to reduce it because I have to iterate through thousands of such files. Please help, thanks.

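#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <boost/tokenizer.hpp>

using namespace std;
using boost::tokenizer;
using boost::escaped_list_separator;

// Reads the CSV file at path and appends one vector<double> per line to data.
void iocsv(vector<vector<double> >& data, const string& path)
{
     string s;
     ifstream inFile;

     inFile.open(path.c_str());
     if (inFile) {
         while (getline(inFile, s)) {
             vector<double> col;   // values parsed from the current line
             tokenizer<escaped_list_separator<char> > tok(s);
             for (tokenizer<escaped_list_separator<char> >::iterator beg = tok.begin(); beg != tok.end(); ++beg) {
                 istringstream price;   // a new stream is constructed for every field
                 price.str(*beg);
                 double x = 0.0;        // stays 0.0 if the field is not a number
                 price >> x;
                 col.push_back(x);
             }
             data.push_back(col);
         }
     } else {
         cerr << "Warning: cannot open file " << path << endl;
         cerr << "Program terminating ......" << endl;
     }

     inFile.close();
}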
kimw
Newbie Poster
22 posts since Oct 2005
Reputation Points: 44
Solved Threads: 0
 

Are the files sorted? I ask because, although the initial sort costs a lot of time, retrieving data from a sorted file would be a lot quicker afterwards.

iamthwee
Posting Expert
5,950 posts since Aug 2005
Reputation Points: 1,543
Solved Threads: 439
 

Hi, the files are not sorted. I could build a macro to sort all of them (they are all in CSV format), but I would like to see what other alternatives there are.

kimw
Newbie Poster
22 posts since Oct 2005
Reputation Points: 44
Solved Threads: 0
 

How long does an empty loop take?

while (getline(inFile, s)) {
}


Separate the time to read the file from the time to tokenise the file.

If the empty loop takes <1 second, then most of the 4 seconds goes to the tokenising/vector work, and there might be something you can do.

If it takes >3 seconds, then reading the file dominates, and all your tokenising/vector stuff is not the problem.
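
A minimal sketch of that measurement, assuming a C++11 compiler for std::chrono. The helper name time_read_only is made up for illustration and is not part of the original code; it times only the getline loop, so the result can be compared with the 4 seconds for the full function.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns the seconds spent only reading lines from path,
// or -1.0 if the file cannot be opened. The line count is returned in lines.
double time_read_only(const std::string& path, std::size_t& lines)
{
    std::ifstream inFile(path.c_str());
    if (!inFile) return -1.0;

    std::string s;
    lines = 0;
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    while (std::getline(inFile, s)) {
        ++lines;   // no tokenising, no vectors: read cost only
    }
    const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Run it a few times on the same file: the first run may include disk access, while later runs are usually served from the operating system's file cache.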

> and i desperately need to minimize the running time as i need to iterate thru
> thousands of such files
Or just not worry about it and let the program run overnight, and it will all be done by morning anyway. If that can be done, it certainly isn't worth spending more than a day trying to make it vastly more efficient.
By your measure, it's about 900 files per hour.

Salem
Posting Sage
Team Colleague
11,531 posts since Dec 2005
Reputation Points: 5,862
Solved Threads: 953
 

That was a good suggestion. I tested the empty loop and it took < 1 sec, so it must be all the tokenizer and vector stuff. Am I going overboard by using tokenizer, since I basically just want to store all the elements of the CSV file in a matrix?

kimw
Newbie Poster
22 posts since Oct 2005
Reputation Points: 44
Solved Threads: 0
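
One alternative worth measuring, offered only as a sketch: if the files contain nothing but plain numbers separated by commas (an assumption, since the thread does not show the data), std::strtod can parse each field directly, without a tokenizer or an istringstream per field. The names parse_line and iocsv_fast are hypothetical. Unlike escaped_list_separator, this does not handle quoted or escaped fields.

```cpp
#include <cstdlib>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: parses one line of plain comma-separated numbers.
// Assumes no quoted or escaped fields. An empty or non-numeric field is
// stored as 0.0; an empty field after a trailing comma is dropped.
void parse_line(const std::string& line, std::vector<double>& row)
{
    const char* p = line.c_str();
    while (*p != '\0') {
        char* end = 0;
        const double x = std::strtod(p, &end);   // end == p when nothing was parsed
        row.push_back(end == p ? 0.0 : x);
        p = end;
        while (*p != '\0' && *p != ',') ++p;     // skip leftover characters in the field
        if (*p == ',') ++p;                      // step over the separator
    }
}

// Hypothetical variant of iocsv: returns false if the file cannot be opened.
bool iocsv_fast(std::vector<std::vector<double> >& data, const std::string& path)
{
    std::ifstream inFile(path.c_str());
    if (!inFile) return false;

    std::string s;
    std::vector<double> row;
    while (std::getline(inFile, s)) {
        row.clear();             // reuse the buffer's capacity across lines
        parse_line(s, row);
        data.push_back(row);     // copies the row into the matrix
    }
    return true;
}
```

If the row count is known in advance (about 7000 here), calling data.reserve(7000) before the loop avoids repeated reallocation of the outer vector. Whether any of this actually beats the tokenizer version should be checked by timing both on the real files.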
 
