Parsing a CSV file separated by semicolons.

Thread Solved

Yaserk88
#1
Jul 17th, 2009
Hello! I am trying to open a CSV file whose fields are separated by semicolons. I know how to open a text file, and I searched for ways to read a CSV, but most methods seemed extremely complicated.

Does anyone have a simple suggestion that would work with what I already have?


#include <iostream>
#include <fstream>
#include <string>
#include <locale>
#include <sstream>

using namespace std;

int main()
{
    ifstream in("filename.csv");
    ofstream out(...............);

    int RID;
    int RID_2;

    string line;

    while (in >> RID >> RID_2)
    {
        cout << RID << RID_2 << endl;
    }

    return 0;
}

niek_e
#2
Jul 17th, 2009
I'd read the file one line at a time and then parse this line to search for semicolons. Getline with a delimiter should do the trick. Here's a sample:

#include <iostream>
#include <sstream>
#include <string>
#include <fstream>

using namespace std;

int main(){
    ifstream infile("c:/in.txt"); // for example
    string line = "";
    while (getline(infile, line)){
        stringstream strstr(line);
        string word = "";
        while (getline(strstr, word, ';')) cout << word << '\n';
    }
}
Last edited by niek_e; Jul 17th, 2009 at 10:26 am.

Ancient Dragon
#3
Jul 17th, 2009
You could use getline() with its third (delimiter) parameter:
string word;
while( getline(infile, word, ';') )
{
    cout << word << "\n";
}
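
One thing worth checking with this loop: when getline() is given ';' as the delimiter, a newline is just an ordinary character, so the last field of one line and the first field of the next end up in the same token. The sketch below compares that with the line-by-line approach from post #2. Both function names are made up for illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits an entire stream on ';' only, the way the loop above does.
// Newlines are not separators here, so they stay inside the tokens.
std::vector<std::string> split_stream_on_semicolon(std::istream& in)
{
    std::vector<std::string> tokens;
    std::string word;
    while (std::getline(in, word, ';'))
        tokens.push_back(word);
    return tokens;
}

// Hypothetical helper: reads line by line first, then splits each line on ';'.
std::vector<std::string> split_lines_then_fields(std::istream& in)
{
    std::vector<std::string> tokens;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string word;
        while (std::getline(fields, word, ';'))
            tokens.push_back(word);
    }
    return tokens;
}
```

For the two-line input "1;2\n3;4\n", the first function returns three tokens, the middle one being "2\n3", while the second returns the four fields separately.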

Yaserk88
#4
Jul 17th, 2009
Originally Posted by Ancient Dragon:
You could use getline() with its third (delimiter) parameter:
string word;
while( getline(infile, word, ';') )
{
    cout << word << "\n";
}

Ancient Dragon, your solution was simple and helpful. The only problem is that I don't understand how to separate the string into the separate variables I wanted.

Could the strtok() function be a viable method?

The code below is something I found on how to use the strtok() function, but I do not quite understand all of the steps it takes.

#include <stdio.h>
#include <string.h>

int main ()
{
  char str[] ="- This, a sample string.";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtok (str," ,.-");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}

niek_e
#5
Jul 17th, 2009
Originally Posted by Yaserk88:
The only problem is that I don't understand how to separate the string into the separate variables I wanted.

Why don't you try the piece of code I posted earlier? It does just that.

Originally Posted by Yaserk88:
Could the strtok() function be a viable method?

The code below is something I found on how to use the strtok() function, but I do not quite understand all of the steps it takes.

The code you posted is C, not C++. Which language are you intending to use?
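
To get from split fields to typed variables such as the RID and RID_2 of post #1, one possible sketch is shown below. It assumes each line holds exactly two integer fields, which the thread does not confirm. The Record struct and parse_record() are hypothetical names.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type; the two field names follow the code in post #1.
struct Record {
    int RID;
    int RID_2;
};

// Parses one line such as "12;34" into a Record.
// Returns false if a field is missing or is not a number.
bool parse_record(const std::string& line, Record& out)
{
    std::istringstream fields(line);
    std::string first, second;
    if (!std::getline(fields, first, ';') || !std::getline(fields, second, ';'))
        return false;

    // Convert each text field with its own string stream.
    std::istringstream a(first), b(second);
    return (a >> out.RID) && (b >> out.RID_2);
}
```

Calling parse_record() once per line read with getline(infile, line) keeps the file reading and the field conversion separate.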

Ancient Dragon
#6
Jul 17th, 2009
The problem you posted in #4 does not look like the problem you originally posted in #1. Are they the same problem or two different ones? If they are different, you need to tell us so to avoid confusion.

What different variables do you want? If each column of the CSV file represents a string, then just use an array of strings:
string line;
string arry[10];
int i;
while( getline(infile, line) )
{
    stringstream str(line);
    for(i = 0; i < 10; i++)
        getline(str, arry[i], ';');
}

The above might have problems if there are blank columns, that is, two or more ; in a row, such as "one;;;two". If there are lines like that, it becomes much more complicated and you can't use getline() with that third parameter.
Last edited by Ancient Dragon; Jul 17th, 2009 at 12:16 pm.
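
On the blank-column concern, it may be worth testing before ruling getline() out. As far as the standard library specifies it, std::getline() with a delimiter stores an empty string when two delimiters are adjacent; the case it does not report is an empty field after a final ';'. The sketch below uses a hypothetical split_fields() helper to show both.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits one line on ';' and keeps empty fields.
std::vector<std::string> split_fields(const std::string& line)
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, ';'))
        fields.push_back(field);
    // getline reports nothing for an empty field after a final ';',
    // so add it by hand when the line ends with the delimiter.
    if (!line.empty() && line[line.size() - 1] == ';')
        fields.push_back("");
    return fields;
}
```

With this helper, "one;;;two" gives four fields (two of them empty) and "a;b;" gives three.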

niek_e
#7
Jul 17th, 2009
Originally Posted by Ancient Dragon:
The above might have problems if there are blank columns, that is, two or more ; in a row, such as "one;;;two".

It will also stop at 10 words per line.
I would highly recommend using a vector in this case. I've made a small adjustment to the code I posted earlier:
#include <iostream>
#include <sstream>
#include <string>
#include <fstream>
#include <vector>

using namespace std;

int main(){
    ifstream infile("c:/in.txt");
    string line = "";
    vector<string> all_words;
    while (getline(infile, line)){
        stringstream strstr(line);
        string word = "";
        while (getline(strstr, word, ';')) all_words.push_back(word);
    }
}

After you run this code, all the words will be in the vector. To show the vector use something like:
for (unsigned i = 0; i < all_words.size(); i++)
    cout << all_words.at(i) << '\n';
Note that all this code is untested, so it might have a bug or two that I've missed.

Ancient Dragon
#8
Jul 17th, 2009
Originally Posted by niek_e:
It will also stop at 10 words per line.
I would highly recommend using a vector in this case.

Yes, that is a better solution. But, like mine, it doesn't work when there are two or more adjacent semicolons (or some other column separator).

After testing, strtok() doesn't work right either because it also skips adjacent semicolons.
Last edited by Ancient Dragon; Jul 17th, 2009 at 12:11 pm.
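
For anyone who wants to reproduce the strtok() observation, here is a small sketch with the individual steps commented, since they were asked about in #4. The wrapper function split_with_strtok() is a made-up name; only std::strtok itself is the real library call.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Tokenizes a copy of `line` with strtok and returns the tokens it reports.
std::vector<std::string> split_with_strtok(const std::string& line)
{
    // strtok writes '\0' into its input, so work on a modifiable copy.
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    // First call: pass the buffer. strtok skips any leading delimiters,
    // terminates the first token in place, and remembers where it stopped.
    char* token = std::strtok(&buffer[0], ";");
    while (token != NULL) {
        tokens.push_back(token);
        // Later calls: pass NULL to continue in the same buffer.
        token = std::strtok(NULL, ";");
    }
    return tokens;
}
```

For "one;;;two" this returns only "one" and "two", because a run of delimiters is treated as a single separator, so the positions of the empty columns are lost.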

Yaserk88
#9
Jul 21st, 2009
Originally Posted by niek_e:
It will also stop at 10 words per line.
I would highly recommend using a vector in this case. I've made a small adjustment to the code I posted earlier:
#include <iostream>
#include <sstream>
#include <string>
#include <fstream>
#include <vector>

using namespace std;

int main(){
    ifstream infile("c:/in.txt");
    string line = "";
    vector<string> all_words;
    while (getline(infile, line)){
        stringstream strstr(line);
        string word = "";
        while (getline(strstr, word, ';')) all_words.push_back(word);
    }
}

After you run this code, all the words will be in the vector. To show the vector use something like:
for (unsigned i = 0; i < all_words.size(); i++)
    cout << all_words.at(i) << '\n';
Note that all this code is untested, so it might have a bug or two that I've missed.

Sorry it has taken me a while to come back to this problem; I was away for the weekend. I have decided to go with this solution because Ancient Dragon also seems to agree that it is better.

As for the empty fields and adjacent semicolons, I did have some, but I redesigned my file so that the empty slots are filled with "-1".

I have attached my file so you can see what it looks like. I am trying to read the file so that I can use the numbers as numbers and the words as strings.

Here is how things look right now. I use "infile.imbue(locale("german_germany.1252"));" so that the program reads the commas in the numbers as decimal points. This worked when I was reading the file the following way:

while( infile >> Var1 >> Var2 ...........), but with getline I cannot get it to work. Any suggestions?


#include <iostream>
#include <sstream>
#include <string>
#include <fstream>
#include <vector>
#include <locale>

using namespace std;

int main()
{
    ifstream infile("C:\\Dokumente und Einstellungen\\Yaser\\Eigene Dateien\\Internship\\C++_Code\\All_Data.csv");
    ofstream outfile("C:\\Dokumente und Einstellungen\\Yaser\\Eigene Dateien\\Internship\\C++_Code\\Output.txt");

    infile.imbue(locale("german_germany.1252"));

    string line = "";
    vector<string> all_words;

    while (getline(infile, line))
    {
        stringstream strstr(line);
        string word = "";
        while (getline(strstr, word, ';')) all_words.push_back(word);
    }
    infile.imbue(locale("german_germany.1252"));
    for (unsigned i = 0; i < all_words.size(); i++)
        outfile << all_words.at(i) << "\t" << endl;
}

Is there any way to set the file up as a two-dimensional array instead, so that I can have "all_words[....][....]"?

Your help is very much appreciated.
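
On the two-dimensional question, one common option is a vector of vectors, with one inner vector per line. The sketch below is only one way to do it; the Row and Table typedefs and read_table() are hypothetical names, and it takes any std::istream so it can be tried without the actual file.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::string> Row;   // one line of the file
typedef std::vector<Row> Table;         // all lines, indexed as table[row][col]

// Reads every line of `in` and splits it on ';' into its own Row.
Table read_table(std::istream& in)
{
    Table table;
    std::string line;
    while (std::getline(in, line)) {
        Row row;
        std::istringstream fields(line);
        std::string word;
        while (std::getline(fields, word, ';'))
            row.push_back(word);
        table.push_back(row);
    }
    return table;
}
```

After Table t = read_table(infile); a field is reachable as t[row][col], or as t.at(row).at(col) for bounds checking, since rows may differ in length.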
Attached Files
File Type: txt Sample1.txt (7.5 KB, 2 views)

Ancient Dragon
#10
Jul 21st, 2009
The only problem with your program is that you need to set the locale for the stringstream object.
while (getline(infile, line))
{
    stringstream strstr(line);
    string word = "";
    strstr.imbue(locale("german_germany.1252"));
    while (getline(strstr, word, ';'))
        all_words.push_back(word);
}

Attached is the output file I got.
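
A note on using the numbers as numbers: getline() copies characters into a string as they are, and a stream's locale only comes into play during formatted extraction with operator>>. So a field such as "3,5" probably still has to be converted in a separate step. The sketch below does that with a small custom facet instead of the named locale "german_germany.1252", which is a Windows-specific name and may not exist elsewhere. CommaDecimal and parse_decimal_comma() are hypothetical names, not something from the thread.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Facet that makes ',' the decimal point, independent of installed locales.
struct CommaDecimal : std::numpunct<char> {
    char do_decimal_point() const { return ','; }
};

// Converts a field such as "3,5" to a double; returns false on failure.
bool parse_decimal_comma(const std::string& field, double& value)
{
    std::istringstream in(field);
    // The locale object takes ownership of the facet allocated here.
    in.imbue(std::locale(std::locale::classic(), new CommaDecimal));
    return !(in >> value).fail();
}
```

Each string taken from the vector could be passed through a function like this, with the fields that fail to convert kept as text.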
Last edited by Ancient Dragon; Jul 21st, 2009 at 10:25 am.
Attached Files
File Type: txt Output.txt (9.2 KB, 2 views)