Parsing a CSV file separated by semicolons.

 
 

Hello! I am trying to open a CSV file whose fields are separated by semicolons. I know how to open a text file, and I tried searching for a way to open a CSV, but most methods seemed extremely complicated.

Does anyone have a simple suggestion that would work with what I already have?

#include <iostream>
#include <fstream>
#include <string>
#include <locale>
#include <sstream>

using namespace std;


int main()
{
    ifstream in("filename.csv");
    ofstream out(...............);

    int RID;
    int RID_2;

    string line;

    while( in >> RID >> RID_2)
    {
        cout << RID << RID_2 << endl;
    }

    return 0;
}
 
 

I'd read the file one line at a time and then parse this line to search for semicolons. Getline with a delimiter should do the trick. Here's a sample:

#include <iostream>
#include <sstream>
#include <string>
#include <fstream>

using namespace std;

int main(){
    ifstream infile("c:/in.txt"); // for example
    string line = "";
    while (getline(infile, line)){
        stringstream strstr(line);
        string word = "";
        while (getline(strstr,word, ';')) cout << word << '\n';
    }
}
 
 

You could use getline() with its third (delimiter) parameter:

string word;
while( getline(infile, word, ';') )
{
    cout << word << "\n";
}
 
Ancient Dragon, your solution was simple and helpful. The only problem is that I don't understand how to separate the string into the separate variables I wanted.

It seems that the strtok() function can be a viable method?

The code below is something I found on how to implement strtok() function. Only I do not quite understand all of the steps that are taken.

#include <stdio.h>
#include <string.h>

int main ()
{
  char str[] ="- This, a sample string.";
  char * pch;
  printf ("Splitting string \"%s\" into tokens:\n",str);
  pch = strtok (str," ,.-");
  while (pch != NULL)
  {
    printf ("%s\n",pch);
    pch = strtok (NULL, " ,.-");
  }
  return 0;
}
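
For anyone else puzzled by the steps, here is a commented sketch of the same strtok() pattern, written in C++ and applied to a semicolon-separated line. The helper name split_with_strtok is made up for illustration and is not from the thread. Keep in mind that strtok() overwrites the delimiters in its input buffer and treats a run of delimiters as a single one, so empty fields are dropped.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): split `line` on ';' with strtok().
std::vector<std::string> split_with_strtok(const std::string& line)
{
    // strtok() writes '\0' over each delimiter it finds, so work on a
    // mutable, null-terminated copy instead of the caller's string.
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;

    // First call: pass the buffer. strtok() skips leading delimiters,
    // returns a pointer to the first token and remembers where it stopped.
    char* token = std::strtok(buffer.data(), ";");
    while (token != nullptr)
    {
        tokens.push_back(token);
        // Later calls: pass a null pointer to continue in the same buffer.
        token = std::strtok(nullptr, ";");
    }
    return tokens;
}
```

With this sketch, split_with_strtok("2;31;abc") gives three tokens, while "one;;;two" gives only two because the empty columns are skipped.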
 
 

"The only problem is that I don't understand how to separate the string into the separate variables I wanted."

Why not try the piece of code I posted earlier? It does just that.

"It seems that the strtok() function can be a viable method?

The code below is something I found on how to implement strtok() function. Only I do not quite understand all of the steps that are taken."

The code you posted is C, not C++. Which language do you intend to use?

 
 

The problem you posted in #4 is not the same as the one you originally posted in #1. Are they the same problem or two different ones? If they are different, please say so to avoid confusion.

Which variables do you want? If each column of the CSV file holds a string, then just use an array of strings:

string line;
string arry[10];
int i;
while( getline(infile, line) )
{
      stringstream str(line);      // parse one line at a time
      for(i = 0; i < 10; i++)      // reads up to 10 columns per line
           getline(str, arry[i], ';');
}

The above might have problems if there are blank columns, where two or more ; appear in a row, such as "one;;;two". If there are lines like that then it becomes much more complicated and you can't use getline() with that third parameter.

 
 

"The above might have problems if there are blank columns where two or more ; in a row, such as 'one;;;two'."

It will also stop at 10 words per line.
I would highly recommend using a vector in this case. I've made a small adjustment to the code I posted earlier:

#include <iostream>
#include <sstream>
#include <string>
#include <fstream>
#include <vector>

using namespace std;

int main(){
    ifstream infile("c:/in.txt");
    string line = "";
    vector<string> all_words;
    while (getline(infile, line)){
        stringstream strstr(line);
        string word = "";
        while (getline(strstr,word, ';')) all_words.push_back(word);
    }
}

After you run this code, all the words will be in the vector. To show the vector use something like:

for (unsigned i = 0; i < all_words.size(); i++)
        cout << all_words.at(i) << '\n';

Note that all this code is untested, so it might have a bug or two that I've missed.

 
 

"It will also stop at 10 words per line.
I would highly recommend using a vector in this case."

Yes, that is a better solution. But, like mine, it doesn't work when there are two or more adjacent semicolons (or some other column separator).

After testing, strtok() doesn't work right either because it also skips adjacent semicolons.
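
If empty columns have to be preserved, one option (not from the thread) is to split by hand with std::string::find. The sketch below uses a made-up helper name, split_keep_empty, and keeps every empty field, including one at the very end of the line. For what it's worth, my understanding is that getline() with a delimiter already returns an empty string for each interior ";;" and only loses an empty field at the end of a line, whereas strtok() collapses runs of delimiters, so it is worth testing both against your own data.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): split on `delim` and keep
// every field, including empty ones between or after delimiters.
std::vector<std::string> split_keep_empty(const std::string& line, char delim = ';')
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;

    while (true)
    {
        std::string::size_type pos = line.find(delim, start);
        if (pos == std::string::npos)
        {
            // No more delimiters: the rest of the line is the last field.
            fields.push_back(line.substr(start));
            break;
        }
        // Field runs from `start` up to (not including) the delimiter.
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}
```

With this sketch "one;;;two" yields four fields (two of them empty) and "a;b;" yields three. Note that an empty line yields one empty field, which may or may not be what you want.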

 
Sorry it has taken me a while to come back to this problem, but I have been away for the weekend. I have decided to go with this solution here because Ancient Dragon also seems to agree that it is better.

As far as the empty spaces and adjacent semicolons go, I did have some, but I redesigned my file to have the empty slots filled with "-1".

I have attached my file, so you can see the way it looks. I am trying to read the file and be able to call the numbers as numbers and the words as strings.

So here is the way things look right now. I use "infile.imbue(locale("german_germany.1252"));" to try and have the program read the commas in as decimal points rather than commas. This worked when I was reading the file the following way:

while( infile >> Var1 >> Var2 ...........), but with getline I cannot get this to work. Any suggestions?

#include <iostream>
#include <sstream>
#include <string>
#include <fstream>
#include <vector>
#include <locale>

using namespace std;

int main()
{
    ifstream infile("C:\\Dokumente und Einstellungen\\Yaser\\Eigene Dateien\\Internship\\C++_Code\\All_Data.csv");
    ofstream outfile("C:\\Dokumente und Einstellungen\\Yaser\\Eigene Dateien\\Internship\\C++_Code\\Output.txt");

    infile.imbue(locale("german_germany.1252"));

    string line = "";
    vector<string> all_words;

    while (getline(infile, line))
    {
        stringstream strstr(line);
        string word = "";
        while (getline(strstr, word, ';')) all_words.push_back(word);
    }
    infile.imbue(locale("german_germany.1252"));
    for (unsigned i = 0; i < all_words.size(); i++)
        outfile << all_words.at(i) << "\t" << endl;
}

Is there any way to set the file up as a two-dimensional array instead, so that I can have "all_words[....][....]"?

Your help is much appreciated.
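
On the two-dimensional question, one possible approach is a vector of vectors, with one inner vector per line of the file. This is only a sketch: the names Row and read_table are invented for illustration, and it uses the same getline() splitting as the code above, so it shares its limitations.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical names (not from the thread): one Row per line of the file.
typedef std::vector<std::string> Row;

std::vector<Row> read_table(std::istream& in, char delim = ';')
{
    std::vector<Row> table;
    std::string line;

    while (std::getline(in, line))
    {
        std::istringstream ss(line);
        Row row;
        std::string field;
        // Split the current line into its columns.
        while (std::getline(ss, field, delim))
            row.push_back(field);
        table.push_back(row);
    }
    return table;
}
```

After calling read_table(infile), table[r][c] is column c of line r. Use table.at(r).at(c) instead if you want an exception rather than undefined behaviour when an index is out of range, since rows may have different lengths.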

Attachments Sample1.txt (7.48KB)
 
 

The only problem with your program is that you need to set the locale for the stringstream object.

    while (getline(infile, line))
    {
        stringstream strstr(line);
        string word = "";
        strstr.imbue(locale("german_germany.1252"));
        while (getline(strstr, word, ';'))
            all_words.push_back(word);
    }

Attached is the output file I got.

Attachments Output.txt (9.17KB)
 
 

Sorry to respond to this late... but I wanted to post info also...

The getline() function has the obnoxious habit of returning a not good() stream for final blank fields...

For a single blank line at the end of input, that's fine... (there's no record) but for blank fields it makes a difference. You can get past the problem by checking the stream state before getting a line.
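
To see the effect in isolation, here is a small sketch that counts what a plain getline() loop yields and reports whether an empty last field was dropped. The names ScanResult and scan_line are made up for this illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical types and names, for illustration only.
struct ScanResult
{
    int  fields_seen;               // fields a plain getline() loop returned
    bool dropped_empty_last_field;  // true if the line ended with the delimiter
};

ScanResult scan_line(const std::string& line, char delim = ';')
{
    std::istringstream ss(line);
    std::string field;
    int  count   = 0;
    bool hit_eof = true;   // an empty line has no fields, so nothing is dropped

    while (std::getline(ss, field, delim))
    {
        ++count;
        // eof() is set only when this read ran into the end of the line,
        // i.e. when the field was NOT terminated by a delimiter.
        hit_eof = ss.eof();
    }
    // If the last successful read stopped at a delimiter, an empty field
    // followed it and the loop never reported it.
    return { count, !hit_eof };
}
```

For "one; two;three;four;;six;" this reports six fields seen and one dropped, which is the missing seventh field. For "a;b" it reports two fields and nothing dropped.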

For simple CSV files (meaning you cannot use the ';' character [or whatever character you've chosen] in the field value) this is a working example:

#include <deque>
#include <iostream>
#include <sstream>
#include <string>

typedef std::deque <std::string> record_t;
typedef std::deque <record_t>    table_t;

std::istream& operator >> ( std::istream& ins, table_t& table )
  {
  std::string s;
  table.clear();

  while (std::getline( ins, s ))
    {
    std::istringstream ss( s );
    record_t           record;
    std::string        field;
    bool               final = true;

    while (std::getline( ss, field, ';' ))
      {
      record.push_back( field );
      final = ss.eof();
      }
    if (!final)
      record.push_back( std::string() );

    table.push_back( record );
    }

  return ins;
  }

This will allow you to read all seven fields in a record like:

one; two;three;four;;six;

Hope this helps.

 
Hi Ancient Dragon. I see that this makes sense, but the commas in the numbers are not being replaced with decimal points. I think the reason is that the values are still strings and I need to convert them to integer and float values.

For instance, when I do all_words[0] + all_words[2] (with all_words[0]=2 and all_words[2]=31), I get 231.

How do I convert each individual value of my file so that I can specify it as an int, float, or string?
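
One portable way to do the conversion, sketched here under my own assumptions, is to parse each string through a stringstream whose locale treats ',' as the decimal point. Instead of the Windows-specific locale name used above, this sketch installs a small custom std::numpunct facet. The names comma_decimal and parse_comma_decimal are invented for illustration.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Facet that tells number parsing to treat ',' as the decimal point.
struct comma_decimal : std::numpunct<char>
{
    char do_decimal_point() const override { return ','; }
};

// Hypothetical helper: convert text such as "3,14" or "31" to a double.
// Returns false if the text does not start with a number.
// Trailing characters after the number are not checked in this sketch.
bool parse_comma_decimal(const std::string& text, double& value)
{
    std::istringstream ss(text);
    // The locale takes ownership of the facet and deletes it when done.
    ss.imbue(std::locale(std::locale::classic(), new comma_decimal));
    return static_cast<bool>(ss >> value);
}
```

Once both values are converted this way, adding them gives 33 rather than the concatenated string "231".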

 
 

I will actually mark this thread as solved and post it as a different problem, because it may be useful for other people. Thanks for the help everyone!

Question Answered as of 5 Years Ago by Ancient Dragon, Nick Evan and Duoas
 
 

"How do I convert each individual value of my file so that I can specify it either as int, float, or string?"

Simple substitution -- call find() to locate the comma, replace it with a period, and then convert the string to a number.
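
A minimal sketch of that substitution, assuming each field holds at most one comma and that it is the decimal separator. The helper name to_double_via_replace is made up for illustration.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: "2,5" -> 2.5 by replacing the comma with a period.
// Returns false if the text does not start with a number.
bool to_double_via_replace(std::string text, double& value)
{
    // Locate the comma (if any) and turn it into a period.
    std::string::size_type comma = text.find(',');
    if (comma != std::string::npos)
        text[comma] = '.';

    std::istringstream ss(text);
    // Use the classic "C" locale so that '.' is the decimal point
    // regardless of the program's global locale.
    ss.imbue(std::locale::classic());
    return static_cast<bool>(ss >> value);
}
```

Fields that are plain integers, such as "31", pass through unchanged and convert as usual. Text fields make the function return false, so they can be kept as strings.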
