hi, this should be a fairly simple question to answer. I have a function in my program that takes the content of a text file and populates an array with all the information. First it has to see how many lines the file has (as it can change) so it can create a dynamic array based on the number of lines.

if(file.is_open() && (get_file_size() > 0)) //get_file_size checks file size
        {
                while(!file.eof() )
                {
                        getline(file,temp);
                        x++; //number of lines
                }
        }

That part works fine, it's this next part that gives me the trouble.

while(!file.eof() )
        {
                end = 0;
                getline(file,temp);
                for(int z=0;z<2;z++) //hardcoded for number of fields to get
                {
                        end = temp.find(" ");
                        data[y][z] = temp.substr(0,end);
                        temp = temp.substr(end+1);
                }
                y++;
        }

I realize that once it ran through the file getting the number of lines that it ran to the end of the file, so I tried placing this just before the second part to remedy that:

file.seekg (0, ios::beg);

however, it doesn't seem to work, and I'm confused as to why.

I should mention that when I do file.tellg() right before the second part, it says -1 whether I have the seekg there or not.

to make sure seekg was the problem, I tried closing the file just before the second part and reopening the same file under file2, worked great. So I have a working solution, but I know there must be a better, cleaner way. Thanks!

~J

All 16 Replies

the file is not in a good state after your loop finishes (file.eof() == true). clear the eofbit first:

file.clear() ;
file.seekg( 0, std::ios::beg ) ;

the seekg you are doing is ok because you are seeking to the beginning of the file. for a seekg to some non-zero offset (or a seekg from the end) to work reliably (portably), the file has to be opened in binary mode.

Yeah I agree with vijayan. File pointers like seekg(),tellg() work correctly only with Binary Files where the read() and write() functions are used.

hmm... I got it to work properly with the clear() function, thanks vijayan, I was unaware of good and bad file state flags (first time dabbling in file manipulation). But krnekgelesh's addition confuses me. my file is not being opened in binary mode:

ifstream file("user_data.txt");

yet it works? I also found an example at cplusplus.com that uses the file pointers on a .txt file (in addition to a binary file), here's their example:

// obtaining file size
#include <iostream>
#include <fstream>
using namespace std;

int main () {
  long begin,end;
  ifstream myfile ("example.txt");
  begin = myfile.tellg();
  myfile.seekg (0, ios::end);
  end = myfile.tellg();
  myfile.close();
  cout << "size is: " << (end-begin) << " bytes.\n";
  return 0;
}

complete page here: http://www.cplusplus.com/doc/tutorial/files.html

> I also found an example at cplusplus.com that uses the file pointers on a .txt file (in addition to a binary file), here's their example ...
AARGH!
1. stream::tellg() returns a streampos, which isn't guaranteed to be convertible to an integral type
2. even if streampos is convertible to an integral type, there's no guarantee that the numeric value of this type has any real significance (eg. high order bits contain the sector number, low order bits the offset in the sector).
3. even if it represents some numerical offset within the system, it isn't guaranteed to fit in a long.
4. subtracting one streampos from another gives a streamoff. streamoff is some integral type, but will not necessarily fit in a long.

even if streampos and streamoff are long values, there would be problems when multi-byte encodings are involved.
even if the locale imbued by the stream is the classic "C" locale with single byte encoding, there would be problems because of escape sequence translations during input/output.

all that seekg on an fstream does is call filebuf::seekpos or filebuf::seekoff. and the standard says:
"If the position has not been obtained by a previous successful call to one of the positioning functions (seekoff or seekpos) on the same file the effects are undefined."

try the following on a windows machine (with single-byte ascii encoding for the default locale):

#include <fstream>
#include <iostream>
#include <string>
int main()
{
  std::string line = "this is one line\n" ;
  enum { NLINES = 1024 };
  {
      std::ofstream file( "test.txt" ) ; // open in text mode
      for( int i=0 ; i<NLINES ; ++i ) file << line ;
      std::cout << "we have written " << NLINES * line.size() << " chars\n" ;
  }
  std::ifstream file( "test.txt" ) ; // open in text mode
  file.seekg( 0, std::ios::beg ) ;
  const std::streampos beginpos = file.tellg() ;
  file.seekg( 0, std::ios::end ) ;
  const std::streampos endpos = file.tellg() ;
  const std::streamoff nchars = endpos - beginpos ;
  std::cout << "cplusplus.com would say file has " << nchars << " chars\n" ;

  file.seekg( 0, std::ios::beg ) ;
  char ch ;
  file >> std::noskipws ;
  int count = 0 ;
  while( file >> ch ) ++count ;
  std::cout << "file really has " << count << " chars\n" ;

}

the output would be something like this:

we have written 17408 chars
cplusplus.com would say file has 18432 chars
file really has 17408 chars

if you want to find the number of chars in a text file, c++ offers no reliable and portable solution other than reading the file char by char from beginning to end.

if(file.is_open() && (get_file_size() > 0)) //get_file_size checks file size
        {
                while(!file.eof() )
                {
                        getline(file,temp);
                        x++; //number of lines
                }
        }

> That part works fine, it's this next part that gives me the trouble.

No, it does not work fine. It will probably produce the wrong value of x, because eof() doesn't work the way you think it does. Here is the correct way to code that loop:

while( getline(file, temp) )
    ++x; // number of lines

while(!file.eof() )
        {
                end = 0;
                getline(file,temp);
                for(int z=0;z<2;z++) //hardcoded for number of fields to get
                {
                        end = temp.find(" ");
                        data[y][z] = temp.substr(0,end);
                        temp = temp.substr(end+1);
                }
                y++;
        }

The above code has the same problem with eof().

ok, so file.eof() would only work (correctly) if the file is opened in binary mode? With that in mind, then file.clear() and file.seekg(...) shouldn't work correctly either? If that's the case, is there any way I can return to the beginning of the file without opening it in binary?

I appreciate the help!

~J

Are you restricted to these C-like attempts, or can you use things like strings and vectors and make the read-ahead and subsequent issues just go away?

>>ok, so file.eof() would only work (correctly) if the file is opened in binary mode
No, it works in text mode too, but not the way you would think it should. fstream only sets the eof flag after an attempt to read beyond the end of the file. The last time getline() is executed it fails because eof has been reached and the flag is set, but in your original program x is incremented anyway, making the value of x one too many.

>>then file.clear() and file.seekg(...) shouldn't work correctly either
clear() and seekg() work in either binary or text mode. seekg() just has a few limitations in text mode on MS-Windows because of the two-byte line terminator (CR-LF) in text files. *nix and Mac don't have that problem because they use a single-character line terminator.

ah, I understand, now. Thank you for explaining things, it was a big help.

~J

> file.clear() and file.seekg(...) shouldn't work correctly either?
>If that's the case, is there any way I can return to the beginning of the file without opening it in binary?
file.clear() will work in all cases.
file.seekg(...) will work in all cases if you seek to the beginning of the file.
otherwise, if the streampos that you use for the seek wasn't one returned by an earlier tellg on the same stream/filebuf, all bets are off. c++ throws in locale dependent character code translation, even when the char_type of the stream is just a char. you might end up on the second byte of a multibyte character sequence or on the lf of a cr-lf sequence.
if you open a file stream (where the char_type of the stream is just a char) in binary mode and imbue the "C" locale, which will always result in a constant character width equal to one, you can seek to an arbitrary position.

Huh??? I'm sure what you said is probably accurate, but I have no idea what it is.

seekg() works the same in both text and binary mode on both *nix and Mac. Only in MS-Windows and MS-DOS do we run into trouble with text mode seeks, due to the two-byte line terminators. I have no idea how it works on non-English file systems, presumably the same.

> Only in MS-Windows and MS-DOS do we run into trouble with text mode seeks
> due to the two-byte line terminators.
> I have no idea how it works on non-English file systems, presumably the same.
in reality, all we would want is to use tellg to mark a position in an fstream and use seekg to get to that precise location sometime later. this will always work correctly on all streams. problems arise with seek on fstreams (in windows (text mode) or locales using MBC character sets) if we seek to a position that was not earlier returned by a tell. for example, this code (VC++,windows; locale names are not portable) would always work correctly; we are not seeking to arbitrary positions.

#include <iostream>
#include <fstream>
#include <string>
#include <locale>
using namespace std ;

int main()
{
 { ofstream temp("a.txt") ; } // create a new file

 const locale german ( "german" );
 fstream file( "a.txt", ios::in|ios::out ) ;
 file.imbue( german ) ;
 file << "some \n random \n lines\n\n" ;
 
 fstream::pos_type namepos = file.tellg() ;
 file << "name string" << '\n' << '\n' ;
 
 fstream::pos_type phonepos = file.tellg() ;
 file << 12345678 << '\n'<< '\n' << '\n' ;
 
 fstream::pos_type addresspos = file.tellg() ;
 file << "address string\n" ;

 file << "a few more \n\n random \n lines\n\n\n" ;

 string str ;
 cout.imbue( german ) ;

 file.seekg( namepos ) ;
 getline( file, str ) ; 
 cout << str << '\n' ;

 file.seekg( addresspos ) ;
 getline( file, str ) ; 
 cout << str << '\n' ;

 file.seekg( phonepos ) ;
 getline( file, str ) ; 
 cout << str << '\n' ;
 file.seekg( phonepos ) ;
 int phone ;
 file >> phone ; 
 cout << phone << '\n'  ;
 
 cout.imbue( locale::classic() ) ;
 cout << phone << '\n'  ;
}

output:

name string
address string
12.345.678
12.345.678
12345678

> Are you restricted to these C-like attempts, or can you use things like strings and vectors and make the read-ahead and subsequent issues just go away?

I'm still curious whether the OP has an answer to this.

ah, I apologize for missing that question. No, I suppose I'm not restricted to doing it this way, but since I'm new to C++, it was the only way I thought of to tackle this. If there are better methods I'd gladly implement them, all I need is a nudge in the right direction.

~J

If you use things such as strings and vectors, the read-ahead is not necessary: no seeking and rewinding, just read the data in and be done with it.

ah, I've never dealt with vectors before either (this is turning into quite the learning experience), and you were correct, they work wonderfully. If you remember what my code looked like before (I think I only posted about half of it) here's the whole thing:

vector<string> data_to_array()
{
       vector<string> data;
       string temp="";
       ifstream file("user_data.txt");
       if(file.is_open() && (file_check() == true)) //checks to see if file is open and not empty
       {
                while(file >> temp) //gets words separated by whitespace
                {
                        data.push_back(temp);
                }
       }
       return data;
}

That's the finished version and works like a charm (but I also thought the other things worked ;) ) Thank you for all your help!

~J
