
text files and binary files

 

I am trying to open a text file that contains a dictionary of English words. Each word and its definition are on the same line, and the entries are delimited by a newline. My question: if you open a text file using fopen() in "rt" mode, do the newlines appear as \r\n or just \n? And in binary mode, does the newline get interpreted as \r\n or just \n? Massive confusion!

 

From your question, I assume you are using a Windows system? Do you know if the files are in MS, or in Unix/Linux format?

 

I am using a Windows system, and the file is a .txt file.

 

In text mode the platform's native newline sequence will be converted to '\n'; this is true for any platform. In binary mode you're on your own: no translation will occur, so on Windows you need to look for and handle newlines in the form of CRLF.

But it's problematic because you can get a text file formatted using POSIX newlines (just LF rather than CRLF). So if you just look for CRLF or rely on text mode translation the lines might not be split correctly. Fun, huh? ;)
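
One way to cope with both formats is to strip whichever ending is present after reading each line. This is only a rough sketch, not a definitive fix; strip_newline is a made-up helper name, not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the standard library): removes one
   trailing line ending from s, whether it is "\r\n" (Windows), "\n" (POSIX)
   or a lone "\r". Returns the new length of s. */
size_t strip_newline(char *s)
{
    size_t len = strlen(s);

    if (len > 0 && s[len - 1] == '\n')   /* drop the LF, if any */
        s[--len] = '\0';
    if (len > 0 && s[len - 1] == '\r')   /* drop the CR left over from CRLF */
        s[--len] = '\0';
    return len;
}
```

Calling it on every line you read means the rest of the program never sees a stray '\r', regardless of where the file came from.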

 

OK. On Windows, a text newline ('\n') IS stored in the file as a carriage-return+linefeed ('\r\n') combination. You would only need to deal with the latter representation yourself if you were reading such a file on a Unix/Linux system (or in binary mode). Inside a Windows program that uses text mode, it still appears as '\n'. I.e., don't sweat it unless you are reading a file from one system type on another and have not passed the file through a filter to convert newlines accordingly, which a tool like ftp will normally do for you if the transfer is specified as text-type. There are also other tools that will convert newlines for you; this is a very common problem.

So, if you execute fprintf(outfile, "Hello World.\n"); on Windows with outfile opened in text mode, the file will contain a '\r\n' terminator on the line. On Linux/Unix, it would contain only a linefeed ('\n'). Reading back, the same code should work appropriately on either system, which makes programming applications intended to work on both types of systems much easier. Again, problems only occur when you are processing data written on one system type on the other.
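
If you want to see this for yourself, something along these lines should do it. Treat it as a sketch: stored_size is a hypothetical helper and the file path is whatever scratch file you pass in. It writes the line in text mode, then counts the bytes actually stored by reopening the file in binary mode.

```c
#include <stdio.h>

/* Hypothetical helper: writes one line in text mode, then reopens the file
   in binary mode and returns how many bytes were actually stored.
   Returns -1 on any I/O error. */
long stored_size(const char *path)
{
    FILE *out = fopen(path, "w");        /* text mode: '\n' may be translated */
    if (out == NULL)
        return -1;
    fprintf(out, "Hello World.\n");
    if (fclose(out) != 0)
        return -1;

    FILE *in = fopen(path, "rb");        /* binary mode: no translation */
    if (in == NULL)
        return -1;

    long n = 0;
    while (fgetc(in) != EOF)             /* count the raw bytes */
        n++;
    fclose(in);
    return n;
}
```

The string is 13 characters long including the '\n', so the result is 13 where a newline is stored as LF and 14 where it is stored as CRLF.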

And welcome to cross-platform programming and all the little warts you will encounter in that endeavor! :-)

 

What should I do if I want to detect a new line?

 

In text mode, look for '\n'. In binary mode, look for '\r' followed immediately by '\n'.
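
For the binary-mode case, a scan over the raw bytes could look roughly like this. It is a sketch under the assumption that you have already read the data into a buffer; count_line_endings is a made-up name. It treats a "\r\n" pair as one line ending and also accepts a bare "\n", in case the file uses POSIX newlines.

```c
#include <stddef.h>

/* Hypothetical helper: counts line endings in raw bytes read in binary mode.
   A "\r\n" pair counts once, and a bare "\n" counts as well.
   A lone '\r' that is not followed by '\n' is not counted. */
size_t count_line_endings(const char *buf, size_t len)
{
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\r' && i + 1 < len && buf[i + 1] == '\n') {
            count++;
            i++;                 /* skip the '\n' that belongs to this pair */
        } else if (buf[i] == '\n') {
            count++;
        }
    }
    return count;
}
```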

 

I did that and I keep getting stuck in an infinite loop. I'll post my code in a minute...

 
/* Program used to determine the number of characters, and in turn the number of bytes,
   in each alphabet section, i.e. the number of bytes in 'A', 'B', etc.
   It also searches for the longest entry in the database, i.e. the maximum
   number of bytes for a given word and its definition, which are on the same line.
   The inner loop has to test for EOF as well as '\n'; without that test it
   never terminates once fgetc() reaches the end of the file.
*/
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp = fopen("database.txt", "rt");
    if (fp == NULL) {
        printf("Error opening file!\n");
        exit(1);
    }                                     // File open and error checking

    unsigned long countal[26] = {0};      // number of bytes per letter (dictionary is sorted)
    unsigned long size1 = 0;              // longest entry seen so far
    int ch;                               // int, not char, so EOF can be told apart from data

    while ((ch = fgetc(fp)) != EOF) {     // ch is the first character of the next line
        int first = ch;
        unsigned long size = 0;           // characters in this line, not counting '\n'

        while (ch != '\n' && ch != EOF) { // stop at EOF too, or this loops forever
            size++;
            ch = fgetc(fp);
        }

        if (first >= 'A' && first <= 'Z') // entry belongs to this letter
            countal[first - 'A'] += size;
        if (size > size1)
            size1 = size;
    }

    printf("Largest directory entry: %lu\n", size1);
    for (int i = 0; i < 26; i++)
        printf("%c= %lu bytes\n", 'A' + i, countal[i]);

    fclose(fp);
    return 0;
}
 

When you have a string of words on one line (here a word and then its definition), you want to use fgets() and put the entire line into a char array (I use "buffer") all at once.

The newline will be included on the end of the buffer (space permitting), so now using strlen(buffer) you can get the full size. Easy smeazy.

while((fgets(buffer, sizeof(buffer), filePointer))!= NULL) {
   //your other code in here
}

Remember to make buffer longer than any possible line of text, and you're good to go. A word, plus a definition, may be a line longer than 200 chars - so think 500 for starters.
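
Putting that together, finding the longest line could look something like the sketch below. longest_line is a hypothetical helper name, and 500 is just the starting guess from above. As a precaution it also copes with lines longer than the buffer, which fgets() hands back in several pieces.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the length of the longest line in fp,
   counting the trailing '\n' when there is one. A line longer than the
   buffer arrives in several fgets() calls, so the pieces are added up. */
size_t longest_line(FILE *fp)
{
    char buffer[500];
    size_t longest = 0;
    size_t current = 0;                  /* length of the line read so far */

    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        size_t n = strlen(buffer);
        current += n;
        if (n > 0 && buffer[n - 1] == '\n') {   /* end of this line reached */
            if (current > longest)
                longest = current;
            current = 0;
        }
    }
    if (current > longest)               /* last line had no trailing newline */
        longest = current;
    return longest;
}
```

Open the file with fopen(), pass the FILE pointer to the function, and close the file afterwards.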

 

Hey thanks for your valuable suggestion! :):)

 

Does the function fgets() increment the file pointer internally to point to the next line??

 

Does the function fgets() increment the file pointer internally to point to the next line??

Yes, all of the standard I/O functions adjust the file position accordingly.
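
You can watch the position move with ftell(). A small sketch, with a made-up helper name; note that the difference between two ftell() values is only guaranteed to be a byte count on a stream opened in binary mode, so take the numbers with a grain of salt on a text-mode stream.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line with fgets() and returns how far the
   file position moved, or -1 if nothing could be read or ftell() failed.
   The result is a byte count for streams opened in binary mode. */
long bytes_consumed_by_fgets(FILE *fp)
{
    char buffer[500];
    long before = ftell(fp);

    if (before < 0 || fgets(buffer, sizeof buffer, fp) == NULL)
        return -1;

    long after = ftell(fp);
    return after < 0 ? -1 : after - before;
}
```

Each successful call returns the length of the line just read, and the next call carries on from where the previous one stopped.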
