text files and binary files

anumash
Junior Poster in Training
51 posts since Jan 2011

I am trying to open a text file which contains a dictionary of English words. Each word and its definition are on the same line, and the entries are delimited by a newline. My question is: if you open a text file using fopen() in "rt" mode, do the newlines appear as \r\n or just \n? In binary mode, does the newline get read as \r\n or just \n? Massive confusion!

rubberman
Senior Poster
3,989 posts since Mar 2010

From your question, I assume you are using a Windows system? Do you know if the files are in MS, or in Unix/Linux format?

anumash

I am using a Windows system; the file is a .txt file.

deceptikon
Eternally Awesome
4,669 posts since Jan 2012
Administrator

In text mode the platform's newline sequence is converted to '\n' when you read; this is true for any platform. In binary mode you're on your own: no translation occurs, so on Windows you need to look for and handle newlines in the form of CRLF.

But it's problematic because you can get a text file formatted using POSIX newlines (just LF rather than CRLF). So if you just look for CRLF or rely on text mode translation the lines might not be split correctly. Fun, huh? ;)
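One way to cope with both conventions is to strip the terminator yourself after reading each line. A minimal sketch, assuming each line has already been read into a NUL-terminated buffer; the helper name chomp is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove one trailing line terminator from s: "\r\n", a bare "\n",
   or a stray "\r" left over when a CRLF file is read without translation.
   Returns the new length of the string. */
size_t chomp(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);

    if (len > 0 && s[len - 1] == '\n')
        s[--len] = '\0';
    if (len > 0 && s[len - 1] == '\r')
        s[--len] = '\0';
    return len;
}
```

Calling chomp(line) on every line gives the same text whether the file used CRLF or LF, and whether it was opened in text or binary mode.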

rubberman

OK. On Windows, a text newline ('\n') is stored on disk as a carriage-return + linefeed ('\r\n') pair, but in text mode your program still sees it as a single '\n'. You would only need to deal with the '\r\n' form yourself if you were reading a Windows file on a Unix/Linux system. I.e., don't sweat it unless you are reading a file from one system type on another and it has not been passed through a filter to convert the newlines, which a tool like ftp will normally do for you if the transfer is specified as text-type. There are also other tools that will convert newlines for you; this is a very common problem.

So, if you execute fprintf(outfile, "Hello World.\n"); on Windows, with outfile opened in text mode, the file will contain a '\r\n' terminator on the line. On Linux/Unix, it would contain only a linefeed ('\n'). Reading the file back, the same code should work on either system, which makes writing applications that are intended to work on both types of systems much easier. Again, problems only occur when you process data written on one system type on the other.
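To see the translation for yourself, you can write the line and then count the bytes that actually landed on disk. A rough sketch; the function name bytes_written and the file path passed to it are placeholders, not anything from this thread:

```c
#include <assert.h>
#include <stdio.h>

/* Write "Hello World.\n" using the given fopen() mode, then reopen the
   file in binary mode and count the bytes stored on disk.
   Returns -1 if the file cannot be opened or closed. */
long bytes_written(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);

    FILE *out = fopen(path, mode);
    if (out == NULL)
        return -1;
    fprintf(out, "Hello World.\n");
    if (fclose(out) != 0)
        return -1;

    FILE *in = fopen(path, "rb");   /* binary: no newline translation */
    if (in == NULL)
        return -1;

    long n = 0;
    while (fgetc(in) != EOF)        /* fgetc returns int, so EOF is safe to test */
        n++;
    fclose(in);
    return n;
}
```

With mode "wb" the count is 13 on any system. With mode "w" you should get 14 on Windows (the '\n' became '\r\n') and 13 on Linux/Unix.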

And welcome to cross-platform programming and all the little warts you will encounter in that endeavor! :-)

anumash

What should I do if I want to detect a newline?

deceptikon

In text mode, look for '\n'. In binary mode, look for '\r' followed immediately by '\n'.
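For the binary-mode case, a small sketch of what "'\r' followed immediately by '\n'" looks like in code. It also counts bare '\n' separately, so you can tell which convention a file uses. The struct and function names are invented for this example:

```c
#include <assert.h>
#include <stdio.h>

/* How many newlines of each kind were seen. */
struct newline_counts {
    long crlf;  /* '\r' immediately followed by '\n' */
    long lf;    /* '\n' not preceded by '\r' */
};

/* Scan a stream opened in binary mode and classify every newline. */
struct newline_counts count_newlines(FILE *fp)
{
    assert(fp != NULL);

    struct newline_counts n = {0, 0};
    int prev = EOF;   /* previous character; EOF means "none yet" */
    int ch;           /* int, not char, so EOF can be told apart from data */

    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n') {
            if (prev == '\r')
                n.crlf++;
            else
                n.lf++;
        }
        prev = ch;
    }
    return n;
}
```

If crlf is nonzero and lf is zero, the file uses Windows newlines throughout; the reverse means POSIX newlines; both nonzero means the file is mixed.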

anumash

I did that and I keep getting stuck in an infinite loop. I'll post my code in a minute...

anumash

/* Program used to determine the number of characters, and in turn the number of bytes, in an
   alphabet entry, i.e. the number of bytes in 'A', 'B', etc.
   This program also searches for the longest entry in the database, i.e. the maximum
   number of bytes for a given word and its definition, which are found on the same line.
   The program gets stuck in an infinite loop and I don't know why.
*/
#include <stdio.h>
#include <stdlib.h>

int main()
{
    FILE *fp;
    fp = fopen("database.txt", "rt");
    if (fp == NULL)
    {
        printf("Error opening file!");
        exit(1);
    }                               // file open and error checking
    char ch;
    ch = fgetc(fp);
    char alphabet = 'A';
    unsigned long countal[26];      // to store the number of bytes for a particular entry (dictionary is sorted)
    int size1 = 0;
    short i = 0;
    int size = 0;
    while (alphabet <= 'Z')         // A through Z, looping through the entire file until EOF
    {
        unsigned long chars = 0;
        if (ch == alphabet)         /* if found then increment the number of bytes and check the size
                                       of a given entry */
        {
            while (ch != '\n')      // infinite loop??
            {
                chars++;
                size++;
                ch = fgetc(fp);
            }
        }
        else
        {
            while (ch != '\n')
            {
                size++;
                ch = fgetc(fp);
            }
        }

        if (size > size1)
            size1 = size;
        size = 0;
        ch = fgetc(fp);
        countal[i] = chars;
        i++;
        if (ch == EOF)
        {
            alphabet++;
            rewind(fp);
        }
    }

    printf("Largest directory entry: %d\n", size1);
    char abcd = 'A';
    for (i = 0; i < 26; i++)
    {
        printf("%c= ", abcd);
        printf("%lu bytes\n", countal[i]);
        abcd++;
    }
    fclose(fp);
    return 0;
}

Adak
Posting Virtuoso
1,711 posts since Jun 2008

When you have a string of words (here a word and then its definition) on the same line, you want to use fgets() and put the entire line into a char array (I use "buffer"), all at once.

The newline will be included at the end of the buffer (space permitting), so using strlen(buffer) you can get the full size of the line, newline included. Easy peasy.

while((fgets(buffer, sizeof(buffer), filePointer))!= NULL) {
   //your other code in here
}

Remember to make buffer longer than any possible line of text, and you're good to go. A word, plus a definition, may be a line longer than 200 chars - so think 500 for starters.
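As a rough sketch of how the original task could look with fgets(): the loop below totals the bytes per starting letter and tracks the longest line. It also sidesteps the hang in the character-by-character loops posted earlier, which never test for EOF: once fgetc() returns EOF, ch != '\n' stays true forever (and ch should be an int, not a char, to hold EOF reliably). The function name, the buffer size, and the choice to treat lowercase first letters like uppercase are assumptions made for this example. It also assumes every line fits in the buffer and that 'A' to 'Z' are contiguous, as in ASCII.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define ENTRY_BUF_SIZE 1024   /* assumed to exceed the longest line */

/* Read fp line by line. counts[0..25] receives the total number of bytes
   (terminator included, as fgets delivers it) of the lines starting with
   'A'..'Z'. Returns the length of the longest line read. */
size_t scan_entries(FILE *fp, unsigned long counts[26])
{
    char buffer[ENTRY_BUF_SIZE];
    size_t longest = 0;

    assert(fp != NULL && counts != NULL);
    memset(counts, 0, 26 * sizeof counts[0]);

    /* fgets returns NULL at end of file, so the loop always terminates. */
    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        size_t len = strlen(buffer);
        int first = toupper((unsigned char)buffer[0]);

        if (first >= 'A' && first <= 'Z')
            counts[first - 'A'] += (unsigned long)len;
        if (len > longest)
            longest = len;
    }
    return longest;
}
```

A single pass over the file is enough; there is no need to rewind once per letter. In text mode on Windows the lengths count each newline as one character, so open the file in binary mode if you need the on-disk byte counts.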

anumash

Hey thanks for your valuable suggestion! :):)

anumash

Does the function fgets() increment the file pointer internally to point to the next line??

deceptikon

anumash wrote: "Does the function fgets() increment the file pointer internally to point to the next line?"

Yes, all of the standard I/O functions adjust the file position accordingly.
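If you want to watch it happen, compare ftell() before and after a call to fgets(). A small sketch; the function name is made up, and the byte arithmetic is only meaningful for a stream opened in binary mode (in text mode the values from ftell() are only good for passing back to fseek()):

```c
#include <assert.h>
#include <stdio.h>

/* Read one line into buf and report how far the file position moved.
   Returns -1 at end of file or on error. */
long read_line_advance(FILE *fp, char *buf, int size)
{
    assert(fp != NULL && buf != NULL && size > 0);

    long before = ftell(fp);
    if (before < 0 || fgets(buf, size, fp) == NULL)
        return -1L;

    long after = ftell(fp);
    return after < 0 ? -1L : after - before;
}
```

Each call picks up where the previous one stopped, so calling it in a loop walks the file one line at a time without any manual seeking.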
