Hi,

Once I had a similar problem: I tried to read a binary file on a PC and failed, but I was able to read it on Linux (there were actually different versions of the file), so I figured the file was corrupted, though the hex dump looked OK.

Now I'm having this problem with a different file, on both PC and Linux. The file structure is simple: an int (4 bytes) followed by 15 ASCII characters (15 bytes).

When I read it in, I get the 1st and the 20th records right; all the rest show up wrong.

I'm attaching the binary file with a .txt extension, since .dat does not seem to be a valid extension for attachments on the forum, and I changed the file name in the code to reflect this.

/* read.c */

#include <stdio.h>
#include <stdlib.h>

#define NAMELEN 15

typedef struct
{
    int num;
    //char junk[4];
    char name[NAMELEN];
} rectype;

int main(int argc, char *argv[])
{
    rectype rec;
    FILE *f;

    /* initialize */
    f = fopen("extsort.txt","rb"); /* originally extsort.dat */
    
    /* Temp */
    int i;
    for(i=1; i<35; ++i)
    {
        fread(&rec,sizeof(rectype),1,f);
        printf("Record #%3d is %15d --> %-15s\n",i,rec.num,rec.name);
    }

    /* process each record in file */
    //while (fread(&rec,sizeof(rectype),1,f) == 1)
    //{
	//printf(" %15d --> %-15s\n",rec.id,rec.name);
    //}

    /* finalize */
    fclose(f);

    system("PAUSE");
    return EXIT_SUCCESS;
}  /* main */

/* end read.c */

Thank you,

Waldis

The problem is the member alignment of the structure. Take your structure as an example.
It has one int (4 bytes) and 15 chars, so adding the members up gives 19 bytes. But try printing sizeof(rectype): you will probably get 20. This is because the compiler pads the structure so that its size is a multiple of its strictest member alignment (here the 4-byte int), which keeps every element of an array of rectype properly aligned. The next multiple of 4 after 19 is 20 (5 words). So when you tell fread to read sizeof(rectype) bytes, it reads 20 bytes per record instead of 19. Each call consumes one byte that belongs to the next record, so the reads drift further out of step until the 20th read happens to start on a record boundary again (19 reads x 20 bytes = 380 = 20 x 19). That is why only your 1st and 20th reads look right.
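If you want to see the effect on your own compiler, something along these lines may help. It is only a sketch: DISK_RECORD_SIZE, misalignment and print_sizes are names I made up for illustration, and it assumes the file stores the int and the 15 name bytes back to back with no padding.

```c
#include <stddef.h>
#include <stdio.h>

#define NAMELEN 15

typedef struct
{
    int num;
    char name[NAMELEN];
} rectype;

/* Bytes one record occupies in the file: the members only, no padding
   (assumption about the file format, taken from the description above). */
#define DISK_RECORD_SIZE (sizeof(int) + NAMELEN)

/* Distance between the start of the k-th fread (k counted from 1) and the
   nearest preceding record boundary, when every call consumes
   sizeof(rectype) bytes. 0 means the read starts exactly on a record. */
size_t misalignment(size_t k)
{
    size_t file_offset = (k - 1) * sizeof(rectype);
    return file_offset % DISK_RECORD_SIZE;
}

/* Print the two sizes that get confused in the original program. */
void print_sizes(void)
{
    printf("sum of members : %zu\n", (size_t)DISK_RECORD_SIZE);
    printf("sizeof(rectype): %zu\n", sizeof(rectype));
}
```

With a 4-byte int and the usual padding, print_sizes() should report 19 and 20, and misalignment(k) is nonzero for k = 2 to 19 and drops back to 0 at k = 20.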

One way around this is to instruct the compiler not to pad the structure. You can use the #pragma pack directive for that. It is a compiler extension rather than standard C, but MSVC, GCC and Clang all accept it.

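#pragma pack(push, 1)   /* no padding between or after members */
typedef struct
{
      int num;
      //char junk[4];
      char name[NAMELEN];
} rectype;
#pragma pack(pop)       /* restore the previous packing for later declarations */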

This prevents the padding, so sizeof(rectype) becomes 19 (assuming a 4-byte int) and the data is read properly.

Another way is to read the members one at a time:

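for(i=1; i<35; ++i)
{
      /* read the members separately so the padding never enters the byte count */
      if (fread(&rec.num, sizeof rec.num, 1, f) != 1 ||
          fread(rec.name, sizeof rec.name, 1, f) != 1)
            break;  /* end of file or read error */
      /* name has no terminating '\0', so cap the output at 15 characters */
      printf("Record #%3d is %15d --> %-15.15s\n",i,rec.num,rec.name);
}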

That reads one integer and 15 chars per record rather than sizeof(rectype) bytes, and gives you the correct output.
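A variation on the same idea, in case it is useful: read the whole 19-byte record into a byte buffer with one fread and copy the fields out with memcpy. This is a sketch under assumptions, not the only way to do it. The names record, decode_record and read_record are mine, and it assumes the int was written with the same size and byte order as the machine reading it. The name gets one extra byte here so it can be terminated and printed with plain %s.

```c
#include <stdio.h>
#include <string.h>

#define NAMELEN 15
/* On-disk record size: the int plus the name bytes, no padding (assumed). */
#define DISK_RECORD_SIZE (sizeof(int) + NAMELEN)

typedef struct
{
    int num;
    char name[NAMELEN + 1];   /* one extra byte for the terminating '\0' */
} record;

/* Copy the fields out of one raw on-disk record.
   Assumes the int is stored in the host's own size and byte order. */
void decode_record(const unsigned char *buf, record *out)
{
    memcpy(&out->num, buf, sizeof out->num);
    memcpy(out->name, buf + sizeof out->num, NAMELEN);
    out->name[NAMELEN] = '\0';
}

/* Read and decode the next record.
   Returns 1 on success, 0 on end of file or a short read. */
int read_record(FILE *f, record *out)
{
    unsigned char buf[DISK_RECORD_SIZE];

    if (fread(buf, sizeof buf, 1, f) != 1)
        return 0;
    decode_record(buf, out);
    return 1;
}
```

The read loop then becomes while (read_record(f, &rec)) { ... }, and the in-memory structure can have whatever padding the compiler likes because its size is never used as a byte count for the file.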

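Edit: To be more precise than my explanation above: each member is placed at an offset that is a multiple of its own alignment, and the structure as a whole is then padded at the end to a multiple of its strictest member alignment. With the usual 4-byte alignment for int, your structure is laid out like this: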

//#pragma pack(1)  commented out, so the default alignment applies.
typedef struct
{
      int num;  // offset 0, 4 bytes, no padding needed
      //char junk[4];
      char name[NAMELEN]; // offset 4, 15 bytes, then 1 byte of padding so the struct is 20 bytes (5 x 4)
} rectype;
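You can check the layout on your own compiler with offsetof from <stddef.h>, which reports where each member starts. The helper names trailing_padding and print_layout below are made up for this sketch, and the numbers you get depend on your compiler and platform.

```c
#include <stddef.h>
#include <stdio.h>

#define NAMELEN 15

typedef struct
{
    int num;
    char name[NAMELEN];
} rectype;

/* Bytes the compiler added after the last member. */
size_t trailing_padding(void)
{
    return sizeof(rectype) - (offsetof(rectype, name) + NAMELEN);
}

/* Print where each member starts and how large the whole struct is. */
void print_layout(void)
{
    printf("num  starts at offset %zu\n", offsetof(rectype, num));
    printf("name starts at offset %zu\n", offsetof(rectype, name));
    printf("trailing padding     %zu\n", trailing_padding());
    printf("sizeof(rectype)      %zu\n", sizeof(rectype));
}
```

If the padding comes out as 0, the struct is already packed and sizeof(rectype) matches the record size in the file.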

Brilliant, thank you for saving me from hitting my head against the wall, so to speak.

Thank you,

Waldis
