Hello, I just starting with C and I am having a problem that I can´t understand. Here is the code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

FILE *M;
M=fopen( "m", "r");
char line [ 540 ];
long int i=0;
struct estr_reg {
char *molid;
char *molc;
};
struct estr_reg reg[6];

while ( fgets ( line, sizeof line, M ) != NULL )
{
reg.molid=strtok(line,";");
reg.molc=strtok(NULL,";");
i++;
};
fclose ( M );
printf("%s %s", reg[0].molid,reg[1].molid);
return 0;
}

the problem is that the output of this printf is always "5 5" that is the value of reg[6].molid (the last element of reg). If I do the printf inside the "while (fgets...", the value displayed is correct. But outside the while, it is as if all the elements of the array collapse into one (the last one). Why is this???????

Thanks!!!

You need:

1) To put code tags around all your code, for the forum. Otherwise your code is very hard to study. Highlight it, and click on the CODE icon in the editing window.

2) Show an example of the input - just two or thee lines is enough
3) Show an example of the output you want

It appears (of course), that you are making an error with strtok()

strtok(string1, string2);

(Look! Code tags!!) ;)

scans string1 for the first token matching string2. Why the NULL in yours?

Edited 3 Years Ago by Nick Evan: Fixed formatting

here is the input file:

0;13215447
1;3270197991
...
...
5;3231730507

as output, I expect that this line:

printf("%s %s", reg[0].molid,reg[1].molid);

returns "0 1" and not "5 5" as it is doing

I think that strtok is not the problem, because when I print the obtained values inside the while loop and after applying that function, everything is ok. the problem appears when I print the values outside the while loop

I think that all the elements of the array reg[] are being stored in the same place... but I don't understand why is that and I know how to fix it...

I should have seen this - you are using the char *molid, etc., and they have never been given a valid address.

usually, that causes a seg fault.

and how can I give a valid adress? sorry if this is too elemental, I am used to program with perl and i am just learning c

Well...

Forget what I was saying about a valid address -- your program has no main() function, so it had no insertion address, itself.

Let's start with this, for your code, and work on it, from there:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

struct estr_reg {
char *molid;
char *molc;
};
int main() {

  FILE *M;
  char line [ 540 ];
  long int i=0;
  struct estr_reg reg[6];

  M=fopen( "m", "r");

  while ( fgets ( line, sizeof line, M ) != NULL )
  {
    reg[i].molid=strtok(line,";");
    reg[i].molc=strtok(NULL,";");
    i++;
  };
  fclose ( M );
  printf("%s %s", reg[0].molid,reg[1].molid);
  printf("\n\n\t\t\t    press enter when ready");
  i=getchar();
  return 0;
}

That will at least compile. Back in a bit.

that still doesn't work, in the line printf("%s %s", reg[0].molid,reg[1].molid), instead of showing 0 1, it prints 1 1... I guess all the values in the array overwrite the previous one, but I don't understand why

Wow, this one is bending my head around like the possessed girl in "Exorcist"!

It isn't putting the data anywhere else but on the first element of the array of structs, even inside the while loop.

Right now, I just don't see it. Wish I had more time right now for it. :(

The second reg.moldid, should be reg.modc, imo, but that's not a biggy right now. I'll pass it up the chain to some friends. See what they say.

Would you post up a few lines of text from the m file, so I know I've got the right data for it?

Edited 6 Years Ago by Adak: n/a

All the pointers in the reg array end up pointing to the one and only buffer line that you have. So, every time fgets() grabs new data, any pre-set pointers in the array automatically point to the newly fetched data - the previous data is simply overwritten.

Edited 6 Years Ago by mitrmkar: typo

I don't have the "m" file, but this works via keyboard entry.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

struct estr_reg {
  char *molid;
  char *molc;
};

int main() {

  FILE *M;
  char *ptr;
  char line1 [ 540 ]={""};
  int i=0, numEntries;
  struct estr_reg reg[6];

  //M=fopen( "m", "r");
  printf("\n Enter a number followed by a semi-colon, and then a name.");
  printf("\n Enter q to quit entry. \n");
  while ((fgets(line1, sizeof line1, stdin)) != NULL )
  {
    if(line1[0]=='q')
      break;
    ptr=strtok(line1,";");
    strcpy(reg[i].molid, ptr);
    ptr=strtok(NULL,";");
    strcpy(reg[i].molc, ptr);
    i++;
  };
  //fclose ( M );
  numEntries=i;
  for(i=0;i<numEntries;i++)
    printf("%s %s", reg[i].molid,reg[i].molc);
  printf("\n\n\t\t\t    press enter when ready");
  i=getchar();
  return 0;
}

but there are several lines in the input file M...

0;13215447
1;3270197991
....
....

as I said, I usually use perl, not C. may be I am confused because of that

Paula - it's OK. I had to change your program because I didn't have the "m" file. The logic is now OK, so just use your lines of code that I //marked out, and go ahead and mark out (or delete as you wish). My lines of code that relate to the keyboard entry.

Replace "stdin" in the fgets() statement, with M, and remove the // from the fopen() line of code.

If you have any other questions, ask away.

adak, I tried your code, but it gives a segmentation fault (both trying your code directly or modifying it to use the M file)

Adak forgot to allocate memory to molid and molc. For testing purposes, just make those large arrays and see if it works better:

struct estr_reg {
  char molid[BUFSIZ];
  char molc[BUFSIZ];
};

Post the code you're using right now, and a sample file.

[edit]
Here's another sample to test with:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct estr_reg {
    char *molid;
    char *molc;
};

char *clone_str(const char *s)
{
    char *p = malloc(strlen(s) + 1);
    
    if (p != NULL)
        strcpy(p, s);

    return p;
}

int main(void)
{
    FILE *in = fopen("data.txt", "r");

    if (in) {
        struct estr_reg records[10];
        char line[BUFSIZ];
        size_t i, n = 0;

        while (n < 10 && fgets(line, sizeof line, in) != NULL) {
            line[strcspn(line, "\n")] = '\0';
            records[n].molid = clone_str(strtok(line, ";"));
            records[n].molc = clone_str(strtok(NULL, ";"));
            ++n;
        }

        fclose(in);

        for (i = 0; i < n; i++) {
            char *molid = records[i].molid != NULL ? records[i].molid : "(null)";
            char *molc = records[i].molc != NULL ? records[i].molc : "(null)";

            printf("%s\t%s\n", molid, molc);

            free(records[i].molid);
            free(records[i].molc);
        }
    }

    return 0;
}

[/edit]

Edited 6 Years Ago by Narue: n/a

thank you now it works. i have another problem, if the file i am reading is bigger (1000 records) i get a segmentation fault. i am trying big files because i will have to read 800000 records aprox

i am sorry, i made a mistake, it gives a segmentation fault because i am trying to read a file of 156 M aprox, how can i do?

You're trying to keep 800000 records in memory at once? I strongly suspect that's unnecessary.

this is what I have to do: I have a very long file (800000 lines), each line containing an identifier and 16 long integer numbers.

line 0: id(0), n1(0),n2(0),...n16(0)
line 1: id(1), n1(1),n2(1),...n16(1)
...
line 800000: line 1: id(800000), n1(800000),n2(800000),...n16(800000)

I have to take the first line, perform a "binary and" between the first number of line 0 and the first number of line 1, etc:
n1(0) & n1(1)
n2(0) & n2(1)
...
n16(0) & n16(1)

etc, and perform this for all the possible combinations of lines (to compare line0 with line1 is the same as comparing line1 with line0, so once I did 0-1 I don't have to do 1-0).

Then I have to do other things but that is basically why I was trying to keep everything in memory

this is very easy to do with perl, the problem is that I started running the script and I realized that it will take 50 days to finish, and I was told that these kind of binary things are much faster with C

I think your issue will be more algorithm-based than language. How are you generating the combinations?

malloc (and calloc), are slow, compared to static arrays. But if you need a BIG array, then it should be the way to go.

You have to consider:

1) there are advantages to allocating memory all at once, instead of "every time through the loop, allocating just enough for one record".

2) test your system, and see how large you can allocate your array, without slowing down your whole system.

3) can you trim down the number of records you need to have in memory at one time, to work within the size you saw in your test in #2?

4) C is faster. 4.6 X faster than C# in a test involving mass testing a few hundred thousand tuples. (Where tuples with 2 matching values had to be found from amongst all the other tuples).:

http://cboard.cprogramming.com/c-programming/129611-c-real-life.html

is a summary of the 5 page thread (which is a fun read itself, and also linked).

WAY faster than Python and other interpreted languages, in general.

Each idea with merit, should be tested.

Can you give more details of what the program is doing, and maybe a sample data file for testing?

Perhaps post a file that takes 5 seconds or so to process in Pearl. Then we can compare that with what we find best, for C.

If you post the data file to Swoopshare (a free file depot), and then post the url to that file, size should not be a problem.

Edited 6 Years Ago by Adak: n/a

Narue: in C I didn´t get to the point of generating the combinations yet. In perl, I am doing a double for loop, where the inside loop is forced to have a starting value that is equal than the outer loop, in this way I can perform all the combinations without repeating them

for (i=0, i<x,i++) {
for (j=i,j<x,j++) {

That's how we programmed the mass tuples checking program, also. It worked very well in our tests.

j should be assigned the value of i+1, rather than i, and i should have a terminating test of i<x-1, instead of i<x.

Have you found out what your largest array you could allocate, yet?

no, i didn't do it yet, i will post it as soon as i have it.

j is not i+1, because I am also interested in the combination of a line with itself

thank all of you for your ideas =)

This article has been dead for over six months. Start a new discussion instead.