I am working on a program in ProC that uses fscanf to read a CSV file. This is a naive CSV file, with no commas allowed in the strings, and it is packed, with no whitespace between the fields. I read in a header record, which has some important information, including the number of detail records in the file. I use the number of detail records to read the rest of the file, in a for loop, one detail record at a time. I use fgets to clean out each read-in line, before the next read of the file.

The scan set I am using to read each string is %[^,\n]%c%, which works fine as long as every field has a value. If there is a blank field, such as ,, fscanf then stops reading the data in the line. This is the only scan set I could get to work with this packed, naive CSV file, but it has to be able to handle blank fields. A sample file I am trying to read is:

HDR,1,ACC,ES,427,FRB427ESACC20120731020000,31/07/2012,2:00:00,clientes6@atesa.es,,,,,,
DET,1420065,ES,427,VIAJES EL CORTE INGLES S.A.,C/MAYOR 68,ALCANTARILLA,30820,MURCIA,,968895895,968893770,,A28229813,N

The header record is the first line, and that reads just fine. The second and third line, starting with the field DET, is the one detail record in this file. It reads the data without problem until the field after MURCIA, which is a blank field. After that, fscanf does not read any of the data in the rest of the record. A sample of the debug output follows:

Header record type is HDR
record count is 1
fileType is ACC
recordType value is DET
accountID is 1420065
ctryCode is ES
partnerCde is 427
acctName is VIAJES EL CORTE INGLES S.A.
addr1 is C/MAYOR 68
addr2 is ALCANTARILLA
addr3 is 30820
addr4 is MURCIA
addr5 is
telephoneNum is
faxNum is
emailAddr is
accountVATNum is
read_validate_DET:Invalid status value of @}Ð sent for account 1073904936, must be N or S.

The status field that is the final error message is for the last field in the detail record, which is properly set to N, but fscanf cannot read it.

Any help with this is greatly appreciated.

Unfortunately, this isn't the kind of variation that scanf() is designed to handle gracefully. A scanset conversion has to match at least one character, so when scanf() hits the delimiter before saving any characters for a field, the conversion fails and scanf() stops processing the rest of the format string. That is why every field after the first blank one is left unread.

You'll need to check for two adjacent commas outside of scanf() and recognize that as a blank field, then clean up the source stream such that scanf() doesn't fail to match the string. For example using sscanf() (such as if you're pulling the line first with fgets()):

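#include <stdio.h>

int main(void)
{
    const char *src = 
        "DET,1420065,ES,427,VIAJES EL CORTE INGLES S.A.,"
        "C/MAYOR 68,ALCANTARILLA,30820,MURCIA,,968895895,"
        "968893770,,A28229813,N";
    char field[1024];
    size_t pos = 0;
    int n = 0; /* %n stores into an int, not a size_t */

    /* %n directly follows the scanset, so n is set on every successful match,
       including the last field, which has no trailing delimiter */
    while (sscanf(src + pos, "%1023[^,\n]%n", field, &n) == 1) {
        printf("'%s'\n", field);

        pos += n;

        /* Step over the comma that ended this field */
        if (src[pos] == ',')
            ++pos;

        /* Each further adjacent comma marks a blank field */
        while (src[pos] == ',') {
            puts("Blank field");
            ++pos;
        }
    }

    return 0;
}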

And another using the slightly more awkward (in my opinion) get/unget method if you're reading directly using fscanf() or scanf():

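#include <stdio.h>

int main(void)
{
    char field[1024];
    int ch;

    /* %*c discards the delimiter (',' or '\n') that ended the field */
    while (scanf("%1023[^,\n]%*c", field) == 1) {
        printf("'%s'\n", field);

        /* A delimiter immediately after a delimiter means a blank field */
        while ((ch = getc(stdin)) == ',' || ch == '\n')
            puts("Blank field");

        /* Put back the first character of the next field */
        if (ch != EOF)
            ungetc(ch, stdin);
    }

    return 0;
}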