**import a data table with space?**

Good day!
I need to read all the data in column two of a file and export it to another file. The issue is that the data is not standardized and has blank fields from time to time. It also contains placeholder values like "N/A" that I want to delete. So I just need some direction. C... is a fickle beast ya know.

The reason is that I need to import a printer file and process the information to export only the document numbers. For example...

10 HIHM07A1 10 Hj NDt
20 N/A 0 NONE PRt
3.4 JJKKM090 DDR

So fscanf will read it in and allow some fun; however, it gets very messy with those spaces. I need it to read in all of column two and export them in a list.

Here is where I am so far.

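FILE *fp;
char in1[50], in2[50], in3[50], in4[50], in5[50];

fp = fopen("nums.txt","r");
if (fp == NULL) {
    perror("nums.txt");
    return 1;
}

/* Test fscanf's return value instead of feof(), which only turns true after a read has failed */
while (fscanf(fp,"%49s %49s %49s %49s %49s",in1, in2, in3, in4, in5) == 5) {
    printf("%s %s %s %s %s\n",in1, in2, in3, in4, in5);
}
fclose(fp);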

Thanks all

Shane

C... is a fickle beast ya know.

Sounds more like your file format is inconsistent. Any language will be fickle when the incoming data is difficult to parse. Do you have a way of determining whether column 2 will contain embedded whitespace? That's the key question here.

Are the fields fixed width, or delimited in any way? If fixed, then the solution is simple. If delimited, it is a bit more difficult, but not very. If neither of those situations is true, then it becomes quite a bit more difficult, especially if more than one field can be blank. If only one field can be blank, then the solution is not simple, but not particularly hard. Please explain which of these scenarios is the case and we can much better advise you how to deal with this situation.

FWIW, I have to deal with this cruft daily, writing tools to process log data that can change at any time.

Thanks all

Let me try to show you the incoming data better; the original post slammed it together. It is inconsistent, but it holds a level of consistency within a few cases that I am hoping can be corrected.

Example file: print.txt

1234234     HHHII009     DDD   RET   95 //standard line
3332245     N/a          RER            //no image to print
4324345                  rer   889   98 // no image to print
4432443     JJHN9907                    // image without additional intel

Again, the logic I am hoping for:

Import the data into an array for each field (import a space char when blank).
Export array 2, holding only the print files to be printed.
Process it to delete "N/a", "none" and the space char.

Leaving a list, e.g.

HHHII009
JJHN9907

That looks like a fixed width format, so you can probably grab everything from x column to y column and call it good.

Thanks decep, I've taken some time and tried to figure out how one would do just that... being a newb... I simply don't know. Would you be so kind as to tell me a bit more about that?

Thank you!

A fixed width format doesn't change how you read and write files. Extract a whole line, then parse the array to get the data you want. A fixed width format actually makes that easier because now you don't have to search for field boundaries; you have indexes and can simply check that they exist:

char doc_number[DOCNUM_END - DOCNUM_START + 1];
char line[LINE_WIDTH + 1]; /* LINE_WIDTH counts the trailing '\n'; +1 for '\0' */

if (fgets(line, sizeof line, in) != NULL) {
    /* Validate the line before parsing field data from it */
    if (strlen(line) < LINE_WIDTH || line[LINE_WIDTH - 1] != '\n') {
        /* Handle a bogus line; error or skip parsing */
    }

    /* Extract a document number: [DOCNUM_START,DOCNUM_END) */
    doc_number[0] = '\0';
    strncat(doc_number, &line[DOCNUM_START], DOCNUM_END - DOCNUM_START);
}

Some formats allow blank lines, categories where the format changes, or comments, so it's all very case-specific. But it all comes down to a few basic techniques of which the above is the most important.

Edited 3 Years Ago by deceptikon

Had some junk happen and I had to put this problem on the back burner. Well, I am back and spent some time playing with ways to move around the incoming file to extract just the data I needed. I have the code running and it works!... once... however, it does not loop like I want. The code that I will paste has to offset the first time... and then it can run a loop until EOF. Can someone take a look and see why this is just looping once? Thank you!!!!

//prep work
while ((c = fgetc(fp)) != EOF) {
    //skip this space
    while (x <= 48) {
        c = fgetc(fp);
        before++;
        x++;
        continue;
        if (c == '\n') {
            count++;
            continue;
        }
        //c = fgetc(fp);
    }
    //print this space
    while ((x >= 48) && (c != EOF) && (x <= 58)) {
        x++;
        c = fgetc(fp);
        if (c == '\n') {
            count++;
            //break;
        }
        if (c == ' ') {
            space++;
            //break;
        }
        else {
            printf("%c", c);
        }
    }
    //skip this space
    if ((x >= 58) && (y >= 0) && (y <= 179)) {
        c = fgetc(fp);
        y++;
        dump++;
        if (c == '\n') {
            count++;
            //continue;
        }
        //new line
        if (y == 155) {
            printf("\n");
            y++;
            //continue;
        }
    }
    //print this space
    if ((y >= 179) && (c != EOF) && (y <= 200)) {
        y++;
        c = fgetc(fp);
        if (c == '\n') {
            count++;
            //break;
        }
        if (c == ' ') {
            space++;
            //break;
        }
        if (y == 200) {
            y = 0;
            //break;
        }
        else
            printf("%c", c);
        //continue;
    }
}

I am using counters... x and y to control what the code does with the characters.

I think it has to do with the "y" that I am using as the new counter of input characters from the file. I thought resetting it would make it jump back up and go again... not sure, and I have tried other ways etc... I got nothing.

Thanks again

One concern I have with your code is that x doesn't appear to be reset, so once it has passed 58 the two while loops that depend on it never run again after the first pass.

I don't believe you understood the wisdom of deceptikon's post, right above yours. Your code looks positively PAINFUL. You don't need to work so hard, dude.

lol! thanks Adak.

I wish I had the ability to see that... I shall try again. The joys of teaching yourself your first language.

Well... anyone else want to take a shot at helping me solve this one?

:)

I'd start over from scratch with this strategy:

  1. Load the entire file into an array of strings, starting a new string at every newline character found. Don't bother with scanning it or doing anything else other than loading those lines into memory. If you do this, it lessens the complexity of the code and makes it easier for you to debug.

  2. Since you are dealing with fixed width, if you don't care too much about the data quality you can simply write null characters ('\0') at specific positions of each line of the loaded file. For instance:

Here is an example of a fixed-width file:

13  JOHN SMITH     $ 86.20
8   MARY JANE      $112.00
443 RON JOHNSON    $  1.55

The asterisks mark where you will put the null characters:

13 *JOHN SMITH    *$ 86.20
8  *MARY JANE     *$112.00
443*RON JOHNSON   *$  1.55

So if you do that to each line in strategic places, you can then access each field like this:

typedef struct {
    char *Field;
} FieldObject;

typedef struct {
    FieldObject Fields[3];
} LineObject;

typedef struct {
    char *Line;
} BufferedFileObject;

LineObject         Lines[3];
BufferedFileObject LoadedFile[3];

// Load Fields into structures
int i = 0;
for (;i<3;i++) { // Per line
    Lines[i].Fields[0].Field = &LoadedFile[i].Line[0];
    Lines[i].Fields[1].Field = &LoadedFile[i].Line[4];
    Lines[i].Fields[2].Field = &LoadedFile[i].Line[19];
}

// Now you can access each field this way
printf("%s\n", Lines[1].Fields[1].Field); // Prints "MARY JANE     "

Note that this only works if the fixed-width layout leaves at least one character of padding after each field where a null character can reside. If that's not the case, you will need to use memcpy() to copy each field into a malloc'd string, append a null character to the end of it, and then store that pointer in each Field.

Don't forget to free all memory that you malloc.

Edited 2 Years Ago by N1GHTS: Wrong symbol
