Is there a way to read a file character by character for letters, but read each group of digits as a whole number?

bob 4567
joe 39083
sara 4239824

That is my file

while ((c = fgetc(pFile)) != EOF)
{
    //my work
}

I know this works on my chars. This won't read numbers in the way I want. This reads b,o,b,space,4,5,6,7. I want it to read b,o,b,space,4567. The reason for this is I will be adding the 4567 plus 39083 plus 4239824 eventually.

For numbers you have to do a little more processing -- convert the ASCII value of the char to its numeric value, then append it to an integer as the next decimal digit. This works for the English language and the standard ASCII character set.

I have no idea what you intend to do with the number, but whatever it is you would do it when a space is encountered.

Note: You may have an end-of-file problem with this code because it might ignore the last number in the file if no whitespace follows it. It's a lot easier to read the file one entire line at a time and then split it in memory.

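int num = 0;
while ((c = fgetc(pFile)) != EOF)
{
   if( isspace(c) )
   {
      // If a number was being read it is complete here: use num before resetting it
      num = 0;
   }
   else if( isdigit(c) )
   {
       // Shift the digits read so far one decimal place left, then add the new digit
       num = num * 10 + (c - '0');
   }
   else
   {
       // neither a space nor a digit
   }
}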

Edited 2 Years Ago by Ancient Dragon

You need to build up the number as digits are read, but not if non-digits are read. The trick is handling input so that you don't miss anything such as a number at the end of the file, which means not using a comparison against EOF as the loop condition:

#include <stdbool.h>
#include <ctype.h>
#include <stdio.h>

int main(void)
{
    FILE *in = fopen("input.txt", "r");

    if (in != NULL)
    {
        bool in_number = false;
        long long num = 0;

        while (true)
        {
            int ch = getc(in);

            if (!isdigit(ch))
            {
                if (in_number)
                {
                    printf("%lld\n", num);

                    // Reset for the next number
                    in_number = false;
                    num = 0;
                }
                else if (ch != EOF)
                {
                    printf("%c\n", ch);
                }

                if (ch == EOF)
                {
                    break;
                }
            }
            else
            {
                num = 10 * num + (ch - '0');
                in_number = true;
            }
        }

        fclose(in);
    }

    return 0;
}

However, there's still a problem. If you truly want to store the number as an integral value, you have to consider cases where the value will overflow. I chose to use a long long to limit the chance of that, but it still exists in an arbitrary file.

Doing an overflow check manually is harder than it might seem, especially when using the largest possible integer data type. A safer approach might be to store the number as a dynamic string and then use something like strtoll to do the conversion and error checking for you. I'll leave that as an exercise for you because it's what I'd consider to be the ultimate solution.
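As a rough sketch of the conversion half of that idea only: the helper below assumes the digits have already been collected into a null-terminated string, and parse_number is a name made up for this example. Growing the string dynamically while reading is not shown.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: converts the text to a long long with strtoll.
   Returns false if nothing could be converted, if unconverted
   characters remain, or if the value is out of range. */
bool parse_number(const char *text, long long *out)
{
    char *end;

    errno = 0; /* strtoll only sets errno on failure, so clear it first */
    long long value = strtoll(text, &end, 10);

    if (end == text || *end != '\0')
    {
        return false; /* No digits at all, or junk after the digits */
    }

    if (errno == ERANGE)
    {
        return false; /* Too big or too small for a long long */
    }

    *out = value;
    return true;
}
```

The caller decides what to do on failure, for example skipping the record or reporting the line, which is the part that a hand-rolled overflow check tends to get wrong.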

You appear to be processing a file that contains a list of one-line records. In your case the records are simple "<name> <number>" pairs; however, I would always approach this by reading the entire record before processing it.

while(Read Line From File = Success)
{
    Process Line
}

You could use fgets to read the line from the file but this does have the limitation that you need to know your maximum line size ahead of time. Alternatively you could create your own function that reads a line of any size using fgetc.

Again, I would write a function to process the line once it has been read, splitting the processing of the data from the reading of the data from a file. Then if at a later date you need to obtain the data from a different source, say a TCP/IP link, you can still use the same processing function.
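A minimal sketch of that split, assuming a maximum line length of 256 and that each line holds at most one number. The names process_line, sum_file and MAX_LINE are invented for this example and are not from the original code.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LINE 256 /* Assumed maximum line length */

/* Hypothetical processing step: knows nothing about files.
   Adds the first number found on the line, if any, to *total. */
void process_line(const char *line, long long *total)
{
    const char *p = line;

    while (*p != '\0' && !isdigit((unsigned char)*p))
    {
        ++p; /* Skip the name and the space */
    }

    if (*p != '\0')
    {
        *total += strtoll(p, NULL, 10); /* Overflow is not checked here */
    }
}

/* Hypothetical reading step: the only part that touches the FILE. */
long long sum_file(FILE *in)
{
    char line[MAX_LINE];
    long long total = 0;

    while (fgets(line, sizeof line, in) != NULL)
    {
        process_line(line, &total);
    }

    return total;
}
```

Lines without a number are simply skipped. A line longer than MAX_LINE would be delivered in pieces by fgets, which is the limitation mentioned above.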

@Ancient Dragon
Can you tell me what the single quotes are for with '0'? I was using an int. I thought the single quotes are for char.

I intend to add the numbers then store them in an array. For the char I intend to add the ascii value of each char and mod them, then store them in an array.

How would I go about reading an entire line at a time? Not all lines will have numbers so is that possible?

Can you tell me why this caused a big mess? My plan was just to add a single number at a time to an array then use atoi to convert it to an integer. Unfortunately I was never able to get to that step. I tried to use the simple number of 25. Unfortunately it seems fgetc really doesn't like numbers. Somehow it turned 25 into 2,50,5,53. I really don't understand how this is possible. I tried with single quotes around my 0 and 9 since I have seen it done both ways.

while ((c = fgetc(pFile)) != EOF)
{
        printf(" C is %c .\n", (char)c);
        printf(" C is %d .\n", c);
        if (c >= '0' && c <= '9' )
        {
            //next one
            printf("third if.\n");
            char int_holder[100];
            int i = 0; 
            while(c >= '0' && c <= '9')
            {
                //start
                int_holder[i++] = c;
                //printf("i is %d \n", i);  
            }
            sum = atoi(int_holder); 
            printf("The sum is %d \n", sum);
            int_holder[i++] = '\0';
            printf("int_holder is %s \n", int_holder); 
        }
}

@deceptikon
Why are you using getc? Does it play nicer with numbers than fgetc? Can you explain this '0'? Why do you need single quotes when num is an int? What does this line do?

num = 10 * num + (ch - '0');

@Banfa
I assume you mean read line by line? How would you go about doing this? Would you use fscanf? I'm not a fan of fscanf since I always seem to mess it up :(. My file will be random. It will have characters first and sometimes there will be a group of numbers after it and sometimes there won't be a group of numbers after it.

If you mean the '0' in the expression (c - '0'):

It is subtracting the ASCII value of '0' (which is 48 in any ASCII table) from whatever is in c, converting the ASCII digit character to its numeric value. If the value of c is '0' then the expression c - '0' is 48 - 48 = 0.
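Here is a small sketch of that conversion applied to a whole run of digits, assuming an ASCII character set for the values quoted in the comments. The function name digits_to_int is made up for illustration.

```c
#include <ctype.h>

/* Hypothetical helper: builds an int from the leading digit characters of s. */
int digits_to_int(const char *s)
{
    int num = 0;

    while (isdigit((unsigned char)*s))
    {
        /* In ASCII '2' is 50 and '0' is 48, so '2' - '0' is 2 */
        num = 10 * num + (*s - '0');
        ++s;
    }

    return num;
}
```

This is also why printing the same character with %c shows 2 while %d shows 50: the file holds the character '2', whose ASCII code is 50, not the number 2.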

For the char I intend to add the ascii value of each char and mod them, then store them in an array.

Why?? mod will not convert from ascii to numeric.

Not all lines will have numbers so is that possible?

Use fgets() to read the entire line, then parse it. If some lines have numbers and other lines don't, then using scanf() will not work because the conversion will fail when %d is encountered and there are no digits to read. You'll just have to parse it manually to find out if a character is a numeric digit or something else.

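The inner while loop you posted (line 11 of your snippet) doesn't work because the value of c never changes inside it, so the loop never ends and keeps writing past the end of int_holder. Don't blame fgetc() for the bugs in your code.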

Edited 2 Years Ago by Ancient Dragon

@COKEDUDE

I would never use fscanf, scanf, sscanf, vfscanf, vscanf, vsscanf or any other scanf derivative on principle (the principle being that they are a nightmare, cause problems, and process white space for you). On the other hand, I generally try to avoid writing software that has any human interaction, since humans are just messy and unreliable.

As I said in my original post, if you are willing to accept the limitation of a maximum line length then I would use fgets to read the lines of text, and that would certainly do to start with. If you are unable to live with that limitation then you should write your own function that, given a file pointer, gets a line from the file using fgetc (or getc, the only difference being that the platform can choose to implement getc as a macro).

If you wanted to future-proof yourself you could encapsulate the call to fgets in another function, and then if you need to move to a different implementation at a later date you can do it just by changing your function implementation without changing the rest of the code; e.g.

#include <stdio.h>
#include <stdlib.h>

// Encapsulate fgets so that we can change the implementation later if required
char *getLine(FILE* file) // Returns an allocated buffer so remember to free it, returns NULL if end of file
{
#define MAX_LINE_LENGTH 500
  char *buffer;

  buffer = malloc(MAX_LINE_LENGTH);

  if (buffer != NULL)
  {
    char *result = fgets(buffer, MAX_LINE_LENGTH, file);

    // If we got to the end of file (or a read error), free the buffer to avoid a memory leak and return NULL
    if (result == NULL)
    {
      free(buffer);
      buffer = NULL;
    }
  }

  return buffer;
}
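If the fixed maximum ever becomes a problem, the same interface could be backed by fgetc instead. What follows is only a rough sketch under a few assumptions: the name getLineAnyLength and the starting size of 64 are arbitrary, and unlike fgets it drops the trailing newline.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant of getLine: reads a line of any length with fgetc.
   Returns an allocated buffer (caller frees) without the trailing newline,
   or NULL at end of file or if memory runs out. */
char *getLineAnyLength(FILE *file)
{
    size_t capacity = 64; /* Arbitrary starting size */
    size_t length = 0;
    char *buffer = malloc(capacity);
    int ch;

    if (buffer == NULL)
    {
        return NULL;
    }

    while ((ch = fgetc(file)) != EOF && ch != '\n')
    {
        if (length + 1 >= capacity)
        {
            /* Double the buffer, keeping room for the terminator */
            char *bigger = realloc(buffer, capacity * 2);

            if (bigger == NULL)
            {
                free(buffer);
                return NULL;
            }

            buffer = bigger;
            capacity *= 2;
        }

        buffer[length++] = (char)ch;
    }

    if (ch == EOF && length == 0)
    {
        free(buffer); /* Nothing left to read */
        return NULL;
    }

    buffer[length] = '\0';
    return buffer;
}
```

Because it has the same shape as getLine (takes a FILE pointer, returns a buffer to free or NULL), the calling loop would not need to change.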

Why are you using getc? Does it play nicer with numbers than fgetc?

They do exactly the same thing. The only difference is that getc may be implemented as a macro, which in most cases makes no functional difference. Generally it only matters if the argument has side effects or you're trying to store a pointer to a function. In those two cases, fgetc should be used instead.
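As a small illustration of the function pointer case, here is a sketch in which the reader is passed in as a parameter. The name count_chars is invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the characters left in the stream,
   using whichever reader function the caller passes in. */
long count_chars(FILE *in, int (*reader)(FILE *))
{
    long count = 0;

    while (reader(in) != EOF)
    {
        ++count;
    }

    return count;
}
```

A call such as count_chars(in, fgetc) is safe because fgetc is always a real function. The side effect case is something like getc(files[i++]), where a macro version of getc might evaluate its argument more than once.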

Edited 2 Years Ago by deceptikon

@Ancient Dragon

The reason for modding it is I want to use a hash table to check if I have already stored a value. If I have, I will need to deal with collisions. I'm trying to break this into small pieces :).
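Roughly what I have in mind, as a sketch only. The table size of 101 and the name hash_name are placeholders, and the value in the comment assumes ASCII.

```c
#include <stddef.h>

#define TABLE_SIZE 101 /* Placeholder table size */

/* Hypothetical hash: adds up the character codes of the name,
   then mods the sum by the table size to get an array index. */
size_t hash_name(const char *name)
{
    size_t sum = 0;

    while (*name != '\0')
    {
        sum += (unsigned char)*name;
        ++name;
    }

    return sum % TABLE_SIZE; /* "bob" is 98 + 111 + 98 = 307, and 307 % 101 is 4 */
}
```

Names with the same letters in a different order, such as "bob" and "obb", land on the same index, so that is one kind of collision I will have to handle.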

This was the output after I read 25 from a file. I was planning to fix the while loop after I fix the reading issues. I don't understand how 25 turns into 2,50,5,53.

 C is 2 .
 C is 50 .
 C is 5 .
 C is 53 .
 C is 

@Banfa
500 sounds good to me :). It's also good that you can reuse the code.

@deceptikon
Thank you for explaining that.
