Parsing a String into Tokens Using strcspn 1

Dave Sinkula

Many times strtok is recommended for parsing a string; I don't care for strtok. Why?

  • It modifies the incoming string, so it cannot be used with string literals or other constant strings.
  • The identity of the delimiting character is lost.
  • It keeps its parsing position in internal static state, so it's not reentrant (two parses cannot be interleaved).
  • It does not correctly handle "empty" fields -- that is, where two delimiters are back-to-back and meant to denote the lack of information in that field.

This snippet shows a way to use strcspn to parse a string into fields delimited by a character (a semicolon in this case, but commas or tabs or others could be used as well).

See also Parsing a String into Tokens Using strcspn 2 and Parsing a String into Tokens Using strcspn 3.

Thanks to dwks for asking why not to use strtok.

#include <stdio.h>
#include <string.h>

int main(void)
{
   static const char filename[] = "file.txt"; /* the name of a file to open */
   FILE *file = fopen(filename, "r"); /* try to open the file */
   if ( file )
   {
      char line[BUFSIZ]; /* space to read a line into */
      int k = 0;
      while ( fgets(line, sizeof line, file) ) /* read each line */
      {
         int i;
         char *token = line; /* point to the beginning of the line */
         printf("line %d:\n", ++k);
         for ( i = 0; *token; ++i ) /* loop through each token on the line */
         {
            /* search for delimiters */
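            size_t len = strcspn(token, ";\n");
            /* print the found text: use *.* in format to specify a maximum print size */
            printf("token[%2d] = \"%*.*s\"\n", i, (int)len, (int)len, token);
            /* stop if the text ran to the end of the string (e.g. a last line with
               no newline), so the pointer never moves past the terminator */
            if ( token[len] == '\0' )
            {
               break;
            }
            /* advance pointer by one more than the length of the found text */
            token += len + 1;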
         }
      }
      fclose(file);
   }
   return 0;
}

/* file.txt
2;31May2005;23:59:00;100.2.1.12;log;accept;;eth9;inbound;VPN-1
FireWall-1;;100.1.20.130;172.30.7.114;icmp;191;;;8;0;;;;
*/

/* my output
line 1:
token[ 0] = "2"
token[ 1] = "31May2005"
token[ 2] = "23:59:00"
token[ 3] = "100.2.1.12"
token[ 4] = "log"
token[ 5] = "accept"
token[ 6] = ""
token[ 7] = "eth9"
token[ 8] = "inbound"
token[ 9] = "VPN-1"
line 2:
token[ 0] = "FireWall-1"
token[ 1] = ""
token[ 2] = "100.1.20.130"
token[ 3] = "172.30.7.114"
token[ 4] = "icmp"
token[ 5] = "191"
token[ 6] = ""
token[ 7] = ""
token[ 8] = "8"
token[ 9] = "0"
token[10] = ""
token[11] = ""
token[12] = ""
token[13] = ""
*/
kokopo2 0 Newbie Poster

Hi guys, I'd like to thank everyone who has helped me so far. I managed to come up with some code to find an invalid string, e.g. -8ha.
However, I'm having problems getting it to report an invalid string only once. Also, if there is a number in the invalid string, it prints the number too.

Also, if the token is a negative number, the program reports it as an invalid string too.

/*the output is .
Contents of file : 12 3 + * -79 -8ha

12 number

3 number

+ not number

* not number

-79 invalid
invalid
invalid
number

-8ha invalid
invalid
number
*/

I managed to print the positive numbers and operators correctly,
but I just can't seem to print the negative numbers and invalid strings correctly.
Can someone help, please?
Thanks in advance...

#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>



int main()
{
  FILE * mFile;
  char string[100]; /* Input line from file           */
  char st[100];     /* Temporary storage for the line */

  char nt[4];
  char *tokenPtr;

  mFile = fopen ("myfile.txt" , "r");
  if (mFile == NULL) { /* If there was a problem, show the error */
    perror ("Error opening file");
  }
  else {
    fgets (string , 100 , mFile);
    printf("The original line: >%s<\n", string);

    strcpy(st, string);

    tokenPtr = strtok(st, " \n");

    while( tokenPtr != NULL ){
      printf("%s\n", tokenPtr);

		for( int x = 0; tokenPtr[ x ] != '\0'; x++ )
		{
			
			/* to find invalid string, e.g. -8ha*/
   			if( isalpha( tokenPtr[ x ] ) )
       		int a;
			 a = 1;


				if( isdigit( tokenPtr[ x ] ) )
				int b;
				b = 1;

					if( a == 1 && b == 1)
						{
							printf("invalid\n");
						}
		}

		/*to find number range from negative to positive and operator*/
				int i;
			i= atoi( tokenPtr );
				if( i != '\0' )
					printf(" Number\n");
				else
					{
						printf("not a number\n");
					}


      tokenPtr = strtok(NULL, " \n");
      }

  }
return 0;
}
prasannalakshmi 0 Newbie Poster

printf("token[%2d] = \"%*.*s\"\n", i, (int)len, (int)len, token);
What does it mean? Please explain it to me.

Triztian 0 Light Poster

May I ask, instead of just printing the tokens, how would we assign them to a variable?

Aia 1,977 Nearly a Posting Maven

May I ask, instead of just printing the tokens, how would we assign them to a variable?

Follow the link to part three.
