i don't recommend atoi() or atol(). they do not handle invalid (non-integer) strings well at all. for instance, if you try

value = atoi("CowAndChicken");

the result will be zero (value = 0)

and if you try

value = atoi("0");

the result will also be zero (value = 0)

strtol() avoids this confusion: its second argument (endptr) reports where parsing stopped, so you can tell the difference between an actual zero and an invalid string. it will also convert number bases other than base 10. look up the strtol() function here: http://www.cplusplus.com/reference/clibrary/cstdlib/strtol/
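A minimal sketch of the distinction described above, assuming a hypothetical helper named parse_long (it is not part of the standard library or of anyone's posted code). It treats a string as valid only if strtol() consumed at least one character and stopped at the end of the string:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out if s is
   entirely a base-10 integer; returns 0 otherwise. */
static int parse_long(const char *s, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(s, &end, 10);
    if (end == s)          /* nothing was converted: not a number */
        return 0;
    if (*end != '\0')      /* trailing characters after the number */
        return 0;
    if (errno == ERANGE)   /* value does not fit in a long */
        return 0;
    *out = value;
    return 1;
}
```

With this, parse_long("0", &v) succeeds and stores 0, while parse_long("CowAndChicken", &v) fails, which is exactly the case atoi() cannot report.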

And I would use both "strtok" to parse the strings between the separator characters (which could be "/" or "-" or ".") in conjunction with "strtol", to properly convert each numeric string.

this is the best method, IMO, because it will allow you to handle conditions where you have extra spaces or invalid characters, or differently formatted month/date numbers.

for instance, consider all the differences between:

"01/01/10"

"1/1/10"

"1/01/2010"

"01 - 01 - 2010"

"1-1-10"

"01.01.2010"

etc.
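As a rough sketch of the strtok + strtol combination applied to the formats listed above: the helper name split_date, the 64-byte buffer, and the delimiter set "/-. " are assumptions made for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits text on any of "/-. " and converts the
   fields. Returns 1 only if exactly three numeric fields were found. */
static int split_date(const char *text, long fields[3])
{
    char buf[64];
    char *tok, *end;
    int n = 0;

    if (strlen(text) >= sizeof buf)
        return 0;                       /* too long to be a date */
    strcpy(buf, text);                  /* strtok modifies its input */

    for (tok = strtok(buf, "/-. "); tok != NULL; tok = strtok(NULL, "/-. ")) {
        if (n == 3)
            return 0;                   /* more than three fields */
        fields[n] = strtol(tok, &end, 10);
        if (end == tok || *end != '\0')
            return 0;                   /* field is not purely numeric */
        n++;
    }
    return n == 3;
}
```

Because strtok treats a run of delimiters as a single separator, "01 - 01 - 2010" and "1/01/2010" both yield the same three fields. Range checks and two-digit-year handling are deliberately left out of this sketch.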

What jephthah says is correct, but by not passing "cowandchicken" into atoi() you won't have a problem.


You need to look at the characters anyway. This is the verification phase of any conversion. Make sure you pass digits at the front of the string and you're fine. Anyone that would blindly pass a non-verified string into a conversion function gets what he deserves.
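One way to picture the "verify first, then convert" approach is the sketch below. The helper names all_digits and checked_atoi are hypothetical, and the check is stricter than "digits at the front": it assumes the whole string must be digits.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if s is non-empty and contains only
   decimal digits, 0 otherwise. */
static int all_digits(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        if (!isdigit((unsigned char)*s))
            return 0;
    }
    return 1;
}

/* Calls atoi() only on a verified string.
   Returns -1 for anything that is not a plain run of digits. */
static int checked_atoi(const char *s)
{
    return all_digits(s) ? atoi(s) : -1;
}
```

Note that this still does not guard against a digit string too long to fit in an int; that would need a length or range check as well.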

jephthah> And I would use both "strtok" to parse the strings between the separator characters
I don't care for strtok() much. It doesn't work with read-only strings such as the literal "this string", since it has to modify the given string to tokenize it.
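For comparison, a field scanner that never writes to its input can be sketched with strspn() and strcspn(). The helper name next_field is hypothetical; since it keeps its position in a caller-supplied cursor, it also has no hidden static state.

```c
#include <string.h>

/* Hypothetical helper: finds the next field at *cursor without writing to
   the string. Returns a pointer to the field and stores its length in
   *len, or returns NULL when no fields remain. */
static const char *next_field(const char **cursor, const char *delims, size_t *len)
{
    const char *start = *cursor + strspn(*cursor, delims);  /* skip separators */

    if (*start == '\0')
        return NULL;                   /* nothing left to scan */
    *len = strcspn(start, delims);     /* length up to the next separator */
    *cursor = start + *len;            /* resume after this field next time */
    return start;
}
```

The returned field is not NUL-terminated, so the caller works with the pointer and length (for example by copying the field into a small buffer before converting it).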

And that's why I wouldn't use it at all. How do you know the user entered 2/5/2000? Maybe he entered 2-5-2000, or 2.5.2000. If you have to do that much verification, you might as well use atoi() because you already know what is in the string. Or better yet, convert the values yourself like I suggested and actually learn something ;)

Aia brought up one of the reasons I recommended strtol() instead of strtok(). With strtol() you can make each of the conversions in one single line of code (one for each variable), and no loops.

i like how some people seem to assume that they can somehow guarantee the format and content of strings being input by an end user or some other uncontrolled process.

the fact is, nobody ever knows for certain what will wind up being passed in as a string argument. data can be corrupted even if everything else is "guaranteed"

that's why you need to use strtol() to convert numbers, rather than the non-error-checking atol/atoi

and strtok() is powerful enough to be able to handle multiple types of token delimiters. so, when combined with even a modest amount of error checking, strtok and strtol can make a nearly bulletproof string-to-numeric converter.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
#define MAX_DATESTR_LEN                16
#define DATESTR_DIVIDER_CHARACTERS     "/\\-_.,"  // allowable delimiters for month date year

int convertDateStringToVals (char*, const char*, int*, int*, int*);
int getUserInput(char*, int);

int main(void)
{
   char  dateStr[MAX_DATESTR_LEN + 1];   // +1: getUserInput() also writes the terminating '\0'
   int   inputLen = 0,
         month,
         date,
         year,
         isInvalid;

   while (1)
   {
      do {
         printf("\nEnter Date String : ");
         fflush(stdout);
         inputLen = getUserInput(dateStr,MAX_DATESTR_LEN);
         if (inputLen > MAX_DATESTR_LEN)
            printf("string length %d, max length is %d, try again.\n\n",inputLen, MAX_DATESTR_LEN);
      }
      while (inputLen > MAX_DATESTR_LEN);

      if (inputLen == 0)
         break;

      isInvalid = convertDateStringToVals(dateStr, DATESTR_DIVIDER_CHARACTERS, &month, &date, &year);
      printf("   month : %02d\n   date  : %02d\n   year  : %d\n", month, date, year);

      if (isInvalid)
         printf("WARNING:  one or more fields are invalid.\n");
   }

   printf("exiting.\n");
   return 0;
}


//*****************************************************************************************//
// convertDateStringToVals()
//
// (c) 2010 by Jephthah and distributed for general use under the WTFPL licence
//
// purpose: converts each field of the datestring (month, day, year) into integer values.
//          fields may be separated by one or more possible token separator characters
//          invalid non-numeric fields will be converted to -1, and return an error flag
//          out-of-range values will be left intact but will still return an error flag
//
// input:   dateString, the character string containing a valid date in mm/dd/yyyy or similar format
//          tokenSeparators, pointer to list of one or more possible field separator characters
//
// output:  month, date, year:  integer values representing each field.  the year, if less than 100,
//          will be converted to corresponding year 2000-2099, otherwise will be left intact.
//
// return:  0 if all fields were valid, -1 if one or more fields are invalid
//
//*****************************************************************************************//
int convertDateStringToVals (char * dateString, const char * tokenSeparators, int * month, int * date, int * year)
{
   char *tempDate, *mmStr, *ddStr, *yyStr, *ptr;
   int  error = 0;

   tempDate = malloc(strlen(dateString) * sizeof(char));
   strcpy(tempDate,dateString);   // copy dateStr into temp location that can be abused by strtok

   *month = *date = *year = -1;   // assume error unless found valid values
   
   mmStr = strtok(tempDate, tokenSeparators);
   ddStr = strtok(NULL,    tokenSeparators);
   yyStr = strtok(NULL,    tokenSeparators);

   if (mmStr != NULL)   // found token
   {
      *month = strtol(mmStr,&ptr,10);
      if (ptr == mmStr)   
         *month = -1;   // if ptr didnt move, token was not valid numeric
      else if (*month < 1 || *month > 12)
         error = -1;    // value numeric, but out of range
   }

   if (ddStr != NULL)  // found token
   {
      *date = strtol(ddStr,&ptr,10);
      if (ptr == ddStr)
         *date = -1;      // if ptr didnt move, token was not valid numeric
      else if (*date < 1 || *date > 31)
         error = -1;      // value numeric, but out of range
   }

   if (yyStr != NULL)  // found token
   {
      *year = strtol(yyStr,&ptr,10);
      if (ptr == yyStr)    
         *year = -1;    // if ptr didnt move, token was not valid numeric
      else if (*year < 0)  
         error = -1;      // value numeric, but out of range
      else if (*year < 100)
         *year += 2000;   // assumes 2-digit year is 21st Century    
   }

   if (*month < 0 || *date < 0 || *year < 0)
      error = -1;

   free(tempDate);

   return error;   // will be zero (0) if all fields are valid, -1 if not.

}

//*****************************************************************************************//
// GetUserInput()
//
// (c) 2010 by Jephthah and distributed for general use under the WTFPL licence
//
// purpose:gets user input, removes the newline, passes back results up to maximum
//         number of characters, flushes remaining characters from input buffer,
//         returns the number of total characters entered by user
//
// input:  maxStringLength is the maximum allowable characters to input by user
//
// output: returnStr, contains up to the maximum length of characters allowed that were
//         input by the user, any additional characters entered are lost
//
// return: total characters entered by user; caller should check that this value is equal
//         to or less than the maximum allowed to indicate valid string was input.  larger
//         value returned than was allowed by the input indicates characters were lost
//
//*****************************************************************************************//
int getUserInput (char * returnStr, int maxStringLength)
{
   char    *tempStr;
   int     maxLen, totalCount = 0;
   size_t  len;

   maxLen = maxStringLength + 2;  //account for NULL and /newline
   tempStr = malloc(maxLen * sizeof(char));  //temporary holder

   do {
      if (fgets(tempStr, maxLen, stdin) == NULL)  // EOF or read error:
         tempStr[0] = '\0';                       // treat as empty input
      len = strlen(tempStr);

      if (len > 0 && tempStr[len-1] == '\n')  // newline indicates end of string
      {
         tempStr[len-1] = '\0';   // delete it
         len = strlen(tempStr);   // and recalc length
      }
      totalCount += (int)len;
   }
   while ((int)len > maxStringLength);  // continue to flush extras if too long

   strcpy(returnStr,tempStr);     // copy temp string into output
   free(tempStr);              // and release memory

   return totalCount;   // may be more than the number allowed
}

Not entirely bullet proof since it will interpret my half a shopping list as a date

Enter Date String : 1)Fish.2)Chips.3
   month : 01
   date  : 02
   year  : 2003

Verifying input involves verifying no additional garbage was input too.

You have brown eyes, don't you?
My input is bulletproof and I've never used any strt??() functions. Get off your high horse and accept the fact that there are other ways than your pet ways. I'm not saying you shouldn't use strt??() functions, I explained why I don't. You, on the other hand, say you can't use ato?() which is crap.

We've been here before, J. I really don't need yet another PM apologizing for being hardheaded. Become a real programmer and embrace other opinions. Don't use them, but don't diss them either.

Cowandchicken??

nearly bulletproof

"nearly" is the key word. Murphy's Law always prevails.

but thanks for pointing this out. I realized that it would allow non-numerics after a valid number, but at some point i just got tired of "bulletproofing"

The rest is left as an exercise for the reader.

:P

Cowandchicken??

If you mean was that string tailored to the code after I had examined it then yes.

However, there is a serious, more general point here: if the validation does not check the exact format (with allowable variations), then you let in the possibility that the user types in something that was actually meant for a different field but happens to pass the loose verification of the field they typed it into, and the program has unwittingly let in incorrect input.

If the software is in everyday use then the chances of this happening go up, and if for some strange reason the cancel-input function of the software doesn't work and the software acts on the input even if cancelled, then it could be a disaster (don't laugh, I've seen it happen).

Not checking that there are no invalid characters between the end of the parsed input and the next separator is sloppy.

ato?() which is crap.

well, i'm not saying it's crap, per se. it is good for quick prototyping that doesn't require validation.

but atol/atoi is terribly weak and unable to distinguish between non-numerics and a legitimate zero value.

now if you need to preface your calls to ato? with some routine that identifies whether or not it's a number in the first place, you've kind of defeated the purpose of having an ascii/string-to-numeric converter, yes?

You have brown eyes, don't you? ... get off your high horse and ... Become a real programmer

really? is this where we're headed?

Is this nearly bulletproof?

/*  Does not verify date components (month <=12 etc), not 
    the purpose of  the task. 

    Handles 3 number fields (a date) separated by / - . and SPACE
 */
#include<stdio.h>
#include<ctype.h>
#include<stdlib.h>

int isSeparator(char ch)
{
    if (ch == '/')  return 1;
    if (ch == '-')  return 1;
    if (ch == '.')  return 1;
    if (ch == ' ')  return 1;
    if (ch == '\n') return 1;
    return 0;
}

int main()
{
    int date[3];
    int i;
    char buf[200] = {0};
    char *p;
    
    printf("Enter date: ");
    fgets(buf, 200, stdin);
    
    p = buf;
    i = 0;
    while (*p != '\n' && *p != 0 && i < 3 )
    {
        if (isdigit((unsigned char)*p))
        {
            date[i++] = atoi(p);
            while (isdigit((unsigned char)*p))  p++;
            if (isSeparator(*p)) p++;
              else              break;  // non-date character
        }
        else  break;                    // non-date character
    }
    if (i != 3) printf ("Illegal Date Format\n");
      else
      {
        printf("%2d/%02d/%04d ", date[0], date[1], date[2]);
      }
    return 0;
}

what is this mickey mouse bullshit, walt?

you go and pull my posts out of the middle of another thread, and use them to create a new thread in my name.... and then Double Post it, too?

you have too much time on your hands to be getting worked up into a state of such pettiness.

now if you need to preface your calls to ato? with some routine that identifies whether or not it's a number in the first place, you've kind of defeated the purpose of having an ascii/string-to-numeric converter, yes?

Not if I'm parsing a special format -- Date, Time, SKU, Phone#. To blindly pass an unknown string into a converter is bad form. Verify the syntax. Then you don't have problems converting.

yeah, well, okay. your code is great. i'll study it and learn. thank you sir.

meanwhile how about deleting the other double-post shenanigans you posted in my name.

kkthx.

No, I'm moving the useless crap out of a thread because it is of no use to the OP. Note I also left your post in that thread, too.

I am policing the threads as per my position. Sorry if your post was the first to be moved but the previous posts didn't warrant being part of the discussion.

And if it was pettiness, I'd just delete your posts and infract you. But instead I let the discussion continue away from the thread it was cluttering up. Not a good idea? Why not? Explain.

okay, i get it. your code is fine. it's concise, but it doesnt protect user input and it doesnt validate range.

here is my corrected version. i believe it's pretty solid

Banfa, good job on calling me out on being sloppy. i should have followed through. the correction is simple: change 3 lines in convertDateStringToVals (one each for mmStr, ddStr and yyStr), for example from

if (ptr == mmStr)

to

if (ptr == mmStr || ptr != &mmStr[strlen(mmStr)])
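The added condition ptr != &mmStr[strlen(mmStr)] asks whether strtol() stopped before the end of the token, which is the same as testing *ptr != '\0'. A sketch of that check folded into one reusable function, assuming a hypothetical helper named parse_field and caller-supplied bounds that fit in an int:

```c
#include <stdlib.h>

/* Hypothetical helper: converts one already-split token and checks it
   against [lo, hi]. Returns 1 on success, 0 on any failure. */
static int parse_field(const char *tok, long lo, long hi, int *out)
{
    char *end;
    long value = strtol(tok, &end, 10);

    if (end == tok)        /* nothing was converted */
        return 0;
    if (*end != '\0')      /* same test as ptr != &tok[strlen(tok)] */
        return 0;
    if (value < lo || value > hi)
        return 0;          /* numeric, but out of range */
    *out = (int)value;
    return 1;
}
```

With this, the token "1)Fish" from the shopping-list input is rejected, because strtol() stops at the ")" rather than at the end of the token.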

here is the NOW BULLETPROOF(*) version that i should have posted in the first place

(* Nearly)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
 
#define MAX_DATESTR_LEN                16
#define DATESTR_DIVIDER_CHARACTERS     "/\\-_.,"  // allowable delimiters for month date year

int convertDateStringToVals (char*, const char*, int*, int*, int*);
int getUserInput(char*, int);

int main(void)
{
   char  dateStr[MAX_DATESTR_LEN + 1];   // +1: getUserInput() also writes the terminating '\0'
   int   inputLen = 0,
         month,
         date,
         year,
         isInvalid;

   while (1)
   {
      do {
         printf("\nEnter Date String : ");
         fflush(stdout);
         inputLen = getUserInput(dateStr,MAX_DATESTR_LEN);
         if (inputLen > MAX_DATESTR_LEN)
            printf("string length %d, max length is %d, try again.\n\n",inputLen, MAX_DATESTR_LEN);
      }
      while (inputLen > MAX_DATESTR_LEN);

      if (inputLen == 0)
         break;

      isInvalid = convertDateStringToVals(dateStr, DATESTR_DIVIDER_CHARACTERS, &month, &date, &year);
      printf("   month : %02d\n   date  : %02d\n   year  : %d\n", month, date, year);

      if (isInvalid)
         printf("WARNING:  one or more fields are invalid.\n");
   }

   printf("exiting.\n");
   return 0;
}


//*****************************************************************************************//
// convertDateStringToVals()
//
// (c) 2010 by Jephthah and distributed for general use under the WTFPL licence
//
// purpose: converts each field of the datestring (month, day, year) into integer values.
//          fields may be separated by one or more possible token separator characters
//          invalid non-numeric fields will be converted to -1, and return an error flag
//          out-of-range values will be left intact but will still return an error flag
//
// input:   dateString, the character string containing a valid date in mm/dd/yyyy or similar format
//          tokenSeparators, pointer to list of one or more possible field separator characters
//
// output:  month, date, year:  integer values representing each field.  the year, if less than 100,
//          will be converted to corresponding year 2000-2099, otherwise will be left intact.
//
// return:  0 if all fields were valid, -1 if one or more fields are invalid
//
//*****************************************************************************************//
int convertDateStringToVals (char * dateString, const char * tokenSeparators, int * month, int * date, int * year)
{
   char *tempDate, *mmStr, *ddStr, *yyStr, *ptr;
   int  error = 0;

   tempDate = malloc(strlen(dateString) * sizeof(char));
   strcpy(tempDate,dateString);   // copy dateStr into temp location that can be abused by strtok

   *month = *date = *year = -1;   // assume error unless found valid values
   
   mmStr = strtok(tempDate, tokenSeparators);
   ddStr = strtok(NULL,    tokenSeparators);
   yyStr = strtok(NULL,    tokenSeparators);

   if (mmStr != NULL)   // found token
   {
      *month = strtol(mmStr,&ptr,10);
      if (ptr == mmStr || ptr != &mmStr[strlen(mmStr)])
         *month = -1;   // token was not fully valid numeric
      else if (*month < 1 || *month > 12)
         error = -1;    // value numeric, but out of range
   }

   if (ddStr != NULL)  // found token
   {
      *date = strtol(ddStr,&ptr,10);
      if (ptr == ddStr || ptr != &ddStr[strlen(ddStr)])
         *date = -1;      // token was not fully valid numeric
      else if (*date < 1 || *date > 31)
         error = -1;      // value numeric, but out of range
   }

   if (yyStr != NULL)  // found token
   {
      *year = strtol(yyStr,&ptr,10);
      if (ptr == yyStr || ptr != &yyStr[strlen(yyStr)])
         *year = -1;       // token was not fully valid numeric
      else if (*year < 0)
         error = -1;       // value numeric, but out of range
      else if (*year < 100)
         *year += 2000;    // assume 2-digit year is 21st Century
   }

   if (*month < 0 || *date < 0 || *year < 0)
      error = -1;

   free(tempDate);

   return error;   // will be zero (0) if all fields are valid, -1 if not.

}

//*****************************************************************************************//
// GetUserInput()
//
// (c) 2010 by Jephthah and distributed for general use under the WTFPL licence
//
// purpose:gets user input, removes the newline, passes back results up to maximum
//         number of characters, flushes remaining characters from input buffer,
//         returns the number of total characters entered by user
//
// input:  maxStringLength is the maximum allowable characters to input by user
//
// output: returnStr, contains up to the maximum length of characters allowed that were
//         input by the user, any additional characters entered are lost
//
// return: total characters entered by user; caller should check that this value is equal
//         to or less than the maximum allowed to indicate valid string was input.  larger
//         value returned than was allowed by the input indicates characters were lost
//
//
// TYPICAL USE EXAMPLE:
//
//    do {
//       printf("\nEnter String : ");
//       fflush(stdout);
//       inputLen = getUserInput(inputStr, MAX_STR_LEN);
//       if (inputLen > MAX_STR_LEN)
//           printf("string length %d, max length is %d, try again.\n\n",inputLen, MAX_STR_LEN);
//    }
//    while (inputLen > MAX_STR_LEN);
//
//*****************************************************************************************//
int getUserInput (char * returnStr, int maxStringLength)
{
   char    *tempStr;
   int     maxLen, totalCount = 0;
   size_t  len;

   maxLen = maxStringLength + 2;     //account for NULL and /newline
   tempStr = malloc(maxLen * sizeof(char));  //temporary holder

   do {
      if (fgets(tempStr, maxLen, stdin) == NULL)  // EOF or read error:
         tempStr[0] = '\0';                       // treat as empty input
      len = strlen(tempStr);

      if (len > 0 && tempStr[len-1] == '\n') // newline indicates end of string
      {
         tempStr[len-1] = '\0';   // delete it
         len = strlen(tempStr);   // and recalc length
      }
      totalCount += (int)len;
   }
   while ((int)len > maxStringLength);  // continue to flush extras if too long

   strcpy(returnStr,tempStr);  // copy temp string into output
   free(tempStr);              // and release memory

   return totalCount;   // may be more than the number allowed
}

Is this nearly bulletproof?

I think so probably.

I have to admit that I have a, possibly irrational, prejudice against atoi. I do not like the way you cannot tell how many characters it has parsed without parsing them yourself, as your code does. Every digit gets parsed once by atoi and once by your own loop to verify that it is a digit. That inefficiency, although minute, just sticks in my craw and makes me want to do something different. I do use atoi but normally only in simple little tools that are not destined for consumer contact.

I may be a bit prejudiced against atoi but wild horses could not force me to use strtok. It contains static buffers (well pointers I suspect) so is not re-entrant or thread-safe.

If I was writing this routine then I would either cut out the call to atoi and the while loop and put in a call to strtol, or I would cut out the call to atoi and expand the while loop to do the calculation of the value as well.

while (*p != '\n' && *p != 0 && i < 3 )
    {
        if (isdigit(*p))
        {
            char *end;
            date[i++] = strtol(p, &end, 10);
            if (isSeparator(*end)) p = end + 1;
              else              break;  // non-date character
        }
        else  break;                    // non-date character
    }

or, computing the value by hand:

while (*p != '\n' && *p != 0 && i < 3 )
    {
        if (isdigit(*p))
        {
            date[i] = 0;
            while (isdigit(*p))
            {
                date[i] = (date[i] * 10) + (*p - '0');
                p++;
            }
            i++;
            if (isSeparator(*p)) p++;
              else              break;  // non-date character
        }
        else  break;                    // non-date character
    }

Actually, I just realised it is only nearly bulletproof, because absolutely anything could appear after the last number (the same goes for my routines, since they are based on yours): once this routine has successfully parsed 3 numbers it assumes success no matter what follows.

Ideally, I completely agree with not using even atoi() , as your code demonstrates. When you write the parsing, you are in complete control. You aren't subject to the whims and potential errors in the unknown code from ato?() nor strt??() . This way you control all the bells and whistles -- and warts -- of the input. Converting numbers is trivial, so why use a predefined function anyway.

I did that on purpose. Allows for 12/10/2005AD . Another trivial test can weed that out if you don't like it.
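The "trivial test" could look something like the sketch below. The helper name only_whitespace_left is hypothetical, and the choice to tolerate trailing whitespace is an assumption; p is taken to point just past the last field the loop consumed.

```c
#include <ctype.h>

/* Hypothetical helper: p points just past the last parsed field.
   Returns 1 if only whitespace remains before the end of the string. */
static int only_whitespace_left(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return *p == '\0';
}
```

Calling it after the loop has collected three numbers would reject "12/10/2005AD" while still accepting a date followed by a newline or spaces.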

char *tempDate, *mmStr, *ddStr, *yyStr, *ptr;
   int  error = 0;

   tempDate = malloc(strlen(dateString) * sizeof(char));
   strcpy(tempDate,dateString);   // copy dateStr into temp location that can be abused by strtok

Isn't that a classic mistake?

How is your preferred formatting different/better from these snippets?
http://www.daniweb.com/tutorials/tutorial45806.html

My thoughts on strtok here:
http://www.daniweb.com/code/snippet216569.html

your tutorial is very thorough. i wasn't aware of it, or forgot about its existence. my code snippet uses a similar method and is certainly no better.

as for strtok, i doubt anyone here is getting user input from the console in a multithreaded environment, but i probably do want to reconsider its general use since it's not reentrant

not sure what you mean by classic mistake. the fact that i'm using strlen or sizeof as parameters for malloc? or that i didnt check the return pointer of malloc for possible memory error?

That too.

char *tempDate, *mmStr, *ddStr, *yyStr, *ptr;
   tempDate = malloc(strlen(dateString) * sizeof(char));   // <-- strlen() alone leaves no room for the '\0'
   strcpy(tempDate,dateString);   // copy dateStr into temp location that can be abused by strtok

Q: My program is crashing, apparently somewhere down inside malloc, but I can't see anything wrong with it. Is there a bug in malloc?

A: It is unfortunately very easy to corrupt malloc's internal data structures, and the resulting problems can be stubborn. The most common source of problems is writing more to a malloc'ed region than it was allocated to hold; a particularly common bug is to malloc(strlen(s)) instead of strlen(s) + 1.
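A sketch of the corrected allocation, which also covers the unchecked malloc() return mentioned earlier. The helper name copy_string is hypothetical; the point is only the + 1 and the NULL check.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL if malloc fails.
   The caller must free() the result. */
static char *copy_string(const char *s)
{
    size_t size = strlen(s) + 1;     /* +1 for the terminating '\0' */
    char *copy = malloc(size);

    if (copy == NULL)
        return NULL;                 /* out of memory: let the caller decide */
    memcpy(copy, s, size);           /* copies the '\0' as well */
    return copy;
}
```

In convertDateStringToVals this would replace the malloc/strcpy pair, with the function returning its error flag early if the copy comes back NULL.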
