I'm working on a C++ module to read Wavefront OBJ files. These are text files made up of many lines, each holding 2, 3, or more integer or floating-point numbers separated by spaces or tabs. It's not always possible to know how many numbers a line holds without processing it. The C++ stringstream >> float technique is well suited to this, but it is painfully slow compared to atof, strtod, and possibly even sscanf (the atof method below, even with its extra whitespace processing, is 5x faster than stringstream >> float). My problem is that I'm having to do the 'extra whitespace processing' myself, because none of the C functions seems to give a good indication of whether a number was read and, if so, how much of the string was converted.

My question is mainly about strtod, since it's the only one of those that looks like it could do this: it lets you pass in a pointer to a char pointer, which is supposed to end up pointing at the 'rest' of the string after a number has been extracted. However, what happens when it fails? Is the value written through the char ** passed as the last argument defined after a failed conversion?

I want to be able to say something like:

if( ! pullfloat(str, x ) ) //error;
if( ! pullfloat(str, y ) ) //error;
if( ! pullfloat(str, z ) ) //error;

but also something like:

while( pullfloat(str, f ) ) //dosomething;

I'm using the code below at the moment, but it seems hyper-redundant: whatever happens in the loop, I still call atof, and atof does the same whitespace detection that I'm doing. Any ideas on how I can cut out the whitespace detection but still get the same information?

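int pullfloat( const char ** str_pp, float & f )
{
  bool started = false;
  const char * init = *str_pp;
  for( const char * str = init; *str != '\0'; str++, ( *str_pp )++ )
  {
    char c = *str;
    // a token starts at the first character that is neither a space nor a tab
    if( ! started ) started = ( ( c != ' ' ) && ( c != '\t' ) );
    // once inside a token, the next space or tab ends it
    else if( ( c == ' ' ) || ( c == '\t' ) ) break;
  }
  if( started )
  {
    // atof skips the leading whitespace again, which is the redundancy in question
    f = static_cast<float>( atof( init ) );
    return true;
  }
  else return false;
}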

Actually, ignore that. I realised it's not as simple as just looking at whitespace, because in the parts where there IS an arbitrary number of numbers, they are delimited like this:

1/1/1 2/2/4 3/6/4 8/0/1

or sometimes even like this:

1//3 2//4 5//7 etc..

back to the drawing board.. o_O

EDIT: it would still be useful to know if the value of the char ** passed in as the last argument of strtod is defined if a number can't be extracted...?

If the first sequence of non-whitespace characters in str does not form a valid floating-point number as just defined, or if no such sequence exists because either str is empty or contains only whitespace characters, no conversion is performed

http://www.cplusplus.com/reference/clibrary/cstdlib/strtod.html

From the above I don't think the value of the second parameter to strtod() is changed. But the easiest way to find out is to write a small test program and try it out.

I've seen that page, I have it open in a tab already =P The fact it doesn't mention a change is why I was wondering.

The problem with writing a small program to test it out is that a positive result won't necessarily tell me that it is well defined - it might work on my machine, but not work elsewhere. The best I could prove is that it doesn't work at all... Still, I suppose that's something. I will give it a try, but if it does work, unless I can test it on other machines, or see it in use, or see some document saying it will always do that, I won't feel happy using it.

Actually, no change is itself an indication that it has failed, since it couldn't have converted a zero-length string to a number. If it pointed the pointer at something random on failure, that wouldn't be useful, but I don't think it's very likely to do that...
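
The "no change means failure" check can be sketched for the fixed-count case (for example three coordinates on one line). This is only an illustration: pull_vec3 is a made-up helper name, not something from the OBJ format or the standard library, and it relies on strtod leaving the end pointer at the start of the input when nothing was converted.

```cpp
#include <cstdlib>

// Hypothetical helper: reads exactly three numbers from s into out.
// Returns false as soon as strtod reports "no conversion" (end == s).
bool pull_vec3( const char * s, double out[3] )
{
  for( int i = 0; i < 3; ++i )
  {
    char * end = 0;
    out[i] = std::strtod( s, &end );
    if( end == s ) return false;  // nothing consumed: number missing or malformed
    s = end;                      // carry on just after the converted text
  }
  return true;
}
```

With this, a line such as "1.0 2.0" is rejected because the third strtod call consumes nothing.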

>> it might work on my machine, but not work elsewhere
The behavior has nothing to do with what machine the compiled program is run on. Since strtod() is part of the C89 standard, the function will behave the same on every modern C compiler.

Well.. I thought, perhaps, an implementation of strtod might use that pointer to look ahead through the string, in which case it would get moved forwards even if a number wasn't 'consumed'... Thinking about it right now though, I don't think that's likely; it would be quite silly if that was the case.

Surely any individual aspect of the whole behaviour is only guaranteed to work identically on multiple machines if that part of the behaviour is actually defined in the standard? I mean, if it wasn't explicitly in the standard that 'the pointer will be unchanged on failure', implementors of the cstdlib library could be free to implement the internal workings of the function in their own way [ like the silly way above for example ]; or, is the standard totally rigid - to the level of a mandate for the code in each function rather than a specification for each function?

>>or, is the standard totally rigid - to the level of a mandate for the code in each function rather than a specification for each function?
The standards dictate the specifications: what the function is supposed to do. I'm not about to pay $289.00 USD for a copy of the standards, so I cannot verify exactly what it says or does not say about that function.

>I mean, if it wasn't explicitly in the standard that 'the pointer will be unchanged on failure',
>implementors of the cstdlib library could be free to implement the internal workings of the
>function it in their own way
True, but the behavior in this case is carefully specified by the standard. strtod breaks the source string into three parts: leading whitespace (possibly empty), a legal floating-point representation, and trailing unmatched characters (including '\0'). Provided the second argument isn't a null pointer, the pointer it refers to is guaranteed to be set to the first character in the third part. If the second part is empty (i.e. strtod completely failed to match a floating-point value), that pointer is set to the source string.

You can test these rules fairly easily:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

const char *check_strtod ( const char *s )
{
  char *end; /* strtod takes a char **, not a const char ** */

  printf ( "%f\n", strtod ( s, &end ) );

  return end;
}

int main ( void )
{
  /* Full match, end points to '\0' */
  assert ( *check_strtod ( "123.456" ) == '\0' );

  /* Full match after leading whitespace, end points to '\0' */
  assert ( *check_strtod ( " \r\n\t\v\f123.456" ) == '\0' );

  /* Partial match, end points to unmatched string */
  assert ( strcmp ( check_strtod ( "123.456abcdef" ), "abcdef" ) == 0 );

  /* Partial match after leading whitespace, end points to unmatched string */
  assert ( strcmp ( check_strtod ( "   123.456abcdef" ), "abcdef" ) == 0 );

  /* Complete failure, end points to the source string */
  assert ( *check_strtod ( " \r\n\t\v\fabcdef" ) == ' ' );

  return 0;
}
commented: Thankyou, that put my mind at rest. +4

Well, it's good to know that it will behave like this consistently. I can't actually use it in the situation with the arbitrary number of numbers, since, as I mentioned, I forgot that I have to deal with blocks of numbers with slashes et al. But I'm going to use it for the fixed-count, space-delimited data: it's very fast compared to iostreams, and can detect errors like missing numbers/spaces. My pullfloat is like this now.. neater, less redundant, seems to work correctly.

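#include <stdlib.h>
#include <stdio.h>

int pullfloat( const char ** p_str, double & f )
{
  char * end;
  f = strtod( *p_str, &end );
  // strtod leaves end at the start of the input if nothing was converted
  bool converted = ( end != *p_str );
  *p_str = end; // advance the caller's pointer past what was consumed
  return converted;
}

int main( void )
{
  // string literals are const in C++, so hold them through a const char *
  const char * str = "1 2 3 456.0002 789       76.32      99999.888      3662 \n 1535 \t\t \n 1159 1593 1.24e+2 ";
  double f = 0.0;
  while( pullfloat( &str, f ) ) printf( "Float: %f\n", f );
}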
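
For the slash-delimited groups mentioned above (1/1/1, 1//3 and so on), the same end-pointer idea carries over to strtol. What follows is only a rough sketch under a few assumptions: pull_face_group is a hypothetical name, a group has at most three fields, and a missing field is stored as 0 on the assumption that 0 is never a valid index in the file.

```cpp
#include <cctype>
#include <cstdlib>

// Hypothetical helper: reads one group such as "7", "5/6", "2//4" or "1/1/1".
// Missing fields are stored as 0 (assumes 0 is never a valid index).
// On success *p is advanced past the group; on failure *p is left unchanged.
bool pull_face_group( const char ** p, long idx[3] )
{
  idx[0] = idx[1] = idx[2] = 0;
  const char * s = *p;
  char * end = 0;

  idx[0] = std::strtol( s, &end, 10 );
  if( end == s ) return false;         // no leading index at all
  s = end;

  for( int i = 1; i < 3 && *s == '/'; ++i )
  {
    ++s;                               // skip the slash
    // only convert if a number starts right here, so that strtol's own
    // whitespace skipping cannot swallow the next group by mistake
    if( *s == '-' || *s == '+' || std::isdigit( static_cast<unsigned char>( *s ) ) )
    {
      idx[i] = std::strtol( s, &end, 10 );
      s = end;
    }
  }
  *p = s;
  return true;
}
```

It would be used in the same way as pullfloat, e.g. while( pull_face_group( &str, idx ) ) with str being a const char * into the current line.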

$289.00 USD for the standard documents? That's expensive for something like that..

>$289.00 USD for the standard documents? That's expensive for something like that..
That's only for hardcopy from ISO. Wiley published a book covering C99 for $60 if you like having something to flip through, and you can also buy the electronic version from ISO for $18. And of course, the drafts are often close enough and completely free online.
