extracting an arbitrary number of numbers from a string
I'm working on a C++ module to read Wavefront OBJ files. These are text files with many lines containing 2, 3, or more integer or floating-point numbers, separated by spaces or tabs. It's not always possible to know how many numbers a line holds without processing it. The C++ stringstream >> float technique is well suited to this, but it is painfully slow compared to atof, strtod, and possibly even sscanf (the atof method below, even with extra whitespace processing, is 5x faster than stringstream >> float). My problem is that I'm having to use the 'extra whitespace processing', because none of the C methods seem to give any good indication of whether a number was read and, if so, how much of the string was converted.
My question is mainly about strtod, since it's the only one of those that looks like it could do this: it lets you pass in a pointer to a char pointer, which is supposed to end up pointing at the 'rest' of the string after a number has been extracted. However, what happens when it fails? Is the value of the char ** passed in as the last argument defined after a failed operation?
I want to be able to say something like:
    if( ! pullfloat( str, x ) ) { /* error */ }
    if( ! pullfloat( str, y ) ) { /* error */ }
    if( ! pullfloat( str, z ) ) { /* error */ }
but also something like:
    while( pullfloat( str, f ) ) { /* do something */ }
I'm using this code at the moment, but it seems hyper-redundant: whatever happens in the loop, I still do an atof, and atof does the same whitespace detection as I'm doing. Any ideas on how I can cut out the whitespace detection but still get the same information?
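    #include <stdlib.h>

    // Returns true if a number was found at *str_pp; advances *str_pp past it.
    int pullfloat( const char ** str_pp, float & f )
    {
        int started = false;
        const char * init = *str_pp;
        for( const char * str = init; *str != '\0'; str++, ( *str_pp )++ )
        {
            char c = *str;
            if( ! started )
                // a token starts at the first character that is neither space nor tab
                started = ( ( c != ' ' ) && ( c != '\t' ) );
            else if( ( c == ' ' ) || ( c == '\t' ) )
                break; // whitespace after the token ends it
        }
        if( started )
        {
            f = atof( init ); // atof skips the leading whitespace again
            return true;
        }
        else
            return false;
    }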
Plato forgot the nullahedron..
Actually, ignore that. I realised it's not as simple as just looking at whitespace, because in the parts where there IS an arbitrary number of numbers, they are delimited like this:
1/1/1 2/2/4 3/6/4 8/0/1
or sometimes even like this:
1//3 2//4 5//7 etc..
back to the drawing board.. o_O
EDIT: it would still be useful to know if the value of the char ** passed in as the last argument of strtod is defined if a number can't be extracted...?
Last edited by MattEvans; Sep 19th, 2007 at 1:48 am.
If the first sequence of non-whitespace characters in str does not form a valid floating-point number as just defined, or if no such sequence exists because either str is empty or contains only whitespace characters, no conversion is performed
From the above I don't think the value of the second parameter to strtod() is changed. But the easiest way to find out is to write a small test program and try it out.
Last edited by Ancient Dragon; Sep 19th, 2007 at 2:51 am.
Don't PM me with questions -- you might get a nasty PM in response. If you have a question then post it in one of the forums.
I've seen that page, I have it open in a tab already =P The fact it doesn't mention a change is why I was wondering.
The problem with writing a small program to test it out is that a positive result won't necessarily tell me that it is well defined: it might work on my machine but not work elsewhere. The best I could prove is that it doesn't work at all. Still, I suppose that's something. I will give it a try, but if it does work, unless I can test it on other machines, see it in use, or see some document saying it will always do that, I won't feel happy using it.
>> it might work on my machine, but not work elsewhere
The behavior has nothing to do with what machine the compiled program is run on. Since strtod() is part of the C89 standard, the function will behave the same with every modern C compiler.
Well.. I thought, perhaps, an implementation of strtod might use that pointer to look ahead through the string, in which case, it would get moved forwards even if a number wasn't 'consumed'... Thinking about it right now though, I don't think that's likely, it would be quite silly if that was the case.
Surely any individual aspect of the whole behaviour is only guaranteed to work identically on multiple machines if that part of the behaviour is actually defined in the standard? I mean, if it wasn't explicitly in the standard that 'the pointer will be unchanged on failure', implementors of the cstdlib library could be free to implement the internal workings of the function in their own way [like the silly way above, for example]. Or is the standard totally rigid, to the level of a mandate for the code in each function rather than a specification for each function?
>>or, is the standard totally rigid - to the level of a mandate for the code in each function rather than a specification for each function?
The standards dictate the specifications, what the function is supposed to do. I'm not about to pay $289.00 USD for a copy of the standards so I can not verify exactly what it says or does not say about that function.
>I mean, if it wasn't explicitly in the standard that 'the pointer will be unchanged on failure',
>implementors of the cstdlib library could be free to implement the internal workings of the
>function it in their own way
True, but the behavior in this case is carefully specified by the standard. strtod breaks the source string into three parts: leading whitespace (possibly empty), a legal floating-point representation, and trailing unmatched characters (including '\0'). Provided the second argument isn't a null pointer, the pointer it refers to is guaranteed to be set to the first character of the third part. If the second part is empty (i.e. strtod completely failed to match a floating-point value), that pointer is set to the start of the source string.
You can test these rules fairly easily:
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Prints the converted value and returns where strtod stopped */
    const char *check_strtod ( const char *s )
    {
      char *end;

      printf ( "%f\n", strtod ( s, &end ) );

      return end;
    }

    int main ( void )
    {
      /* Full match, end points to '\0' */
      assert ( *check_strtod ( "123.456" ) == '\0' );

      /* Full match after leading whitespace, end points to '\0' */
      assert ( *check_strtod ( " \r\n\t\v\f123.456" ) == '\0' );

      /* Partial match, end points to unmatched string */
      assert ( strcmp ( check_strtod ( "123.456abcdef" ), "abcdef" ) == 0 );

      /* Partial match after leading whitespace, end points to unmatched string */
      assert ( strcmp ( check_strtod ( " 123.456abcdef" ), "abcdef" ) == 0 );

      /* Complete failure, end points to the source string */
      assert ( *check_strtod ( " \r\n\t\v\fabcdef" ) == ' ' );

      return 0;
    }
I'm here to prove you wrong.
Well, it's good to know that it will behave like this consistently. I can't actually use it in the situation with the arbitrary number of numbers since, as I mentioned, I forgot that I have to deal with blocks of numbers with slashes et al. But I'm going to use it for the fixed-count, space-delimited data: it's very fast compared to iostreams, and can detect errors like missing numbers/spaces. My pullfloat is like this now: neater, less redundant, and it seems to work correctly.
    #include <stdlib.h>
    #include <stdio.h>

    // Returns nonzero if a number was read, and advances *p_str past it.
    int pullfloat( const char ** p_str, double & f )
    {
        char * end;
        f = strtod( *p_str, &end );
        if( end == *p_str )
            return 0;       // no conversion: strtod left end at the start
        *p_str = end;
        return 1;
    }

    int main( void )
    {
        const char * str = "1 2 3 456.0002 789 76.32 99999.888 3662 \n 1535 \t\t \n 1159 1593 1.24e+2 ";
        double f = 0.0;
        while( pullfloat( &str, f ) )
            printf( "Float: %f\n", f );
        return 0;
    }
$289.00 USD for the standard documents? That's expensive for something like that..