Hi all,

I am using the following function, which I found online, to trim leading and trailing whitespace from a string that is passed in. It works fine in every case I have tried, but being new to C I am having trouble understanding how it works.

#include <ctype.h>   /* isspace */

char *trim (char *str)
{
      char *ibuf, *obuf;

      if (str)
      {
            for (ibuf = obuf = str; *ibuf; )
            {
                  /* cast: isspace is undefined for negative char values */
                  while (*ibuf && (isspace ((unsigned char) *ibuf)))
                        ibuf++;
                  if (*ibuf && (obuf != str))
                        *(obuf++) = ' ';
                  while (*ibuf && (!isspace ((unsigned char) *ibuf)))
                        *(obuf++) = *(ibuf++);
            }
            *obuf = '\0';
      }
      return (str);
}

Could someone please explain the above logic in detail?

Thanks,
Scott.


>>assignment makes integer from pointer without a cast

I don't get that error when I compile it with vc++ 2008 Express.

I got that error message when I had *obuf = NULL instead of *obuf = '\0'.

I fixed that but forgot to change the title of this thread.

The main thing is I am looking for an explanation of this process step by step if possible. That would help me a lot.

Thanks,
Scott.

It works, and may be efficient, but it is difficult to explain. Here is a version that is easier to follow (it only strips the two ends and leaves whitespace inside the string alone):

#include <ctype.h>   // isspace
#include <string.h>  // memmove, strlen

char *trim (char *str)
{
    char *ibuf = NULL, *obuf = NULL;
    if( str != NULL)
    {
        ibuf = str;
        // find first non-space character
        while( isspace((unsigned char)*ibuf) )
            ibuf++;
        // anything to do?
        if( ibuf > str)
        {
            // shift everything left to fill up
            // leading spaces
            memmove(str,ibuf, strlen(ibuf)+1);
        }
        // start just past the last character
        obuf = ibuf = str + strlen(str);
        // back up over trailing spaces, never going before str
        // (this also keeps empty and all-space strings safe)
        while( obuf > str && isspace((unsigned char)obuf[-1]) )
            --obuf;
        // if there were trailing spaces, cut them off
        if( obuf != ibuf )
            *obuf = 0;
    }
    return (str);
}

the KEY POINT to understanding is this: it uses an "input pointer" (ibuf) to read the characters of the original string, and an "output pointer" (obuf) that will selectively overwrite the original string even as the input pointer is reading the same string.

One other important effect of the program to note is that this not only removes leading and trailing whitespace, but it also condenses any repeated whitespace within the string to just one space character. E.g., if you have

   word \t\n  word   
^^^    ^    ^^    ^^^     (note: \t and \n are <tab> and <newline>)

it will turn it into

word word
    ^
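if you want to see the condensing for yourself, here is a small sketch that runs the function from the first post on that input. the names trim_condense and show_trim are made up for this example, and the (unsigned char) casts are my own addition, since passing a negative char value to isspace is undefined.

```c
#include <ctype.h>
#include <stdio.h>

/* Same logic as the function from the first post, renamed for this demo.
   The (unsigned char) casts are an addition: isspace() is only defined
   for values representable as unsigned char (or EOF). */
char *trim_condense(char *str)
{
    char *ibuf, *obuf;

    if (str)
    {
        for (ibuf = obuf = str; *ibuf; )
        {
            while (*ibuf && isspace((unsigned char)*ibuf))
                ibuf++;                      /* skip a run of whitespace */
            if (*ibuf && (obuf != str))
                *(obuf++) = ' ';             /* one space between words */
            while (*ibuf && !isspace((unsigned char)*ibuf))
                *(obuf++) = *(ibuf++);       /* copy one word */
        }
        *obuf = '\0';
    }
    return str;
}

/* Hypothetical helper: prints the result between brackets so that any
   leftover whitespace at the edges would be visible. */
void show_trim(void)
{
    /* Must be a writable array, not a string literal:
       the function edits the buffer in place. */
    char text[] = "   word \t\n  word   ";
    printf("[%s]\n", trim_condense(text));
}
```

calling show_trim() from main should print [word word]. note that passing a string literal directly (e.g. trim_condense("  hi  ")) is not safe, because the function writes into the buffer it is given.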

now, I'll take a stab at splainin' the code. i think it's pretty easy to understand once you wrap your head around it.

first, understand the for loop ... the logical control in the "for" loop accomplishes two tasks

--- (1) one time only, the "for loop" assigns two temp pointers: "input" (ibuf) and "output" (obuf) to point to the start of the original string, as pointed to by "str", which was originally passed into the function itself and points to the actual memory location of the string.

--- (2) the "for loop" will continue to execute its block of code as long as (*ibuf) is true. this means, as long as the "input pointer" is pointing to a location of the string that contains some kind of character (even a space)... just not the terminating NUL ('\0', i.e. 0) that marks the end of the string

(Opinion: this could have just as easily been done with a "while" loop, after assigning the pointers in a more straightforward manner. i think the "for" loop is a bit of obfuscation...)
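for what it's worth, here is roughly what i mean by the "while" version. this is only a sketch: trim_while is a made-up name, the behaviour is meant to be identical to the original, and the (unsigned char) casts are an addition to keep isspace well-defined for characters with negative values.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical rewrite of the original function with a plain while loop.
   Only the loop form changes; the three inner steps are the same. */
char *trim_while(char *str)
{
    char *ibuf;   /* read position ("input pointer")   */
    char *obuf;   /* write position ("output pointer") */

    if (str == NULL)
        return str;

    ibuf = str;
    obuf = str;

    while (*ibuf != '\0')
    {
        /* skip a run of whitespace */
        while (*ibuf != '\0' && isspace((unsigned char)*ibuf))
            ibuf++;

        /* separate words with one space, but never at the very start
           and never after the last word */
        if (*ibuf != '\0' && obuf != str)
            *obuf++ = ' ';

        /* copy one word */
        while (*ibuf != '\0' && !isspace((unsigned char)*ibuf))
            *obuf++ = *ibuf++;
    }

    *obuf = '\0';   /* terminate the shortened string */
    return str;
}
```

the pointer setup moves out of the loop header, so the loop condition is the only thing left to read there.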


the block of code executed by the "for" loop, as long as the condition described is true, has three components that are executed sequentially before returning control to the for loop, which will determine if another loop is to be run.

--- (1) a "while" loop that will continue to shift the "input pointer" across the string as long as it continues to find whitespace.

--- (2) a single "if" that will write a single space character to the location pointed to by the "output pointer", if and only if the input pointer has not yet reached the end of the string AND the "output pointer" is no longer pointing to the same location as the original string's pointer (str) ....

Note that on the first go-round, the "output pointer" (obuf) is still pointing to the same location as "str" (i.e., the beginning of the string), so it does nothing. that is how the leading whitespace gets dropped.

Note that on subsequent go-rounds, the output pointer will no longer be the same as "str". Therefore as long as the input pointer is still pointing to a non-NUL character, it will write one single space character for each stretch of one or more internal whitespace characters the input pointer skipped in the "while" loop in (1) above. if the input pointer has hit the terminating NUL instead, the whitespace it skipped was trailing whitespace, so no space is written.

--- (3) a "while" loop that will -- as long as the input pointer is pointing to a non-whitespace (and non-NUL) character -- overwrite the character in the original string currently pointed to by the "output pointer" with the character pointed to by the "input pointer", and continue to do so while incrementing both the input pointer and the output pointer one character at a time, as long as the input pointer points to a character that is NOT whitespace.
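if it helps to watch steps (1), (2) and (3) happen, here is a sketch of an instrumented copy of the function. trim_traced is a made-up name, the printf line is only there for tracing, and the (unsigned char) casts are an addition that was not in the original.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical instrumented copy of the original function.
   After each pass of the outer loop it reports how far the input
   pointer (ibuf) and the output pointer (obuf) have moved from str. */
char *trim_traced(char *str)
{
    char *ibuf, *obuf;
    int pass = 0;

    if (str)
    {
        for (ibuf = obuf = str; *ibuf; )
        {
            while (*ibuf && isspace((unsigned char)*ibuf))   /* step (1) */
                ibuf++;
            if (*ibuf && (obuf != str))                      /* step (2) */
                *(obuf++) = ' ';
            while (*ibuf && !isspace((unsigned char)*ibuf))  /* step (3) */
                *(obuf++) = *(ibuf++);

            pass++;
            printf("pass %d: ibuf at offset %d, obuf at offset %d\n",
                   pass, (int)(ibuf - str), (int)(obuf - str));
        }
        *obuf = '\0';
    }
    return str;
}
```

with a writable buffer holding "  ab   cd " (two leading spaces, three in the middle, one trailing), the offsets reported are 4 and 2 after the first pass, 9 and 5 after the second, and 10 and 5 after the third. the output pointer never gets ahead of the input pointer, which is why overwriting the string in place is safe.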



Here's a trim without the side effect (condensing of spaces):

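char* trim(char* str)
{
    if (str) {
        /* ibuf reads, obuf writes, sbuf marks the spot just past
           the last non-space character written so far */
        char *ibuf = str, *obuf = str, *sbuf = str;
        /* skip leading whitespace */
        while (isspace((unsigned char)*ibuf))
            ibuf++;
        /* copy the rest, internal whitespace included */
        while (*ibuf) {
            *obuf++ = *ibuf;
            if (!isspace((unsigned char)*ibuf++))
                sbuf = obuf;
        }
        /* cut off whatever trailing whitespace was copied */
        *sbuf = '\0';
    }
    return str;
}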

I think after jephthah's post you can explain to us how this code works ;)

holy crap did i post all that? that's ridiculous. i need to get a hobby.


