Copy from strtok to 2D array

Hello.

I am trying to use the strtok function to separate individual words and store them in a 2D char array. However, only the first word is stored in the 2D array.

cout << "Enter a string: ";
        cin.getline (str, MAX);

        //count number to words
        char *p = strtok (str, " ");
        count = wordCount (p);

        cout << "No of words = " << count;
        cout << endl;

        //tokenize string and copy it to 2d array
        for (int i = 0; i < count; i++)
        {
            while (p != NULL)
            {
                strcpy ( strWords[i] , p);
                p = strtok (NULL, ".,? ;"); 
            }
        }

        for (int i = 0; i < count; i++)
        {
            cout << strWords[i] << " ";
            cout << endl;

        }
daldrome
Newbie Poster
18 posts since Oct 2011
Reputation Points: 10
Solved Threads: 0
 

Lines 12 - 19 (the for loop with the while loop inside it). Having a NESTED loop with strtok as an inner while loop defeats the whole purpose of strtok, particularly the way you have it. Suppose you have three words in the string. For i equals 0, the inner while loop (lines 14 to 18) will execute 3 times. i will equal 0 all three times, so you'll copy a string to the same location three times, effectively overwriting it twice. p then becomes NULL and stays NULL, so the inner loop will never execute for i greater than 0. How can it?

So one, why are you using a loop within a loop at all? Two, this particular nested loop makes no sense because it's really THIS...

for (int i = 0; i < 1; i++)
        {
            while (p != NULL)
            {
                strcpy ( strWords[i] , p);
                p = strtok (NULL, ".,? ;"); 
            }
        }


which means it's this...

while (p != NULL)
            {
                strcpy ( strWords[0] , p);
                p = strtok (NULL, ".,? ;"); 
            }
VernonDozier
Posting Expert
5,527 posts since Jan 2008
Reputation Points: 2,633
Solved Threads: 711
 
int i = 0;
        while (p != NULL && i < count)
        {
            strcpy ( strWords[i] , p);
            p = strtok (p, " ");
            ++i;
        }

If I change my lines to this, the first word is repeated many times. How do I get the strtok function to get the next word in the string?

daldrome
Newbie Poster
18 posts since Oct 2011
Reputation Points: 10
Solved Threads: 0
 
p = strtok (p, " "); // wrong


You had it right the first time, or at least the first parameter. NULL, not p.

p = strtok (NULL, ".,? ;"); // right
VernonDozier
Posting Expert
5,527 posts since Jan 2008
Reputation Points: 2,633
Solved Threads: 711
 
int i = 0;
        while (p != NULL && i < count)
        {
            strcpy (strWords[i] , p);
            p = strtok (NULL, ".,? ;");
            ++i;
        }

        for (int i = 0; i < count; i++)
        {
            cout << strWords[i] << " ";

        }
            cout << endl;

If I do this, only the first word gets copied to the first address of the array.
Thanks for your help so far

daldrome
Newbie Poster
18 posts since Oct 2011
Reputation Points: 10
Solved Threads: 0
 

When the while loop starts (line 2 of your last snippet), what are the values of p and count?
Where does the value of count come from?
Have you called strtok() and assigned the return value to p before that loop starts?

Lerner
Nearly a Posting Maven
2,382 posts since Jul 2005
Reputation Points: 739
Solved Threads: 396
 

Here's my complete code.

#include <iostream>
#include <cstring>
#include <cstdlib>

using namespace std;

const int MAX = 100;
const int LEN = 100;

int wordCount (const char *);

int main()
{
    char str [MAX];
    static char strWords [MAX][LEN];
    char ch; //to store user response
    int count = 0; //count number of words in string


    do
    {
        cout << "Enter a string: ";
        cin.getline (str, MAX);

        //count number to words
        char *p = strtok (str, " ");
        count = wordCount (p);

        cout << "No of words = " << count;
        cout << endl;

        //tokenize string and copy it to 2d array
        int i = 0;
        while (p != NULL && i < count)
        {
            strcpy (strWords[i] , p);
            p = strtok (NULL, ".,? ;");
            ++i;
        }

        for (int i = 0; i < count; i++)
        {
            cout << strWords[i] << " ";

        }
            cout << endl;
        cout << "Continue (Y/N): ";
        cin >> ch;
        cin.clear();
        cin.ignore (MAX, '\n');
        cout << "=====================================================" << endl;

    }while (ch == 'y'|| ch == 'Y') ;

}

int wordCount (const char *p)
{
    int count;
    while (p != NULL)
        {
            p = strtok (NULL, ".,? ;"); //move to next available token and ignore all delimiter
            ++count;
        }
    return count;
}
daldrome
Newbie Poster
18 posts since Oct 2011
Reputation Points: 10
Solved Threads: 0
 

wordCount() is ruining the string for later processing. Always remember that strtok() modifies the string by writing '\0' at appropriate places. What about merging the two tasks of counting words and saving them to the array?

#include <iostream>
#include <cstring>

using namespace std;

namespace
{
    const int MAX = 100;
    const int LEN = 100;
    const char* fmt = ".,? ;";
}

int main()
{
    char strWords[MAX][LEN];
    char str[LEN];
    
    cout << "Enter a string: ";
    cin.getline(str, LEN);
    
    int count = 0;
    
    for (char *p = strtok(str, fmt); count < MAX && p; p = strtok(0, fmt))
    {
        strcpy(strWords[count++], p);
    }
    
    cout << "No of words = " << count << endl;
    
    for (int i = 0; i < count; i++)
    {
        cout << strWords[i] << endl;
    }
}
deceptikon
Indubitably
Administrator
632 posts since Jan 2012
Reputation Points: 119
Solved Threads: 105
 

Thanks a lot for your help! I did not realise that my wordCount function was destroying the string.

daldrome
Newbie Poster
18 posts since Oct 2011
Reputation Points: 10
Solved Threads: 0
 

Just one final question for my understanding.
Can anyone explain to me what the three expressions in the for loop do?

for (char *p = strtok(str, fmt); count < MAX && p; p = strtok(0, fmt))
    {
        strcpy(strWords[count++], p);
    }
daldrome
Newbie Poster
18 posts since Oct 2011
Reputation Points: 10
Solved Threads: 0
 
for (char *p = strtok(str, fmt); count < MAX && p; p = strtok(0, fmt))
    {
        strcpy(strWords[count++], p);
    }


Line 3 (the strcpy line) turns into this...

strcpy(strWords[count], p);
count++;


so we have...

for (char *p = strtok(str, fmt); count < MAX && p; p = strtok(0, fmt))
    {
        strcpy(strWords[count], p);
        count++;
    }


char *p = strtok(str, fmt)


Finds the first token in str and makes p point to it.

count < MAX && p


is the same as...

count < MAX && p != NULL


Means "keep going until either count >= MAX or p is NULL". If p is NULL, then all tokens have been found. From http://www.cplusplus.com/reference/clibrary/cstring/strtok/ :

Return Value
A pointer to the last token found in string.
A null pointer is returned if there are no tokens left to retrieve.

p = strtok(0, fmt)


is the same as...

p = strtok(NULL, fmt)


means "move forward till you find the next token and make p point to it." From the same page:

Parameters
str
C string to truncate. The contents of this string are modified and broken into smaller strings (tokens).
Alternatively, a null pointer may be specified, in which case the function continues scanning where a previous successful call to the function ended.
delimiters
C string containing the delimiters.
These may vary from one call to another.


See the link. They have a good explanation and an example.

VernonDozier
Posting Expert
5,527 posts since Jan 2008
Reputation Points: 2,633
Solved Threads: 711
 


strtok() has two steps: set the source string/return the first token, and return subsequent tokens on the source string. Because strtok() stores a pointer to the source string internally, any subsequent calls after the first must have a source string of NULL. The pattern looks like this:

char* tok;

tok = strtok(source, delim);

while (tok != NULL)
{
    // Use tok

    tok = strtok(NULL, delim);
}

The first thing I did was merge that into a for loop to make the unusual behavior of strtok() more clear:

for (tok = strtok(source, delim); tok != NULL; tok = strtok(NULL, delim))
{
    // Use tok
}

Then I replaced NULL with 0 because that's the convention in C++. In the new standard, nullptr is recommended. I also removed the redundant test against NULL in the loop condition because it happens implicitly just by using tok. Finally, I defined tok as local to the for loop by declaring it in the initialization clause:

for (char* tok = strtok(source, delim); tok; tok = strtok(0, delim))
{
    // Use tok
}

That's for the use of strtok(). The extra test of count < MAX just makes sure that the loop doesn't try to call strcpy() on an index that doesn't exist if there are more tokens than slots to hold them.

deceptikon
Indubitably
Administrator
632 posts since Jan 2012
Reputation Points: 119
Solved Threads: 105
 

This question has already been solved
