So I'm trying to write some code that takes a chunk of text, splits it into words, and then puts those words into an array. What I have is:

string buf;
stringstream ss(database); // database is my chunk of text

while (ss >> buf) // Tokenize on whitespace
{
    ???
}

So in that while loop I want to take the string "buf" and put it into an array. The result should be an array of strings containing every word in the paragraph. This is where I feel like an idiot.

I have tried a few approaches using string streams, converting strings to char arrays, etc., but nothing works for me. I have a feeling that this has to do with multidimensional arrays and pointers, but my searches have been fruitless.

Thanks in advance!

My advice: use the vector container. It grows as you add elements, which is one of the advantages of using C++.

#include <boost/tokenizer.hpp>
#include <string>
#include <vector>
#include <iostream>
using namespace std;
using namespace boost;

int main()
{
  string str = "The boost tokenizer library provides a flexible "
               "and easy to use way to break a string or other "
               "character sequence into a series of tokens.\n"
               "Here is a simple example that will break up a "
               "paragraph into words.\n";
  // The default tokenizer<> splits on whitespace and punctuation
  tokenizer<> toker( str );
  // Build the vector directly from the token range
  vector<string> words( toker.begin(), toker.end() ) ;
  cout << "there are " << words.size() << " words\n" ;
  cout << "words[6]: " << words[6] << '\n' ;
}
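
If Boost is not available, the loop from the original question can fill the vector directly. This is only a minimal sketch using the standard library; the function name split_words is made up for illustration. Unlike the default Boost tokenizer, it splits on whitespace only, so punctuation stays attached to the words.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (the name is an assumption, not from the thread):
// splits text on whitespace and returns the words in a vector.
std::vector<std::string> split_words(const std::string& text)
{
    std::vector<std::string> words;
    std::istringstream ss(text);
    std::string buf;
    while (ss >> buf)          // operator>> skips any run of spaces, tabs, or newlines
        words.push_back(buf);  // the vector grows as needed, no fixed array size
    return words;
}
```

Calling split_words(database) would then give a vector whose size() is the word count and whose elements can be indexed like an array.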

Things also get more complicated if your words are separated by more than one space, or even by tabs.

In that case, it would probably be wiser to use regular expressions. The task becomes almost trivial.
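
A rough sketch of the regular-expression approach, assuming a C++11 compiler with the standard <regex> header (the post does not say which regex library was meant, so std::regex is my substitution). The function name regex_words and the pattern are illustrative choices: here a word is any run of letters, digits, or apostrophes, so extra spaces, tabs, and punctuation are all skipped.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name and pattern are assumptions for illustration):
// collects every run of letters, digits, or apostrophes as one word.
std::vector<std::string> regex_words(const std::string& text)
{
    static const std::regex word_re("[A-Za-z0-9']+");
    std::vector<std::string> words;
    // sregex_iterator walks over each non-overlapping match in the text;
    // a default-constructed iterator marks the end of the sequence.
    for (std::sregex_iterator it(text.begin(), text.end(), word_re), end;
         it != end; ++it)
    {
        words.push_back(it->str());
    }
    return words;
}
```

Changing the pattern changes what counts as a word, which is the main reason to prefer this over plain whitespace splitting.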

This is perfect, vijayan, thanks a bunch!
