I'm trying to write some code that takes a chunk of text, splits it into words, and puts those words into an array. What I have is:

string buf;
stringstream ss(database); // database is my chunk of text

while (ss >> buf) // tokenize on whitespace
{
    // this is where I want to store buf in an array
}

So in that while loop I want to take the string "buf" and put it into an array, so that the result is an array of strings containing every word in the paragraph. This is where I feel like an idiot.

I have tried a few ways using string streams, converting strings to char arrays, etc., but nothing works for me. I have a feeling that this has to do with multidimensional arrays and pointers, but my searches have been fruitless.

Thanks in advance!
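
One way to finish that loop, as a minimal sketch: assuming a std::vector<std::string> is acceptable in place of a raw array, no pointers or multidimensional arrays are needed. The helper name split_words is made up for illustration and is not from the original post.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (the name is an assumption, not from the post):
// splits text on runs of whitespace and returns the words in order.
std::vector<std::string> split_words(const std::string& text)
{
    std::istringstream ss(text);
    std::vector<std::string> words;
    std::string buf;

    while (ss >> buf)          // operator>> skips spaces, tabs and newlines
        words.push_back(buf);  // the vector grows as needed

    return words;
}
```

Calling split_words(database) would return the vector; words.size() gives the word count and words[i] the i-th word. A vector is used here because the number of words is not known until the text has been read.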

#include <boost/tokenizer.hpp>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
using namespace boost;

int main()
{
  string str = "The boost tokenizer library provides a flexible "
                       "and easy to use way to break of a string or other "
                       "character sequence into a series of tokens.\n"
                       "Here is a simple example that will break up a "
                       "paragraph into words.\n";
  tokenizer<> toker( str ); // default tokenizer<> splits on whitespace and punctuation
  vector<string> words( toker.begin(), toker.end() ) ; // copy every token into the vector
  cout << "there are " << words.size() << " words\n" ;
  cout << "words[6]: " << words[6] << '\n' ;
  return 0;
}

What also complicates matters is if your words are separated by more than one space, or even tabs.

In that case, it would probably be wiser to use regular expressions. The task becomes almost trivial.
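
A rough sketch of what the regular-expression route could look like, assuming the C++11 <regex> header is available (the post does not say which regex library was meant). Stream extraction with >> already skips runs of spaces and tabs, so a regex mainly pays off when punctuation should be dropped as well. The name regex_words and the pattern \w+ are illustrative choices, not something taken from the thread.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every run of letters, digits or
// underscores (\w+), so whitespace and punctuation are both discarded.
std::vector<std::string> regex_words(const std::string& text)
{
    static const std::regex word_re("\\w+");
    std::vector<std::string> words;

    // Iterate over each match of word_re in the text.
    std::sregex_token_iterator it(text.begin(), text.end(), word_re);
    std::sregex_token_iterator end;
    for (; it != end; ++it)
        words.push_back(it->str());

    return words;
}
```

With this pattern an apostrophe splits a word in two ("don't" becomes "don" and "t"), so the pattern would need adjusting for text where that matters.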
