The forum is buzzing with questions about tokenizing a C++ string. Here is a short function that splits a string and returns a vector of the tokens. Optionally, the delimiter string can be treated either as one multichar delimiter or as a collection of single char delimiters:

// multichar delimiter == "^^"

"aa^^bb^c^^d"

becomes

"aa"
"bb^c"
"d"

// single char delimiter list == "^,"

"aa^^b,b^c^^d"

becomes

"aa"
""
"b"
"b"
"c"
""
"d"

A test driver is not included because the one in my last snippet confused people.

#include <string>
#include <vector>

namespace Daniweb
{
    using namespace std;

    typedef string::size_type (string::*find_t)(const string& delim, 
                                                string::size_type offset) const;

    /// <summary>
    /// Splits the string s on the given delimiter(s) and
    /// returns a list of tokens without the delimiter(s)
    /// </summary>
    /// <param name="s">The string being split</param>
    /// <param name="match">The delimiter(s) for splitting</param>
    /// <param name="removeEmpty">Removes empty tokens from the list</param>
    /// <param name=fullMatch>
    /// True if the whole match string is a match, false
    /// if any character in the match string is a match
    /// </param>
    /// <returns>A list of tokens</returns>
    vector<string> Split(const string& s,
                         const string& match,
                         bool removeEmpty=false,
                         bool fullMatch=false)
    {
        vector<string> result;                 // return container for tokens
        string::size_type start = 0,           // starting position for searches
                          skip = 1;            // positions to skip after a match
        find_t pfind = &string::find_first_of; // search algorithm for matches

        if (fullMatch)
        {
            // use the whole match string as a key
            // instead of individual characters
            // skip might be 0. see search loop comments
            skip = match.length();
            pfind = &string::find;
        }

        while (start != string::npos)
        {
            // get a complete range [start..end)
            string::size_type end = (s.*pfind)(match, start);

            // null strings always match in string::find, but
            // a skip of 0 causes infinite loops. pretend that
            // no tokens were found and extract the whole string
            if (skip == 0) end = string::npos;

            string token = s.substr(start, end - start);

            if (!(removeEmpty && token.empty()))
            {
                // extract the token and add it to the result list
                result.push_back(token);
            }

            // start the next range
            if ((start = end) != string::npos) start += skip;
        }

        return result;
    }
}

Nice. How can this be changed to ignore the delimiter(s) when they are wrapped in double quotes?

For example, if whitespace is the delimiter:

Item1 Item2 "Item 3" Item4

should be
Item1
Item2
Item 3
Item4
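
One way to get that behaviour is to walk the string one character at a time and remember whether you are inside a pair of double quotes. The following is only a minimal sketch, not a change to the original Split: `SplitQuoted` is a hypothetical name, it only supports the single char delimiter list mode, and it assumes quotes are never escaped and are dropped from the output. An unterminated quote simply runs to the end of the string.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of the original snippet.
// Splits s on any character in delims, except where that character
// appears between a pair of double quotes. The quotes are not copied
// into the tokens. Escaped quotes are not handled (assumption).
std::vector<std::string> SplitQuoted(const std::string& s,
                                     const std::string& delims,
                                     bool removeEmpty = false)
{
    std::vector<std::string> result; // return container for tokens
    std::string token;               // token being built
    bool inQuotes = false;           // true while between two quotes

    for (std::string::size_type i = 0; i < s.size(); ++i)
    {
        const char c = s[i];

        if (c == '"')
        {
            // toggle the quoted state and drop the quote itself
            inQuotes = !inQuotes;
        }
        else if (!inQuotes && delims.find(c) != std::string::npos)
        {
            // unquoted delimiter: the current token is complete
            if (!(removeEmpty && token.empty())) result.push_back(token);
            token.clear();
        }
        else
        {
            // ordinary character, or a delimiter protected by quotes
            token += c;
        }
    }

    // whatever follows the last delimiter is the final token
    if (!(removeEmpty && token.empty())) result.push_back(token);

    return result;
}
```

Called as `SplitQuoted("Item1 Item2 \"Item 3\" Item4", " ")`, this gives the four tokens listed above. One caveat of this sketch: with removeEmpty set, a quoted empty string ("") is dropped like any other empty token.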


Why would a test driver confuse people?? That seems to be a very critical part of an example like this!

You're kidding, right? Creating a main function and calling the code above is trivial ...

It may be trivial here, but why not provide it? Someone should be able to mindlessly say "ok, let's see what this does -> copy+paste -> run it".
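
For anyone who wants that, here is a rough sketch of a driver that runs the two examples from the first post. `PrintTokens` and `RunSplitExamples` are hypothetical names made up for this reply. The Split in this block is a condensed restatement of the snippet, repeated only so the block compiles on its own (it uses a conditional instead of the member function pointer), so paste the original in its place if you prefer.

```cpp
#include <iostream>
#include <string>
#include <vector>

namespace Daniweb
{
    // Condensed restatement of Split from the snippet above, repeated
    // only so this driver compiles by itself.
    std::vector<std::string> Split(const std::string& s,
                                   const std::string& match,
                                   bool removeEmpty = false,
                                   bool fullMatch = false)
    {
        std::vector<std::string> result;
        std::string::size_type start = 0;
        const std::string::size_type skip = fullMatch ? match.length() : 1;

        while (start != std::string::npos)
        {
            std::string::size_type end = fullMatch
                ? s.find(match, start)
                : s.find_first_of(match, start);

            // an empty full-match delimiter would loop forever,
            // so treat it as "no delimiter found"
            if (skip == 0) end = std::string::npos;

            const std::string token = s.substr(start, end - start);
            if (!(removeEmpty && token.empty())) result.push_back(token);

            start = end;
            if (start != std::string::npos) start += skip;
        }

        return result;
    }
}

// Hypothetical helper: prints each token in double quotes, one per line.
void PrintTokens(std::ostream& out, const std::vector<std::string>& tokens)
{
    for (std::vector<std::string>::size_type i = 0; i < tokens.size(); ++i)
    {
        out << '"' << tokens[i] << "\"\n";
    }
}

// Hypothetical helper: runs the two examples from the first post.
void RunSplitExamples(std::ostream& out)
{
    out << "multichar delimiter \"^^\":\n";
    PrintTokens(out, Daniweb::Split("aa^^bb^c^^d", "^^", false, true));

    out << "single char delimiter list \"^,\":\n";
    PrintTokens(out, Daniweb::Split("aa^^b,b^c^^d", "^,"));
}
```

Adding `int main() { RunSplitExamples(std::cout); return 0; }` underneath gives the copy, paste, and run version. Passing true for removeEmpty in the second call drops the two empty tokens.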
