Does anyone know how to search a text file for just alphanumeric words using regex? And could you show me?

It depends on what you want to do with the result.
If you want a vector of all the words, or of one particular word,
try this:

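#include <boost/regex.hpp>
#include <iterator>
#include <string>
#include <vector>

std::string Text;
// set Text equal to something ...
std::string searchString = "(\\S+)\\s";   // a run of non-whitespace followed by one whitespace character
boost::regex Re(searchString);
std::vector<std::string> Aout;
// regex_split removes each token from Text as it extracts it, so Text is modified
boost::regex_split(std::back_inserter(Aout), Text, Re);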

The regular expression is not great (or is that what you are after?),
but you can put anything you like in there. There are many different ways of using regex, and the Boost site has an extensive user manual.
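
If modifying Text is a problem, a token iterator is one alternative. As far as I know, the Boost documentation lists regex_split as deprecated in favor of regex_token_iterator. Here is a minimal sketch of that idea. It uses std::regex (C++11) as a stand-in for boost::regex so it builds without Boost, and the function name split_on_whitespace is made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Split text on runs of whitespace, leaving the input untouched.
// Assumption: std::regex is used here in place of boost::regex.
std::vector<std::string> split_on_whitespace(const std::string& text)
{
    static const std::regex separator("\\s+");
    // Sub-match index -1 selects the text between matches, not the matches.
    std::sregex_token_iterator first(text.begin(), text.end(), separator, -1);
    std::sregex_token_iterator last;   // default-constructed: end of sequence
    std::vector<std::string> tokens;
    for (; first != last; ++first) {
        if (first->length() > 0)       // skip the empty token caused by leading whitespace
            tokens.push_back(first->str());
    }
    return tokens;
}
```

Calling split_on_whitespace(Text) returns the tokens and leaves Text as it was. Note that this only splits; punctuation attached to a word stays attached.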

Can you be a bit more specific if this is not what you want?

If I had a text file with words like this:
word* (start) enable! get&& Words

I'd want it to find just the A-Z characters, like this:

word start enable get Words

OK, the question has really highlighted just how ignorant I am about
regex.

I have a solution to the problem, BUT I don't think it is elegant, and
I am sure that others know how to do this properly.

Anyway here is what I have:

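#include <boost/regex.hpp>
#include <string>

boost::regex Re("[A-Za-z0-9]+");   // [A-z] would also match [ \ ] ^ _ and `
std::string Text;
std::string Out;         // output space
// Text set somehow or use iterators to a stream

boost::sregex_iterator m1(Text.begin(), Text.end(), Re);
boost::sregex_iterator empty;   // default-constructed: end of sequence
while (m1 != empty)
{
    Out += m1->str();    // text of the whole match
    Out += ' ';          // add space
    ++m1;
}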

You might prefer to put the space into the regex itself and then
not worry about the word spacing, but you can experiment.
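
Another way to avoid handling the spacing by hand is to delete the unwanted characters rather than collect the wanted ones, so the original whitespace is kept as it is. A rough sketch, again using std::regex (C++11) as a stand-in for boost::regex; the function name strip_non_alphanumeric is hypothetical.

```cpp
#include <regex>
#include <string>

// Delete every character that is neither an ASCII letter or digit nor whitespace.
// Assumption: std::regex is used here in place of boost::regex.
std::string strip_non_alphanumeric(const std::string& text)
{
    static const std::regex unwanted("[^A-Za-z0-9\\s]+");
    return std::regex_replace(text, unwanted, "");
}
```

For the sample line above, strip_non_alphanumeric("word* (start) enable! get&& Words") gives "word start enable get Words". One caveat: a symbol sitting between two letters, as in "a*b", is simply removed, so the two pieces merge into "ab".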

What I am not sure of is whether m1!=empty is the best way to terminate this.
If you, or anyone else, figure out something better, please add a comment.
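
On the termination question: a default-constructed regex iterator is the end-of-sequence iterator, so comparing against it is the usual way to stop, and it can be written as an ordinary for loop. A small sketch under the same assumption as before (std::regex from C++11 standing in for boost::regex; the function name alphanumeric_words is made up):

```cpp
#include <regex>
#include <string>
#include <vector>

// Collect every run of ASCII letters and digits found in text.
// Assumption: std::regex is used here in place of boost::regex.
std::vector<std::string> alphanumeric_words(const std::string& text)
{
    static const std::regex word("[A-Za-z0-9]+");
    std::vector<std::string> words;
    const std::sregex_iterator end;    // default-constructed: end of sequence
    for (std::sregex_iterator it(text.begin(), text.end(), word); it != end; ++it)
        words.push_back(it->str());    // str() returns the whole match
    return words;
}
```

Returning a vector instead of one space-joined string leaves the choice of separator to the caller.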

Hope this is helpful.
