Hi there,

I am relatively new to C++ and was hoping someone could provide me with some guidance. I have a .txt file that contains a heap of HTML, and I wish to extract a small portion of dynamic text from various places in it. For example:

.txt file before filter
HTTP/1.1 200 OK
Content-Length: 48547
Content-Type: text/html; charset=UTF-8
Date: Sat, 31 Oct 2009 20:00:33 GMT
Content-Language: en-UK
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

.txt file after filter

So far I retrieve the stream of data and store it in a vector. I'm just not sure how to filter the .txt file down from here.

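ofstream txt("test.txt", ios::app);
vector<string> lines; // name chosen for illustration

while (true) {
    string l = s.ReceiveLine();
    if (l.empty()) break; // stop at the first empty line
    cout << l;

    txt << l;           // feed output into .txt
    lines.push_back(l); // feed stream into vector
}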

Any help is much appreciated!

And what is the actual problem?
You can do this in two ways.
1. Use regular expressions.
2. Write a simple parser yourself.
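
For the second option, here is a minimal sketch of what a hand-written "parser" could look like, assuming the text you want always sits between a fixed opening tag and a fixed closing tag. The function name extract_between and its parameters are made up for illustration; it only uses std::string::find and substr, so it is a plain substring search rather than a real HTML parser.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: collects the text between every occurrence of
// `open` and the next `close` that follows it in `html`.
std::vector<std::string> extract_between(const std::string& html,
                                         const std::string& open,
                                         const std::string& close) {
    std::vector<std::string> out;
    if (open.empty() || close.empty()) return out; // nothing sensible to search for

    std::string::size_type pos = 0;
    while ((pos = html.find(open, pos)) != std::string::npos) {
        const std::string::size_type start = pos + open.size();
        const std::string::size_type end = html.find(close, start);
        if (end == std::string::npos) break; // opening tag without a close: stop
        out.push_back(html.substr(start, end - start));
        pos = end + close.size(); // continue searching after the closing tag
    }
    return out;
}
```

You would call it on the whole received text joined into one string, passing the exact opening tag and "</td>" as the delimiters. It breaks as soon as the attributes of the tag change order or spacing, which is the usual trade-off of this approach.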

Bah, I'm struggling with regex!

I want to grab
<td scope="row" class="name">this text</td>

So far I've got: \w[A-Z\<]

But this gives me the < as well!
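
One way to keep the < out of the result is to match the surrounding tags but put only the text in a capture group, then read group 1 instead of the whole match. A rough sketch, assuming a compiler with C++11 <regex> (Boost.Regex offers a very similar interface) and assuming the tag is always written exactly as in the example above; the function name extract_names is made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of every
// <td scope="row" class="name">...</td> cell found in `html`.
std::vector<std::string> extract_names(const std::string& html) {
    // The parentheses form capture group 1: any run of characters
    // that are not '<', i.e. the cell text without the tags.
    static const std::regex cell(
        R"rx(<td scope="row" class="name">([^<]*)</td>)rx");

    std::vector<std::string> names;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(html.begin(), html.end(), cell); it != end; ++it) {
        names.push_back((*it)[1].str()); // group 1 only, not the whole match
    }
    return names;
}
```

Because [^<]* stops at the first <, this will not match cells that contain nested tags; for those a proper HTML parser is the safer choice.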

I managed to get close enough using