Jan 11th, 2009

extracting a sentence

Hello everybody,

I have a paragraph containing many sentences, and I need to extract the sentence that contains a certain word. I have no problem finding the word with my code, but my question is: how can I extract the whole sentence? The code should work regardless of the position of the word in the sentence. Here is an example:

A new crisis is emerging, a global food catastrophe that will reach further and be more crippling than anything the world has ever seen. The credit crunch and the reverberations of soaring oil prices around the world will pale in comparison to what is about to transpire, Donald Coxe, global portfolio strategist at BMO Financial Group said at the Empire Club's 14th annual investment outlook in Toronto on Thursday.

For instance, I search for the word catastrophe in this text and I find it. Now I need to extract the sentence: "A new crisis is emerging, a global food catastrophe that will reach further and be more crippling than anything the world has ever seen."
I thought I could use string.find to search for the characters ". " and so reach the end of the sentence, but I also need to retrieve the part that comes before the word catastrophe. I would appreciate your ideas and help on this topic.

Thank you very much.
serhannn

Jan 11th, 2009

Re: extracting a sentence

Proceed in the same way. Find the previous '.', then move forward until the first capital.
ddanbe

Jan 11th, 2009

Re: extracting a sentence

Quote originally posted by ddanbe:
"Proceed in the same way. Find the previous '.', then move forward until the first capital."
But I can't tell from a capital letter alone whether it is the beginning of a sentence. It could also be a name or something else.
serhannn

Jan 11th, 2009

Re: extracting a sentence

Come to think of it, it is not that obvious.
What would you do with a sentence like "This person holds a Ph.D."? But this is not really a C++ problem any more.
ddanbe

Jan 11th, 2009

Re: extracting a sentence

In a loop, I used this code to start from the found word and take everything up to the first dot. But I think there's something wrong with it, because the output contains some errors. I search in text files that are actually the source code of some web pages, so they contain many HTML tags, which also interrupt the sentences. Is there a better algorithm to avoid that?

  size_t pos1, pos2;
  pos1 = line.find(keyword_vector[i]);
  pos2 = line.find(".", pos1);
  string sentence = line.substr(pos1, pos2);

Thanks for help.
Last edited by serhannn; Jan 11th, 2009 at 11:56 am.
serhannn

Jan 11th, 2009

Re: extracting a sentence

Perhaps you should read the entire file in and strip out all HTML tags as you do so; that way you are left with just text. Then go over what you have and process it for sentences. But as ddanbe said, this isn't really a job for C++.

The closest you will get to detecting the start and end of a sentence is the pattern [dot][space][capital]. You will need to use it to check for both the beginning and the end.

Chris
Freaky_Chris

Jan 11th, 2009

Re: extracting a sentence

You still have to take into account that a file read in from disk contains newlines. So, what if a sentence is nearing the end of a line and continues on the next one? If manual line breaks are used instead of "word wrap", sentences could span multiple lines :/
Comatose
