Tokenizer with <p id> tags

katerinaaa
Newbie Poster

  #1
May 31st, 2007
Hi,
I would like to ask if anyone knows how I can create a tokenizer for a text file in C++.
I find it difficult because the file contains not only words but also numbers and <p id> tags.

I have attached the file that needs to be tokenized.

Could anyone help me?

Thanks a lot
Narue
Senior Bitch

  #2
May 31st, 2007
The file wasn't attached.
twomers
Posting Virtuoso

  #3
May 31st, 2007
Maybe that's the point ...
vijayan121
Veteran Poster

  #4
Jun 1st, 2007
You could use the Boost Tokenizer library. Here are a few links:
http://www.boost.org/libs/tokenizer/index.html
http://www-eleves-isia.cma.fr/docume...r/examples.cpp

You could also use the Boost String Algorithms library (if the file is read line by line into a string):
http://www.boost.org/doc/html/string_algo.html
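
For comparison, the line-by-line idea can also be sketched with only the C++ standard library. This is not the Boost API, and it is not necessarily what the linked examples show. The sketch assumes a tag never spans more than one line; `TokenKind`, `Token`, and `tokenize` are hypothetical names invented for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token categories chosen for this sketch.
enum class TokenKind { Tag, Number, Word, Punct };

struct Token {
    TokenKind kind;
    std::string text;
};

// Split one line into tags (<...>), digit runs, letter runs, and
// single punctuation characters. Whitespace is skipped.
std::vector<Token> tokenize(const std::string& line) {
    std::vector<Token> tokens;
    const std::size_t n = line.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isspace(c)) {
            ++i;
            continue;
        }
        const std::size_t start = i;
        if (c == '<') {
            // Assumption: a tag opens and closes on the same line.
            const std::size_t close = line.find('>', i);
            if (close != std::string::npos) {
                tokens.push_back({TokenKind::Tag, line.substr(i, close - i + 1)});
                i = close + 1;
                continue;
            }
            // No closing '>': fall through and treat '<' as punctuation.
        }
        if (std::isdigit(c)) {
            while (i < n && std::isdigit(static_cast<unsigned char>(line[i]))) ++i;
            tokens.push_back({TokenKind::Number, line.substr(start, i - start)});
        } else if (std::isalpha(c)) {
            while (i < n && std::isalpha(static_cast<unsigned char>(line[i]))) ++i;
            tokens.push_back({TokenKind::Word, line.substr(start, i - start)});
        } else {
            tokens.push_back({TokenKind::Punct, std::string(1, line[i])});
            ++i;
        }
    }
    return tokens;
}
```

Calling `tokenize("XIV. What he thought")` yields the word `XIV`, the punctuation `.`, and then three more words. Roman numerals come out as words because the sketch only distinguishes letters from digits, and a hyphenated word such as `Cheese-Dairies` is split into two words around a `-` token.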
katerinaaa
Newbie Poster

  #5
Jun 1st, 2007
The file looks something like this:

<P ID=1>
CONTENTS
</P>
<P ID=2>
VOLUME I
</P>
<P ID=3>
BOOK FIRST.--A JUST MAN
</P>
<P ID=4>
CHAPTER
I. M. Myriel
II. M. Myriel becomes M. Welcome
III. A Hard Bishopric for a Good Bishop
IV. Works corresponding to Words
V. Monseigneur Bienvenu made his Cassocks last too long
VI. Who guarded his House for him
VII. Cravatte
VIII. Philosophy after Drinking
IX. The Brother as depicted by the Sister
X. The Bishop in the Presence of an Unknown Light
XI. A Restriction
XII. The Solitude of Monseigneur Welcome
XIII. What he believed
XIV. What he thought
</P>
<P ID=5>
BOOK SECOND.--THE FALL
</P>
<P ID=6>
I. The Evening of a Day of Walking
II. Prudence counselled to Wisdom
III. The Heroism of Passive Obedience
IV. Details concerning the Cheese-Dairies of Pontarlier
V. Tranquillity
VI. Jean Valjean
VII. The Interior of Despair
VIII. Billows and Shadows
IX. New Troubles
X. The Man aroused
XI. What he does
XII. The Bishop works
XIII. Little Gervais
</P>
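
For input shaped like the sample above, one possible first step is to group the lines by paragraph ID before splitting them into tokens. The following is a minimal sketch, assuming every `<P ID=n>` and `</P>` tag sits alone on its own line exactly as shown; `Paragraph` and `readParagraphs` are hypothetical names invented for illustration.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying the function on a string
#include <string>
#include <vector>

// Hypothetical record type for this sketch: one <P ID=n> ... </P> block.
struct Paragraph {
    int id;
    std::vector<std::string> lines;
};

// Read blocks from `in`. Lines outside any <P ...> block are ignored.
std::vector<Paragraph> readParagraphs(std::istream& in) {
    std::vector<Paragraph> result;
    const std::string openPrefix = "<P ID=";
    std::string line;
    bool inside = false;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();

        if (line.compare(0, openPrefix.size(), openPrefix) == 0 &&
            line.back() == '>') {
            Paragraph p;
            // std::stoi reads the leading digits and stops at '>';
            // it throws std::invalid_argument if no digits follow.
            p.id = std::stoi(line.substr(openPrefix.size()));
            result.push_back(p);
            inside = true;
        } else if (line == "</P>") {
            inside = false;
        } else if (inside) {
            result.back().lines.push_back(line);
        }
    }
    return result;
}
```

The function accepts any `std::istream`, so it can be fed a `std::ifstream` opened on the input file or a `std::istringstream` for quick experiments. Each collected line can then be split into words and numbers by whatever per-line method is chosen.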
Last edited by katerinaaa; Jun 1st, 2007 at 3:26 am.
Attached file: input.txt (21.5 KB)