Hi.
How can I make my crawler print only the text of all <li></li> tags on a page at a given URL?
I want to save the text of all <li></li> tags to a text file (without the <li></li>
tags themselves).
Niloofar24
15
Posting Whiz
Recommended Answers
The problem is that your html variable is just a string containing this value:
https://www.daniweb.com/software-development/python/threads/492669/how-to-print-only-the-content-of-all-tags-from-a-url-page
and not the actual HTML code. Use urllib2, the library you have already imported, to fetch the code from that page.
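To make the point above concrete, here is a minimal sketch of fetching the page before parsing it. It is written for Python 3, where urllib2's urlopen lives in urllib.request; the function name fetch_html and the timeout value are assumptions made for illustration, not something from the original post.

```python
from urllib.request import urlopen  # Python 3 home of what urllib2.urlopen did


def fetch_html(url, timeout=10):
    """Download the resource at `url` and return its body as text.

    `fetch_html` and `timeout` are hypothetical names chosen for this sketch.
    """
    with urlopen(url, timeout=timeout) as response:
        # Use the charset the server declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# The mistake described above: this is only the address, not the page.
url = "https://www.daniweb.com/software-development/python/threads/492669/how-to-print-only-the-content-of-all-tags-from-a-url-page"
html = url  # still just a string holding the URL

# What is needed instead (requires network access, so left commented out):
# html = fetch_html(url)
```

Once html holds the real markup, it can be handed to whatever does the <li> extraction.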
"Use regular expressions"
No, no, no, just to make it clear :)
I have to post this link again.
Use a parser, BeautifulSoup or lxml.
from bs4 …
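As a rough illustration of the "use a parser" advice, the sketch below collects the text of every <li> and writes it to a file. To stay dependency-free it swaps in the standard library's html.parser instead of BeautifulSoup or lxml, and it assumes every <li> has a closing tag. The names LiTextParser, extract_li_text, save_lines, and the sample markup are all hypothetical.

```python
from html.parser import HTMLParser


class LiTextParser(HTMLParser):
    """Collect the text inside each <li> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.items = []   # finished <li> texts, in the order the tags close
        self._open = []   # one list of text chunks per currently open <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._open.append([])

    def handle_endtag(self, tag):
        if tag == "li" and self._open:
            # Join the chunks and collapse runs of whitespace.
            text = " ".join("".join(self._open.pop()).split())
            if text:
                self.items.append(text)

    def handle_data(self, data):
        # Text counts toward every <li> that is open, including outer ones.
        for chunks in self._open:
            chunks.append(data)


def extract_li_text(html):
    """Return a list with the text of each <li> found in `html`."""
    parser = LiTextParser()
    parser.feed(html)
    parser.close()
    return parser.items


def save_lines(lines, path):
    """Write one entry per line to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as out:
        for line in lines:
            out.write(line + "\n")


sample = "<ul><li>First <b>item</b></li><li> Second item </li></ul>"
print(extract_li_text(sample))  # ['First item', 'Second item']
```

With BeautifulSoup installed, roughly the same result should come from `[li.get_text() for li in BeautifulSoup(html, "html.parser").find_all("li")]`, and it copes better with sloppy markup such as unclosed <li> tags.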
All 8 Replies
Slavi
94
Master Poster
Featured Poster
snippsat
661
Master Poster
Slavi commented:
Great read, but this 'Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins'... he's gone too far =D
+6