Help with stripping characters from a random html output?

Question

ilikepaste 0 Newbie Poster

13 Years Ago

import urllib.request

page = urllib.request.urlopen("http://www.randompickupline.com/")
text = page.read().decode("utf8")

where = text.find('<p id="pickupline">')
start_of_line = where + 18
end_of_line = start_of_line + 150

line = (text[start_of_line:end_of_line])

print (line)

This is a basic HTML import for a text-based game I'm writing for fun, and I'm building a dynamic string from it. However, the length of the target text is random (it can vary from 25 to 132 characters). I'd like to know how to remove the characters after the target text. The desired text output is red, and the characters I wish to remove are green:

<h1>Pickup Line</h1><p id="pickupline">Look at my lips and your lips. They want to massage each other.</p>
<p id="directlink"><a href="http://www.randompickupline.com/pickupline.php?zid=1089" target="new">Direct link to this line: http://www.randompickupline.com/pickupline.php?zid=1089</a><br /></p>

snippsat 661 Master Poster

13 Years Ago

Try this.

import urllib.request

page = urllib.request.urlopen("http://www.randompickupline.com/")
text = page.read().decode("utf8")

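where = text.find('<p id="pickupline">')
start_of_line = where + 19  # the opening tag is 19 characters long, so skip all of it
end_of_line = start_of_line + 150  # window wide enough for the longest line (up to 132 characters)
line = text[start_of_line:end_of_line]

# Cut the window at the first closing tag so only the pickup line remains
pick_text = line.find('</p>')
print(line[:pick_text])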

"""Out-->
Do I know you? (No.) That's a shame, I'd sure like to.
"""
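A variation on the same idea that avoids the hard-coded 19 and the 150-character window: take the offset from len() of the opening tag and search for the closing tag directly. This is only a sketch. The helper name extract_between and the sample string are made up for illustration, and it assumes the page keeps the line inside <p id="pickupline">...</p>.

```python
def extract_between(text, open_tag='<p id="pickupline">', close_tag='</p>'):
    """Return the text between open_tag and the next close_tag, or None."""
    start = text.find(open_tag)
    if start == -1:
        return None  # opening tag not found
    start += len(open_tag)  # move past the opening tag itself
    end = text.find(close_tag, start)  # first closing tag after the start
    if end == -1:
        return None  # closing tag not found
    return text[start:end]


# Sample markup modelled on the page source quoted in this thread
sample = ('<h1>Pickup Line</h1><p id="pickupline">Are you accepting '
          'applications for your fan club?</p>\n'
          '<p id="directlink"><a href="#">Direct link</a><br /></p>')
print(extract_between(sample))
```

With the real page you would pass the decoded text from urlopen instead of sample. Returning None when a tag is missing avoids silently slicing from the wrong position when find() returns -1.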

When it comes to parsing websites, a good parser like BeautifulSoup or lxml can make it easier.
Here is an example that does what you want. This is for Python 2.x:

from BeautifulSoup import BeautifulSoup
import urllib2

url = urllib2.urlopen("http://www.randompickupline.com/")
soup = BeautifulSoup(url)
tag = soup.find("p", {"id": "pickupline"})
print tag.text

"""Out-->
It's dark in here. Wait! It's because all of the light is shining on you.
"""
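For Python 3 without any third-party install, roughly the same thing can be sketched with the standard library's html.parser. Note this is a swap-in for BeautifulSoup, not what the example above uses. The class name PickupLineParser, the helper pickup_line and the sample string are made up for illustration, and the sketch assumes the line sits in a <p id="pickupline"> tag with no nested <p> inside it.

```python
from html.parser import HTMLParser


class PickupLineParser(HTMLParser):
    """Collect the text inside the <p> tag whose id matches target_id."""

    def __init__(self, target_id="pickupline"):
        super().__init__()
        self.target_id = target_id
        self.inside = False  # True while we are inside the target <p>
        self.chunks = []     # text fragments found inside the target <p>

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "p" and dict(attrs).get("id") == self.target_id:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)


def pickup_line(html):
    """Return the stripped text of the target paragraph ('' if absent)."""
    parser = PickupLineParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Sample markup modelled on the page source quoted in this thread
sample = ('<h1>Pickup Line</h1><p id="pickupline">Do I know you? (No.) '
          "That's a shame, I'd sure like to.</p>\n"
          '<p id="directlink"><a href="#">Direct link</a><br /></p>')
print(pickup_line(sample))
```

To use it on the live page, pass the decoded text from urllib.request.urlopen to pickup_line instead of sample.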

ilikepaste 0 Newbie Poster · Answer 1 · 2011-01-17T14:17:13+00:00

EDIT*** In retrospect, I don't think I was very clear in my post. The target text (in green) is random, while the string I want to get rid of (in red) is fixed. So the whole string could look like this:

<h1>Pickup Line</h1><p id="pickupline">Are you accepting applications for your fan club?</p>
<p id="directlink"><a href="http://www.randompickupline.com/pickupline.php?zid=1089" target="new">Direct link to this line: http://www.randompickupline.com/pickupline.php?zid=1089</a><br /></p>

or this:

<h1>Pickup Line</h1><p id="pickupline">Do you wanna come to the Marines, or would your rather have a Marine come into you?</p>
<p id="directlink"><a href="http://www.randompickupline.com/pickupline.php?zid=1089" target="new">Direct link to this line: http://www.randompickupline.com/pickupline.php?zid=1089</a><br /></p>

richieking 44 Master Poster · Answer 2 · 2011-01-17T17:52:01+00:00

yep snippsat.
good job ;)

ilikepaste 0 Newbie Poster · Answer 3 · 2011-01-18T00:07:57+00:00

This worked perfectly, thanks so much snippsat!
