Hi!

I'm trying to search an HTML page for a specific word and print the result that's x rows after that word.

If the HTML page looks like:
"Hello welcome to daniweb"

I want to search for "welcome" and find out what the next word is ("to" should be the result).

I know how to get the whole line where the search word is located, as seen below. But that's it; I don't have a clue how to do what I want.

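import re
import urllib.request

# Download the page (assumes it is UTF-8 encoded)
sock = urllib.request.urlopen("http://blabla.com")
htmls = sock.read().decode("utf-8")
sock.close()

keyword = re.compile(r"welcome")

# Print every line that contains the keyword
for line in htmls.splitlines():
    if keyword.search(line):
        print(line)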

/Pluring

All 4 Replies

Using your example string, we can split it at the whitespace using split(), and then locate the index of "welcome" using index(). Finally, indexing the list at the result of index() + 1 will give us "to".

Finding a row of text x lines further down should be a similar process, as long as the lines are stored in a list. The main difference is that you iterate over the lines and search each one for the desired word; the index of the matching line plus x then gives you the desired line.
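
The line-based variant could be sketched roughly as follows. The function name `line_after`, its parameters, and the sample text are all made up for illustration; it assumes the page has already been read into a string and split into lines with splitlines().

```python
def line_after(lines, keyword, offset):
    """Return the line `offset` lines after the first line containing `keyword`.

    Returns None if the keyword is never found, or if the target line
    would fall outside the list.
    """
    for i, line in enumerate(lines):
        # Match whole whitespace-separated words only
        if keyword in line.split():
            target = i + offset
            if 0 <= target < len(lines):
                return lines[target]
            return None
    return None


# Invented sample text standing in for the downloaded page
page = "Hello\nwelcome to DaniWeb\nfirst row\nsecond row"
print(line_after(page.splitlines(), "welcome", 2))  # second row
```

Returning None instead of raising an error is just one possible choice; it keeps the calling code simple when the word is missing.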

Thanks jlm699!

I'm very new to both Python and programming.
Can you give me some pointers with the code?

Here are some examples of using the above principles:

>>> example_msg = "Hello, welcome to DaniWeb"
>>> example_words = example_msg.split()
>>> index = example_words.index('welcome')
>>> example_words[index + 1]
'to'
>>>
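
One thing to watch out for: index() raises a ValueError when the word is not in the list, and index + 1 raises an IndexError when the word is the last one. A small helper that guards against both might look like the sketch below. The name `word_after` and its `offset` parameter are hypothetical, not part of any library.

```python
def word_after(text, keyword, offset=1):
    """Return the word `offset` positions after the first `keyword`, or None."""
    words = text.split()
    try:
        position = words.index(keyword) + offset
    except ValueError:
        # The keyword does not occur in the text
        return None
    if 0 <= position < len(words):
        return words[position]
    # The requested position falls outside the list of words
    return None


print(word_after("Hello, welcome to DaniWeb", "welcome"))  # to
print(word_after("Hello, welcome to DaniWeb", "DaniWeb"))  # None
print(word_after("Hello, welcome to DaniWeb", "goodbye"))  # None
```

Because split() only separates on whitespace, punctuation stays attached to the words ("Hello," keeps its comma), so the keyword has to match a word exactly as it appears in the text.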

Hope that's clear enough.

Thanks for the help and your time.
Think I got it now :)
