Python Regular Expression Help
Hi all,
I want to extract a certain link from a web page using a Python regular expression.
The scenario is like this:
The code:
blah...
...
....
http://www.test.com/file.ext " style="top:0px;width:100%;"
....
blah
blah
blah
I want to extract the URL "http://www.test.com/file.ext" from the page using a Python regular expression.
Thanks in advance!
debasishgang7
Junior Poster in Training
91 posts since Oct 2009
Reputation Points: 10
Solved Threads: 0
snippsat
Practically a Posting Shark
808 posts since Aug 2008
Reputation Points: 353
Solved Threads: 294
Well, thanks for your suggestion, but in this case it's not working. I am getting an "IndexError: list index out of range" error. Maybe it's because I am trying it on a huge page. One more thing: this part of the HTML code is inactive, meaning it's between these tags.
I would be very thankful if you could solve this with a regular expression that extracts the URL between
debasishgang7
What have you tried? It looks like a simple match between 'start and end tags'.
pyTony
pyMod
5,359 posts since Apr 2010
Reputation Points: 782
Solved Threads: 852
Gribouillis
Posting Maven
2,786 posts since Jul 2008
Reputation Points: 1,044
Solved Threads: 691
Well, thanks for your suggestion, but in this case it's not working. I am getting an "IndexError: list index out of range" error. Maybe it's because I am trying it on a huge page. One more thing: this part of the HTML code is inactive, meaning it's between these tags.
That may be because you are making an error; it's impossible to say without seeing some code.
Regex... no, but here is something you can look at.
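Since the failing code isn't shown, this is only a guess at the cause: an "IndexError: list index out of range" typically appears when the result of re.findall is indexed with [0] although nothing matched. A minimal sketch of a guard, where the pattern and the helper name first_link are hypothetical and chosen just for illustration:

```python
import re

# Hypothetical pattern and helper name, for illustration only.
LINK_PATTERN = re.compile(r'class="test" src="(.*?)"')

def first_link(html):
    """Return the first captured URL, or None when nothing matches."""
    matches = LINK_PATTERN.findall(html)
    # Indexing an empty list with [0] raises IndexError, so check first.
    if not matches:
        return None
    return matches[0]

# No matching attribute in this text, so the helper returns None
# instead of raising.
print(first_link('<p>no matching attribute here</p>'))
```

If the helper returns None on the real page, the pattern simply doesn't match that page's markup, which is easier to debug than a traceback.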
>>> import re
>>> re.findall(r'class="test" src="(.*?)"', html)
['http://www.test.com/file.ext']
>>> ''.join(re.findall(r'class="test" src="(.*?)"', html))
'http://www.test.com/file.ext'
>>>
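The session above assumes a variable html that already holds the page source. A self-contained sketch follows; the sample markup is an assumption, since the original tags were not preserved in the thread (the iframe tag name and the comment wrapper are guesses). It also tolerates the stray space before the closing quote seen in the question's snippet:

```python
import re

# Assumed sample page: the tag name (iframe) and the comment wrapper are
# guesses, because the original markup did not survive in the thread.
html = '''
<html><body>
<!--
<iframe class="test" src="http://www.test.com/file.ext " style="top:0px;width:100%;"></iframe>
-->
</body></html>
'''

# A regex works on the raw text, so it also matches inside an HTML comment.
# The \s* on both sides of the group drops whitespace around the URL.
pattern = re.compile(r'class="test"\s+src="\s*(.*?)\s*"')
urls = pattern.findall(html)
print(urls)  # ['http://www.test.com/file.ext']
```

Because the pattern is applied to plain text, it makes no difference whether the markup is active or commented out, which is where an HTML parser would behave differently.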
snippsat