
My code is:

import urllib
import re

# URL of the page to scrape
urla = 'http://www.sc.iitb.ac.in/~bijnan/personal-details.htm'
# open a connection to the website
request = urllib.urlopen(urla)
# read the HTML source of the page
html = request.read()
# TODO: extract the addresses from the HTML above

print html

How can I find all of his addresses in this HTML? I think I can use the re.compile() method to match the address pattern, but how?

Last Post by djidjadji

The page was made with Microsoft Word.
The WORST HTML editor there is.

A good candidate to use is the HTMLParser module.
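As a rough sketch of how HTMLParser could be used here: the subclass below collects only the visible text of a page and ignores anything inside style or script tags. The names TextCollector and visible_text are made up for this example, and the import is written so it works on both Python 2 (module HTMLParser, as in this thread) and Python 3 (module html.parser). On Python 3 the bytes returned by read() would need to be decoded to a string before being fed to the parser.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2, as used in this thread


class TextCollector(HTMLParser):
    """Hypothetical helper: gather visible text, skipping <style>/<script>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.skip_depth = 0   # > 0 while inside a <style> or <script> element
        self.chunks = []      # non-empty pieces of visible text, in page order

    def handle_starttag(self, tag, attrs):
        if tag in ('style', 'script'):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ('style', 'script') and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the list of visible text chunks found in an HTML string."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Once the text chunks are in a plain list, it should be much easier to see which ones belong to an address, and a pattern built with re.compile() can then be tried on the text alone instead of on the raw markup.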

First clean up the file and get rid of all the style stuff and comments.
Then you have a better view of what the file is made up of.
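One possible way to do that clean-up, sketched with re.compile() since the question mentions it. The function name clean_word_html and the three patterns are assumptions made for this example, not something taken from the page itself. They are deliberately simple and could also hit matching text outside of tags, so treat the result as a reading aid rather than a faithful copy of the page.

```python
import re

# HTML comments, including Word's <!--[if gte mso 9]> ... <![endif]--> blocks
COMMENT_RE = re.compile(r'<!--.*?-->', re.DOTALL)
# whole <style> ... </style> elements
STYLE_RE = re.compile(r'<style\b.*?</style>', re.DOTALL | re.IGNORECASE)
# inline style/class/lang attributes, quoted or unquoted
ATTR_RE = re.compile(
    r'\s+(?:style|class|lang)\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)',
    re.IGNORECASE)


def clean_word_html(html):
    """Strip comments, style blocks and styling attributes from an HTML string."""
    html = COMMENT_RE.sub('', html)   # comments first: Word hides CSS inside them
    html = STYLE_RE.sub('', html)
    html = ATTR_RE.sub('', html)
    return html
```

Printing clean_word_html(html) instead of the raw html should leave mostly bare tags and text, which makes the structure around the addresses easier to spot.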
