Is there some way I can parse badly written HTML code in Python? I want to get some info from a web page which uses HTML tables for its formatting, and I found numerous flaws in the code using the W3C validator. Can I parse this code in Python?
For those of you who use Python 3:
BeautifulSoup works fine with Python 3 if you copy BeautifulSoup.py (version 3.0.7a or lower)
and sgmllib.py (typically found in C:\Python25\Lib)
to a separate directory and convert both files with 2to3.py.
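If you would rather avoid the conversion step, here is a rough sketch of a different approach: it does not use BeautifulSoup at all, but the html.parser module that ships with Python 3, which does not raise on unclosed or misnested tags. The names TableCellParser and parse_table_rows are made up for this example, and the sample markup is invented, not taken from the page in the question. It only handles simple, non-nested tables.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row.

    Hypothetical helper: it copes with missing </td> and </tr> tags by
    closing the open cell or row whenever a new one starts.
    """

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def _close_cell(self):
        # Store the open cell (if any) in the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        # Store the open row (if it has any cells) in the result.
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "tr":
            self._close_row()
            self._row = []
        elif tag in ("td", "th"):
            self._close_cell()
            if self._row is None:  # cell with no enclosing <tr>
                self._row = []
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        # Keep text only while inside a cell.
        if self._cell is not None:
            self._cell.append(data)


def parse_table_rows(html):
    """Return the table rows found in html as lists of cell strings."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    parser._close_row()  # flush a row left open at the end of the input
    return parser.rows


# Invented sample with missing </td>, </tr> and </b> tags.
bad_html = ("<table><tr><td>Name<td>Age</tr>"
            "<tr><td><b>Ann</td><td>30<tr><td>Bob<td>25</table>")
for row in parse_table_rows(bad_html):
    print(row)
```

For pages with nested tables or heavier damage, a dedicated library such as BeautifulSoup is probably the safer choice; this is only meant to show that the standard library can get you started.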
Thanks. Both posts are useful because I use Python 3, and I'm going to look into Beautiful Soup. (For anyone else reading this thread: "bad HTML code" refers to badly constructed code, although this code displays well enough in Firefox.)