Hi, I've used the BeautifulSoup module to parse a site and grab an img tag from it, but the problem is that BeautifulSoup does not return the whole content of the given URL when parsing. The part that goes missing contains the location of the image I want to download:
    from urllib2 import urlopen
    from BeautifulSoup import BeautifulSoup

    # reading the webpage source
    webpage = urlopen('http://www.santabanta.com/photos/aalesha/10066001.htm').read()

    # putting all the webpage content into a variable named soup using BeautifulSoup
    soup = BeautifulSoup(''.join(webpage))
    print soup

    # finding all the img tags
    imagelocation = soup.findAll('img')

    # printing the img content
    for i in imagelocation:
        print i
I want to extract the following link: "http://media1.santabanta.com/full5/indian celebrities(f)/aalesha/aalesha-1a.jpg". If you look at the source code of the webpage you will find the <img> tag at line no. 234, but it is not present after parsing the page with BeautifulSoup. When I call soup.prettify() I get the whole webpage parsed; otherwise some fields are missing. Can someone tell me what I'm doing wrong?
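For reference, the comparison I'm describing (img tags present in the raw source versus img tags the parser reports) can be sketched roughly as below. This is only a minimal sketch under assumptions: it uses Python 3's standard-library html.parser rather than BeautifulSoup, the sample_html string is a made-up stand-in for the real page, and ImgCollector and compare_img_counts are hypothetical names chosen for illustration.

```python
import re
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag the parser sees."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased
        if tag == "img":
            self.sources.append(dict(attrs).get("src"))


def compare_img_counts(html):
    """Return (number of '<img' occurrences in the raw text, src values seen by the parser)."""
    raw_count = len(re.findall(r"<img\b", html, flags=re.IGNORECASE))
    collector = ImgCollector()
    collector.feed(html)
    collector.close()
    return raw_count, collector.sources


# Hypothetical stand-in for the downloaded page, not the real site markup.
sample_html = """
<html><body>
<img src="logo.gif">
<div><img src="photos/example-1a.jpg" alt="example"></div>
</body></html>
"""

raw_count, parsed_sources = compare_img_counts(sample_html)
print(raw_count, parsed_sources)
```

If the raw count is larger than the number of sources the parser reports, the tag is being lost during parsing rather than being absent from the download.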