I want to extract the following link "http://media1.santabanta.com/full5/indian celebrities(f)/aalesha/aalesha-1a.jpg".
The problem is that the link you want is loaded by JavaScript.
The link is still visible in the downloaded page text, so we can skip simulating JavaScript and use a regex instead (BeautifulSoup can't find things inside JavaScript).
from urllib2 import urlopen
from BeautifulSoup import BeautifulSoup
import re
webpage = urlopen('http://www.santabanta.com/photos/aalesha/10066001.htm')
soup = BeautifulSoup(webpage)
#print soup
# Non-greedy (.*?) stops at the first closing quote instead of the last one on the line
bac_img = re.search(r"""backgroundImage="url\('(.*?)'""", str(soup))
print bac_img.group(1)
#http://media1.santabanta.com/full1/Indian Celebrities(F)/Aalesha/aalesha-1a.jpg
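
The regex step above can also be tried without hitting the site. This is only a minimal sketch, written for Python 3 (the code in this post is Python 2). `SAMPLE_HTML`, the element id `wall`, and the helper name `find_background_url` are made up for illustration, and the real page source may look different.

```python
import re

# Hypothetical stand-in for the downloaded page text; the real markup may differ.
SAMPLE_HTML = """
<html><body>
<script type="text/javascript">
document.getElementById('wall').style.backgroundImage="url('http://media1.santabanta.com/full1/Indian Celebrities(F)/Aalesha/aalesha-1a.jpg')";
</script>
</body></html>
"""

# The non-greedy group captures everything up to the first closing quote.
BACKGROUND_RE = re.compile(r"""backgroundImage="url\('(.*?)'""")


def find_background_url(html):
    """Return the URL inside backgroundImage="url('...')", or None if absent."""
    match = BACKGROUND_RE.search(html)
    return match.group(1) if match else None


print(find_background_url(SAMPLE_HTML))
```

Returning None when nothing matches avoids the AttributeError you would get from calling group(1) on a failed search.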
# Example of how to print the location of images that are not loaded by JavaScript
'''
imagelocation = soup.findAll('img')
for imgTag in imagelocation:
    print imgTag['src']
'''
snippsat