Python (http://www.daniweb.com/forums/forum114.html)
| ccandillo | Nov 11th, 2008 10:43 pm | |
| screen scraping
I am trying to run the following screen scraping script, but it's not displaying any output. Can someone tell me what I'm doing wrong?

from BeautifulSoup import BeautifulSoup
import urllib

url = 'http://toronto.en.craigslist.ca/search/cta?query=civic&minAsk=min&maxAsk=max'
doc = urllib.urlopen(url).read()
soup = BeautifulSoup(doc)
tags = soup.findAll('p')
for tag in tags:
    addate = tag.contents[0]
    path = tag.contents[1].attrs[0][1]
    desc = tag.next.next.string
    print addate, path, desc
|
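For readers on Python 3, where `print` is a function and the `BeautifulSoup` 3 import used in the post is not available, the same three fields can be sketched with the standard library alone. This is only an illustration under stated assumptions: `SAMPLE_HTML` is a hypothetical stand-in built from the values seen in this thread (the real Craigslist markup may differ), and `ListingParser` and `parse_listings` are names invented here, not part of the original script.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; the structure of the real page is an assumption.
SAMPLE_HTML = """
<p> Nov 12 - <a href="/tor/cto/915709511.html">FS: 2004 Honda Civic Si Low Km - $12500 -</a></p>
<p> Nov 11 - <a href="/yrk/cto/915504337.html">95 civic ex coupe -</a></p>
"""


class ListingParser(HTMLParser):
    """Collect (date text, link path, link text) for each <p> element."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished (addate, path, desc) tuples
        self._row = None        # fields of the <p> being parsed, or None
        self._in_link = False   # True while inside the first <a> of a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._row = {"addate": "", "path": None, "desc": ""}
        elif tag == "a" and self._row is not None and self._row["path"] is None:
            # Only the first link in each paragraph is used.
            self._row["path"] = dict(attrs).get("href")
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "p" and self._row is not None:
            row = self._row
            self.rows.append(
                (row["addate"].strip(), row["path"], row["desc"].strip())
            )
            self._row = None

    def handle_data(self, data):
        if self._row is None:
            return
        if self._in_link:
            self._row["desc"] += data
        elif self._row["path"] is None:
            # Text before the first link plays the role of tag.contents[0].
            self._row["addate"] += data


def parse_listings(html):
    """Return a list of (addate, path, desc) tuples found in html."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.rows


for addate, path, desc in parse_listings(SAMPLE_HTML):
    print(addate, path, desc)
```

To run it against a live page, the HTML string could come from `urllib.request.urlopen(url).read()` decoded to text, assuming the page still serves one `<p>` per listing.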
| jlm699 | Nov 12th, 2008 7:00 am | |
| Re: screen scraping
I ran the code unmodified and got this:

Quote: Originally Posted by output
Nov 12 - /tor/cto/915709511.html FS: 2004 Honda Civic Si Low Km - $12500 -
Nov 12 - /tor/cto/915669421.html FS; 1993 HONDA CIVIC CX HATCHBACK (EG) - $850 -
Nov 12 - /tor/cto/915654012.html FS: 1997 HONDA CIVIC CX HATCHBACK - $1500 -
Nov 11 - /yrk/cto/915504337.html 95 civic ex coupe -
Nov 11 - /mss/cto/915500509.html 997 HONDA CIVIC -
Nov 11 - /tor/cto/915425141.html 2006 honda civic DX-g - $15000 -
Nov 11 - /yrk/cto/915372101.html 1999 HONDA CIVIC EX, 4 DOOR, AUTO, $4000 !! - $4000 -
... (Continues for 142 lines)

How are you running the code? If you're just double-clicking the .py file, perhaps the console is closing before you're able to capture the output... |
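If the console window really is closing too quickly, one common workaround is to wait for a key press at the end of the script. The sketch below is written for Python 3 and is only an illustration: `pause_before_exit` is a hypothetical helper name, and it skips the pause when input is not an interactive terminal so that piped or redirected runs do not hang.

```python
import sys


def pause_before_exit(stream=None, prompt="Press Enter to exit..."):
    """Wait for Enter if input comes from an interactive console.

    Returns True if it paused, False if the pause was skipped.
    """
    if stream is None:
        stream = sys.stdin
    # sys.stdin can be None (for example under pythonw.exe), and a pipe or
    # redirected file is not a terminal; in both cases there is nothing
    # to wait for.
    if stream is None or not stream.isatty():
        return False
    print(prompt, end="", flush=True)
    stream.readline()
    return True


# At the very end of the scraping script, one would call:
# pause_before_exit()
```

Running the script from an already open console, or redirecting its output to a file, avoids the problem without changing the code.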
| ccandillo | Nov 12th, 2008 9:54 am | |
| Re: screen scraping
My bad. I was running the code from IDLE and kept getting a 'RuntimeError: maximum recursion depth exceeded' error message. I am not quite sure why, but it works from the console. Thanks! |
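The thread leaves the IDLE error unexplained. One possible factor, which nothing in this thread confirms, is that BeautifulSoup's text nodes are string subclasses that keep references back into the parse tree, and an environment that processes printed objects itself may end up walking those references. Under that assumption, a common precaution is to convert each field to a plain string before printing. The Python 3 sketch below illustrates the idea with a hypothetical `TextNode` class standing in for a parsed text node; `to_plain` is likewise a name invented here.

```python
import sys


class TextNode(str):
    """Hypothetical stand-in for a parsed text node.

    It behaves like a string but also carries a reference to its parent.
    """

    def __new__(cls, value, parent=None):
        node = super().__new__(cls, value)
        node.parent = parent
        return node


def to_plain(value):
    """Return an exact built-in str with no extra attributes attached."""
    return str(value)


tree = {"name": "p"}  # placeholder for the surrounding parse tree
node = TextNode("Nov 12 -", parent=tree)
plain = to_plain(node)

# The node keeps its link into the tree; the plain copy does not.
print(type(node).__name__, hasattr(node, "parent"))
print(type(plain).__name__, hasattr(plain, "parent"))

# The limit that the RuntimeError refers to can be inspected like this.
print("recursion limit:", sys.getrecursionlimit())
```

In the original Python 2 script the equivalent step would be wrapping each value in `unicode(...)` before the `print` statement. Whether that removes the error in IDLE is an assumption that would need to be tested.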
All times are GMT -4.