0

I am trying to run the following screen scraping script but it's not displaying any output. Can someone tell me what I'm doing wrong?

from BeautifulSoup import BeautifulSoup
import urllib

url = 'http://toronto.en.craigslist.ca/search/cta?query=civic&minAsk=min&maxAsk=max'

doc = urllib.urlopen(url).read()
soup = BeautifulSoup(doc)
tags = soup.findAll('p')
for tag in tags:
    addate = tag.contents[0]
    path = tag.contents[1].attrs[0][1]
    desc = tag.next.next.string
    print addate, path, desc
2
Contributors
2
Replies
4
Views
8 Years
Discussion Span
Last Post by ccandillo
0

I ran the code unmodified and got this:

Nov 12 - /tor/cto/915709511.html FS: 2004 Honda Civic Si Low Km - $12500 -
Nov 12 - /tor/cto/915669421.html FS; 1993 HONDA CIVIC CX HATCHBACK (EG) - $850
-
Nov 12 - /tor/cto/915654012.html FS: 1997 HONDA CIVIC CX HATCHBACK - $1500 -
Nov 11 - /yrk/cto/915504337.html 95 civic ex coupe -
Nov 11 - /mss/cto/915500509.html 997 HONDA CIVIC -
Nov 11 - /tor/cto/915425141.html 2006 honda civic DX-g - $15000 -
Nov 11 - /yrk/cto/915372101.html 1999 HONDA CIVIC EX, 4 DOOR, AUTO, $4000 !! -
$4000 -
... (Continues for 142 lines)

How are you running the code? If you're just double clicking the .py file perhaps the console is closing before you're able to capture the output...

0

My bad. I was running the code from idle and kept getting a 'RuntimeError: maximum recursion depth exceeded' error message. I am not quite sure why but it works from the console. Thanks!

This question has already been answered. Start a new discussion instead.
Have something to contribute to this discussion? Please be thoughtful, detailed and courteous, and be sure to adhere to our posting rules.