screen scraping

Question

ccandillo 0 Newbie Poster

16 Years Ago

I am trying to run the following screen scraping script but it's not displaying any output. Can someone tell me what I'm doing wrong?

from BeautifulSoup import BeautifulSoup
import urllib

url = 'http://toronto.en.craigslist.ca/search/cta?query=civic&minAsk=min&maxAsk=max'

doc = urllib.urlopen(url).read()
soup = BeautifulSoup(doc)
tags = soup.findAll('p')
for tag in tags:
    addate = tag.contents[0]
    path = tag.contents[1].attrs[0][1]
    desc = tag.next.next.string
    print addate, path, desc

python


All 2 Replies

jlm699 320 Veteran Poster

16 Years Ago

I ran the code unmodified and got this:

Nov 12 - /tor/cto/915709511.html FS: 2004 Honda Civic Si Low Km - $12500 -
Nov 12 - /tor/cto/915669421.html FS; 1993 HONDA CIVIC CX HATCHBACK (EG) - $850 -
Nov 12 - /tor/cto/915654012.html FS: 1997 HONDA CIVIC CX HATCHBACK - $1500 -
Nov 11 - /yrk/cto/915504337.html 95 civic ex coupe -
Nov 11 - /mss/cto/915500509.html 997 HONDA CIVIC -
Nov 11 - /tor/cto/915425141.html 2006 honda civic DX-g - $15000 -
Nov 11 - /yrk/cto/915372101.html 1999 HONDA CIVIC EX, 4 DOOR, AUTO, $4000 !! - $4000 -
... (Continues for 142 lines)

How are you running the code? If you're just double-clicking the .py file, the console may be closing before you get a chance to see the output.
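If the double-click case is what's happening, one common workaround is to make the script wait for a keypress before it exits. Here is a minimal sketch of that idea. The `pause` helper and its `read_line` parameter are names made up for this illustration, not part of the original script, and it only uses `sys.stdin`/`sys.stdout` so it should behave the same on Python 2 and 3.

```python
import sys


def pause(read_line=None):
    """Wait for Enter so a double-clicked console window stays open.

    `read_line` is a hypothetical hook (not in the original script) that
    lets the blocking read be swapped out; it defaults to sys.stdin.readline.
    """
    if read_line is None:
        read_line = sys.stdin.readline
    sys.stdout.write("Press Enter to exit...")
    sys.stdout.flush()  # make sure the prompt is visible before blocking
    read_line()         # blocks until the user presses Enter
```

Calling `pause()` as the very last statement of the script, after the `for` loop, should keep the window open until Enter is pressed. Running the script from an already open console avoids the problem altogether.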


ccandillo 0 Newbie Poster · Answer 1 · 2008-11-12T19:54:12+00:00

My bad. I was running the code from IDLE and kept getting a 'RuntimeError: maximum recursion depth exceeded' error message. I'm not quite sure why, but it works from the console. Thanks!
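The thread never establishes why IDLE hit the recursion limit, so the following is only a diagnostic sketch, not a confirmed fix. It assumes the interpreter's recursion limit is the thing being exceeded, and shows how that limit can be inspected and raised temporarily with the standard `sys.getrecursionlimit()` and `sys.setrecursionlimit()` calls. The `recursion_limit` context manager and the `depth` function are hypothetical names used only for this illustration.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily change the interpreter's recursion limit (hypothetical helper)."""
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        yield
    finally:
        # Always restore the previous limit, even if the body raises.
        sys.setrecursionlimit(old)


def depth(n):
    """Deliberately recursive stand-in for deeply nested parsing work."""
    return 0 if n == 0 else 1 + depth(n - 1)
```

Wrapping the parsing step in `with recursion_limit(5000):` would show whether a higher limit makes the error go away under IDLE. If it does not, the cause is something else, and running from the console, as above, remains the simplest route.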
