screen scraping

Thread Solved
ccandillo
  #1  
Nov 11th, 2008
I am trying to run the following screen scraping script but it's not displaying any output. Can someone tell me what I'm doing wrong?

    from BeautifulSoup import BeautifulSoup
    import urllib

    url = 'http://toronto.en.craigslist.ca/search/cta?query=civic&minAsk=min&maxAsk=max'

    # Fetch the search results page and parse it
    doc = urllib.urlopen(url).read()
    soup = BeautifulSoup(doc)

    # Each <p> is treated as one listing: date text first, then the ad link
    tags = soup.findAll('p')
    for tag in tags:
        addate = tag.contents[0]
        path = tag.contents[1].attrs[0][1]
        desc = tag.next.next.string
        print addate, path, desc
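The loop above depends on each paragraph having a link at exactly `contents[1]`. A less position-dependent way to pull out the same three fields can be sketched as follows. This is only an illustration in Python 3 that swaps BeautifulSoup for the standard library's `html.parser`; `SAMPLE_HTML`, `ListingParser`, and `parse_listings` are hypothetical names, and the sample markup is an assumption shaped after the output quoted in this thread, not the real page source.

```python
from html.parser import HTMLParser

# Hypothetical markup modelled on the output quoted in the thread;
# the real page structure is not shown there.
SAMPLE_HTML = """
<p> Nov 12 - <a href="/tor/cto/915709511.html">FS: 2004 Honda Civic Si Low Km - $12500 -</a></p>
<p> Nov 11 - <a href="/yrk/cto/915504337.html">95 civic ex coupe -</a></p>
<p>No link in this paragraph</p>
"""


class ListingParser(HTMLParser):
    """Collect (date_text, href, link_text) for each <p> that contains a link."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_p = False
        self._in_a = False
        self._date = []
        self._href = None
        self._desc = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # Start a fresh record for every paragraph
            self._in_p, self._date, self._href, self._desc = True, [], None, []
        elif tag == "a" and self._in_p and self._href is None:
            self._href = dict(attrs).get("href")
            self._in_a = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_a = False
        elif tag == "p" and self._in_p:
            if self._href is not None:  # skip paragraphs without a link
                self.rows.append(("".join(self._date).strip(),
                                  self._href,
                                  "".join(self._desc).strip()))
            self._in_p = False

    def handle_data(self, data):
        if self._in_a:
            self._desc.append(data)       # text inside the link
        elif self._in_p and self._href is None:
            self._date.append(data)       # text before the link


def parse_listings(html):
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling `parse_listings(SAMPLE_HTML)` returns one tuple per paragraph that has a link and silently skips the others, instead of raising an error on them.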
jlm699
  #2  
Nov 12th, 2008
I ran the code unmodified and got this:
Output:
Nov 12 - /tor/cto/915709511.html FS: 2004 Honda Civic Si Low Km - $12500 -
Nov 12 - /tor/cto/915669421.html FS; 1993 HONDA CIVIC CX HATCHBACK (EG) - $850 -
Nov 12 - /tor/cto/915654012.html FS: 1997 HONDA CIVIC CX HATCHBACK - $1500 -
Nov 11 - /yrk/cto/915504337.html 95 civic ex coupe -
Nov 11 - /mss/cto/915500509.html 997 HONDA CIVIC -
Nov 11 - /tor/cto/915425141.html 2006 honda civic DX-g - $15000 -
Nov 11 - /yrk/cto/915372101.html 1999 HONDA CIVIC EX, 4 DOOR, AUTO, $4000 !! - $4000 -
... (Continues for 142 lines)

How are you running the code? If you're just double-clicking the .py file, perhaps the console window is closing before you can read the output...
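If the console window really is closing before the output can be read, one common workaround is to pause at the end of the script. A minimal sketch in Python 3 (the code in this thread is Python 2, where `raw_input` would take the place of `input`); `pause_if_interactive` is a hypothetical helper name, not something from the thread:

```python
import sys


def pause_if_interactive(prompt="Press Enter to exit...",
                         stdin=sys.stdin, reader=input):
    """Wait for Enter only when stdin is a terminal. Returns True if it paused."""
    # stdin can be None (for example when launched without a console),
    # and it is not a terminal when input is piped; do not block in either case.
    if stdin is None or not stdin.isatty():
        return False
    reader(prompt)
    return True


# Typical use: call it as the last statement of the script, after all
# output has been printed.
#     pause_if_interactive()
```

Running the script from an already open command prompt avoids the problem without any code change, since that window stays open after the script exits.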
ccandillo
  #3  
Nov 12th, 2008
My bad. I was running the code from IDLE and kept getting a 'RuntimeError: maximum recursion depth exceeded' error message. I'm not sure why, but it works from the console. Thanks!
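The thread does not establish why IDLE hit the recursion limit. As a diagnostic only, the interpreter's limit can be inspected and temporarily raised with `sys.getrecursionlimit` and `sys.setrecursionlimit`. A minimal Python 3 sketch; `recursion_limit` and `depth` are hypothetical names, and raising the limit is an assumption about what might help, not a confirmed fix for the error above:

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily set the interpreter's recursion limit, then restore it."""
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        yield
    finally:
        sys.setrecursionlimit(old)


def depth(n):
    """Toy recursive function used only to exercise the limit."""
    return 0 if n == 0 else 1 + depth(n - 1)


# Typical use: wrap only the call that recurses deeply.
#     with recursion_limit(10000):
#         depth(3000)
```

If the error disappears only under a higher limit, that suggests deep but finite recursion; if it persists at any limit, the recursion is probably unbounded and the limit is not the real problem.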