screen scraping
I am trying to run the following screen scraping script but it's not displaying any output. Can someone tell me what I'm doing wrong?
    from BeautifulSoup import BeautifulSoup
    import urllib

    url = 'http://toronto.en.craigslist.ca/search/cta?query=civic&minAsk=min&maxAsk=max'
    doc = urllib.urlopen(url).read()
    soup = BeautifulSoup(doc)
    tags = soup.findAll('p')
    for tag in tags:
        addate = tag.contents[0]
        path = tag.contents[1].attrs[0][1]
        desc = tag.next.next.string
        print addate, path, desc
I ran the code unmodified and got the output quoted below.
How are you running the code? If you're just double-clicking the .py file, the console may be closing before you can see the output...
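If the double-click theory is right, one way to check is to make the script wait for a keypress before it exits. This is only a rough sketch in Python 3 syntax (on Python 2, as in the question, `raw_input` would take the place of `input`). The names `run_and_wait`, `main` and `wait` are made up for illustration and are not part of the original script.

```python
import traceback


def run_and_wait(main, wait=input):
    """Run main(), then block until Enter so the console window stays open.

    `main` is a hypothetical function holding the body of the script.
    `wait` defaults to the built-in input(); it is a parameter only so the
    pause can be replaced when no keyboard is attached.
    """
    try:
        main()
    except Exception:
        # Print the error before pausing, otherwise the window would
        # close before the traceback could be read.
        traceback.print_exc()
    wait("Press Enter to close this window...")


# Usage (commented out so nothing blocks on import):
# run_and_wait(scrape)   # `scrape` would wrap the code from the question
```

Running the script from an already open command prompt (`python script.py`) avoids the problem without any code changes.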
Output:
Nov 12 - /tor/cto/915709511.html FS: 2004 Honda Civic Si Low Km - $12500 -
Nov 12 - /tor/cto/915669421.html FS; 1993 HONDA CIVIC CX HATCHBACK (EG) - $850
-
Nov 12 - /tor/cto/915654012.html FS: 1997 HONDA CIVIC CX HATCHBACK - $1500 -
Nov 11 - /yrk/cto/915504337.html 95 civic ex coupe -
Nov 11 - /mss/cto/915500509.html 997 HONDA CIVIC -
Nov 11 - /tor/cto/915425141.html 2006 honda civic DX-g - $15000 -
Nov 11 - /yrk/cto/915372101.html 1999 HONDA CIVIC EX, 4 DOOR, AUTO, $4000 !! -
$4000 -
... (Continues for 142 lines)
Last edited by jlm699; Nov 12th, 2008 at 7:01 am.
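For anyone trying this on Python 3, the script in the question will not run as posted: `print` is a function there and `urllib.urlopen` has moved to `urllib.request.urlopen`. Below is a rough sketch of the same extraction (date text, link path, link text for each `<p>`) using the standard library's `html.parser` instead of BeautifulSoup, so it is a swapped-in technique rather than the original method. It parses a small sample string instead of the live page; the markup in `SAMPLE` is an assumption reconstructed from the output above, and `ListingParser` and `parse_listings` are illustrative names.

```python
from html.parser import HTMLParser

# Assumed markup: one <p> per listing, a date text node followed by a link.
SAMPLE = """
<p> Nov 12 - <a href="/tor/cto/915709511.html">FS: 2004 Honda Civic Si Low Km - $12500 -</a></p>
<p> Nov 11 - <a href="/yrk/cto/915504337.html">95 civic ex coupe -</a></p>
"""


class ListingParser(HTMLParser):
    """Collect (date, path, description) for each <p> that contains a link."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_p = False
        self._in_a = False
        self._date = ""
        self._path = None
        self._desc = ""

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # Start a fresh record for this paragraph.
            self._in_p, self._date, self._path, self._desc = True, "", None, ""
        elif tag == "a" and self._in_p and self._path is None:
            # Only the first link in the paragraph is recorded.
            self._in_a = True
            self._path = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_a = False
        elif tag == "p" and self._in_p:
            if self._path is not None:
                self.rows.append(
                    (self._date.strip(), self._path, self._desc.strip())
                )
            self._in_p = False

    def handle_data(self, data):
        if self._in_a:
            self._desc += data          # text inside the link
        elif self._in_p and self._path is None:
            self._date += data          # text before the link


def parse_listings(html):
    """Return a list of (date, path, description) tuples found in html."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.rows


for date, path, desc in parse_listings(SAMPLE):
    print(date, path, desc)
```

Paragraphs without a link are skipped rather than raising an error, which differs from the original loop, where `tag.contents[1]` would fail on such a tag.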