urllib2 problem

Post #1 by leegeorg07, Sep 21st, 2009
Hi, I have this code:

    import urllib2 as url
    import webbrowser

    def extract(text, sub1, sub2):
        """
        extract a substring from text between first
        occurrences of substrings sub1 and sub2
        """
        return text.split(sub1, 1)[-1].split(sub2, 1)[0]
    start = "http://xkcd.com/"
    permlist = []
    textlist = []
    for i in range(1, 638):
        temp = start + str(i)
        permlist.append(str(url.urlopen(temp).readlines()[88]))
        textlist.append(str(url.urlopen(temp).readlines()[77]))

    for i in permlist:
        i = extract(i, '<h3>Permanent link to this comic: ', '</h3>')

    for i in textlist:
        i = extract(i, '<img src="http://imgs.xkcd.com/comics/scribblenauts.png" title="', '"')

    print zip(permlist, textlist)

and whenever I run it, it raises this error:
    Traceback (most recent call last):
      File "C:/Python26/test.py", line 15, in <module>
        permlist.append(str(url.urlopen(temp).readlines()[88]))
      File "C:\Python26\lib\urllib2.py", line 124, in urlopen
        return _opener.open(url, data, timeout)
      File "C:\Python26\lib\urllib2.py", line 389, in open
        response = meth(req, response)
      File "C:\Python26\lib\urllib2.py", line 502, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python26\lib\urllib2.py", line 427, in error
        return self._call_chain(*args)
      File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
        result = func(*args)
      File "C:\Python26\lib\urllib2.py", line 510, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    HTTPError: HTTP Error 404: Not Found

What is the problem, but mainly what can I do to fix it?

thanks in advance
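
For readers following along, here is a minimal sketch of what the `extract` helper in the question returns. The two sample HTML lines are invented for illustration (they are not copied from xkcd.com), and the snippet is written for Python 3, whereas the thread itself uses Python 2.

```python
def extract(text, sub1, sub2):
    """Return the part of text between the first sub1 and the next sub2."""
    return text.split(sub1, 1)[-1].split(sub2, 1)[0]

# Hypothetical sample lines, made up for this example.
link_line = '<h3>Permanent link to this comic: http://xkcd.com/1/</h3>'
img_line = '<img src="http://example.com/a.png" title="hover text" alt="A" />'

permalink = extract(link_line, '<h3>Permanent link to this comic: ', '</h3>')
title = extract(img_line, 'title="', '"')

print(permalink)  # http://xkcd.com/1/
print(title)      # hover text
```

One thing worth noticing: when `sub1` does not occur in the text, `split` returns the whole string, so the helper quietly returns everything up to `sub2` instead of failing. A start marker that hard-codes one comic's image file name would therefore not match the lines of other comics.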
don't judge me because I'm a year 8!

'it is better to fight for something than to live for nothing' General George S Patton

Post #2 by sneekula, Sep 21st, 2009
Looks like one of the requested web pages (comics 1 to 637) is not available. You should wrap the download in a try/except block to handle this case.
No one died when Clinton lied.

Post #3 by leegeorg07, Sep 21st, 2009
So what could I use?

Sorry, at the moment I just want a quick fix; I will figure out the best way when I have time.

Post #4 by djidjadji, Sep 21st, 2009
    for i in range(1, 638):
        try:
            temp=start+str(i)
            permlist.append(str(url.urlopen(temp).readlines()[88]))
            textlist.append(str(url.urlopen(temp).readlines()[77]))
        except Error: # catch any exception and continue the for loop
            print "Error at index %d."%i

Post #5 by ov3rcl0ck, Sep 22nd, 2009
Yeah, you'll need to use exceptions. Once the exception is caught, the loop simply carries on with the next index (the trailing "pass" is optional when the except block already contains a statement). Try this:

    for i in range(1, 638):
        try:
            temp=start+str(i)
            permlist.append(str(url.urlopen(temp).readlines()[88]))
            textlist.append(str(url.urlopen(temp).readlines()[77]))
        except Error, err:
            print "Index Error: %s at %d" % (err, i)
            pass

This prints the error and the index where it happened, and the loop keeps going.
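
A minimal sketch of that behaviour, written for Python 3 rather than the Python 2 used in this thread. The `risky` function is a hypothetical stand-in for the download and fails for one index on purpose, so the example runs without a network connection.

```python
def risky(i):
    """Hypothetical stand-in for a download that fails for one index."""
    if i == 3:
        raise ValueError("no page for index %d" % i)
    return "page %d" % i

pages = []
failed = []
for i in range(1, 6):
    try:
        pages.append(risky(i))
    except ValueError as err:
        # Handling the exception is enough; no "pass" is needed here.
        failed.append(i)
        print("Error at index %d: %s" % (i, err))

print(pages)   # ['page 1', 'page 2', 'page 4', 'page 5']
print(failed)  # [3]
```

The loop reaches indexes 4 and 5 even though index 3 raised, because the exception was handled inside the loop body.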

Post #6 by leegeorg07, Sep 22nd, 2009
Hey again. Those are good ideas, but whenever I try to run it, it says:

    Traceback (most recent call last):
      File "C:\Python26\test.py", line 18, in <module>
        except Error, err:
    NameError: name 'Error' is not defined

Post #7 by vegaseat (Moderator), Sep 22nd, 2009
Since you don't know the specific error class, simply use ...
    for i in range(1, 638):
        try:
            temp=start+str(i)
            permlist.append(str(url.urlopen(temp).readlines()[88]))
            textlist.append(str(url.urlopen(temp).readlines()[77]))
        except: # catch any exception and continue the for loop
            print "Error at index %d."%i
            pass
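
A bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, so Ctrl-C may not stop a long download loop. As a hedged alternative, the sketch below catches `Exception` and reports the class name of whatever was raised, which is one way to discover the specific class. It is written for Python 3, and `find_error_class` is a hypothetical helper name, not something from the thread.

```python
def find_error_class(func, *args):
    """Call func(*args) and return the class name of any exception raised."""
    try:
        func(*args)
    except Exception as err:
        # type(err).__name__ is the name you would put after "except".
        return type(err).__name__
    return None

print(find_error_class(int, "not a number"))        # ValueError
print(find_error_class([1, 2, 3].__getitem__, 88))  # IndexError
print(find_error_class(int, "42"))                  # None
```

The second call mirrors the `readlines()[88]` indexing in the question: a page with fewer than 89 lines would raise `IndexError`, which is a different failure from the HTTP one.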
May 'the Google' be with you!

Post #8 by leegeorg07, Sep 22nd, 2009
OK thanks, trying it now. So that I can do better error handling later, how can I find the exception class?

Post #9 by vegaseat (Moderator), Sep 22nd, 2009
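Well, you found it in your first post: the last line of the traceback names it.
HTTPError (defined in urllib2, so with your import it is url.HTTPError)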
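
A minimal sketch of catching that class specifically, assuming Python 3, where it is `urllib.error.HTTPError` (in the Python 2 code of this thread it would be `urllib2.HTTPError`). The names `fetch_lines` and `fake_opener` are hypothetical. `fake_opener` replaces the real network call so the example runs offline, and it assumes that address 404 is the missing one; xkcd is widely reported to have no comic number 404, which would fit the traceback, but treat that as an assumption.

```python
import io
from urllib.error import HTTPError
from urllib.request import urlopen

START = "http://xkcd.com/"

def fetch_lines(number, opener=urlopen):
    """Return the page's lines, or None if the server answers with an HTTP error."""
    try:
        # Download once and reuse the lines, instead of one request per index.
        response = opener(START + str(number))
        return response.readlines()
    except HTTPError as err:
        print("Skipping %d: HTTP Error %d" % (number, err.code))
        return None

def fake_opener(address):
    """Hypothetical offline stand-in for urlopen, used only in this demo."""
    if address.endswith("/404"):
        raise HTTPError(address, 404, "Not Found", None, io.BytesIO(b""))
    return io.BytesIO(b"line one\nline two\n")

for n in (403, 404, 405):
    lines = fetch_lines(n, opener=fake_opener)
    if lines is not None:
        print(n, len(lines))
```

Calling `fetch_lines(n)` without the `opener` argument would use the real `urlopen`. Because the lines are fetched once per page, both `lines[88]` and `lines[77]` could be read from the same result.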

Post #10 by leegeorg07, Sep 23rd, 2009
Oh OK, thanks. Now, whenever I run the zip part it prints the original text, not what I changed it to. I tried:

    for i, j in permlist, textlist:
        print i, ':', j

but it says that it is out of range. What can I do? I have googled it to no avail.
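
A possible explanation, sketched under assumptions: assigning to the loop variable (`i = extract(i, ...)`) only rebinds the name `i` and never changes the list, and `for i, j in permlist, textlist` loops over the two lists themselves and tries to unpack each whole list into `i, j`. Building new lists and pairing them with `zip` avoids both problems. The sample strings below are invented, and the code is Python 3 rather than the thread's Python 2.

```python
def extract(text, sub1, sub2):
    return text.split(sub1, 1)[-1].split(sub2, 1)[0]

# Hypothetical stand-ins for the downloaded lines.
permlist = ['<h3>Permanent link to this comic: http://xkcd.com/1/</h3>',
            '<h3>Permanent link to this comic: http://xkcd.com/2/</h3>']
textlist = ['<img src="a.png" title="first" alt="A" />',
            '<img src="b.png" title="second" alt="B" />']

# Rebinding the loop variable leaves the list untouched ...
for i in permlist:
    i = extract(i, '<h3>Permanent link to this comic: ', '</h3>')
unchanged = permlist[0].startswith('<h3>')  # True: the list still holds raw HTML

# ... so build new lists from the extracted values instead.
permlist = [extract(p, '<h3>Permanent link to this comic: ', '</h3>')
            for p in permlist]
textlist = [extract(t, 'title="', '"') for t in textlist]

# zip pairs the items element by element.
for link, text in zip(permlist, textlist):
    print(link, ':', text)
```

This prints one "link : title" pair per comic. Using the generic `'title="'` start marker here is a choice made for the sketch; whether it fits the real page layout is not confirmed by the thread.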

©2003 - 2009 DaniWeb® LLC