Sep 21st, 2009

urllib2 problem

Hi, I have this code:

    import urllib2 as url
    import webbrowser

    def extract(text, sub1, sub2):
        """
        extract a substring from text between first
        occurrences of substrings sub1 and sub2
        """
        return text.split(sub1, 1)[-1].split(sub2, 1)[0]
    start="http://xkcd.com/"
    permlist=[]
    textlist=[]
    for i in range(1, 638):
        temp=start+str(i)
        permlist.append(str(url.urlopen(temp).readlines()[88]))
        textlist.append(str(url.urlopen(temp).readlines()[77]))

    for i in permlist:
        i = extract(i, '<h3>Permanent link to this comic: ', '</h3>')

    for i in textlist:
        i = extract(i, '<img src="http://imgs.xkcd.com/comics/scribblenauts.png" title="', '"')


    print zip(permlist, textlist)

and whenever I run it, it raises this error:

    Traceback (most recent call last):
      File "C:/Python26/test.py", line 15, in <module>
        permlist.append(str(url.urlopen(temp).readlines()[88]))
      File "C:\Python26\lib\urllib2.py", line 124, in urlopen
        return _opener.open(url, data, timeout)
      File "C:\Python26\lib\urllib2.py", line 389, in open
        response = meth(req, response)
      File "C:\Python26\lib\urllib2.py", line 502, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Python26\lib\urllib2.py", line 427, in error
        return self._call_chain(*args)
      File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
        result = func(*args)
      File "C:\Python26\lib\urllib2.py", line 510, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    HTTPError: HTTP Error 404: Not Found

What is the problem, but mainly what can I do to fix it?

thanks in advance
leegeorg07

Sep 21st, 2009

Re: urllib2 problem

Looks like one of the 637 web pages requested by range(1, 638) is not available, hence the 404. You should wrap the request in a try/except block for this case.
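
A minimal sketch of that try/except idea, written in Python 3, where urlopen and HTTPError live in urllib.request and urllib.error (in Python 2.6 the same names are urllib2.urlopen and urllib2.HTTPError). The helper name fetch_lines and its opener parameter are illustrative assumptions, not part of the original script; the opener is injectable only so the logic can be tried without network access.

```python
import urllib.error
import urllib.request


def fetch_lines(urls, opener=urllib.request.urlopen):
    """Return {url: list of lines} for every URL that could be fetched.

    URLs that answer with an HTTP error (such as 404) are reported
    and skipped, so one missing page does not abort the whole loop.
    """
    pages = {}
    for address in urls:
        try:
            response = opener(address)
            pages[address] = response.readlines()
        except urllib.error.HTTPError as err:
            print("Skipping %s: HTTP %d" % (address, err.code))
    return pages
```

With the thread's script, urls would be something like [start + str(i) for i in range(1, 638)]. Reading each page once and indexing the stored lines afterwards also avoids downloading every page twice, as the original loop does.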
sneekula

Sep 21st, 2009

Re: urllib2 problem

So what could I use?

Sorry, at the moment I just want a quick fix and will figure out the best way when I have time.
leegeorg07

Sep 21st, 2009

Re: urllib2 problem

    for i in range(1, 638):
        try:
            temp=start+str(i)
            permlist.append(str(url.urlopen(temp).readlines()[88]))
            textlist.append(str(url.urlopen(temp).readlines()[77]))
        except Error: # catch any exception and continue the for loop
            print "Error at index %d."%i
djidjadji

Sep 22nd, 2009

Re: urllib2 problem

Yeah, you'll need to use exceptions, but if you want the script to continue after the error you're going to have to "pass" it. Try this:

    for i in range(1, 638):
        try:
            temp=start+str(i)
            permlist.append(str(url.urlopen(temp).readlines()[88]))
            textlist.append(str(url.urlopen(temp).readlines()[77]))
        except Error, err:
            print "Index Error: %s at %d" % (err, i)
            pass

This will not only print the error and where it occurred, but will also pass to keep the loop going.
ov3rcl0ck

Sep 22nd, 2009

Re: urllib2 problem

Hey again, those are good ideas, but whenever I try to run it, it says:

    Traceback (most recent call last):
      File "C:\Python26\test.py", line 18, in <module>
        except Error, err:
    NameError: name 'Error' is not defined
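
The NameError comes from the except clause itself: Python has no built-in called Error, so the name lookup fails as soon as an exception reaches that line. A small sketch of the effect, in Python 3 syntax (except X as err replaces the Python 2 form except X, err). The function risky is a hypothetical stand-in for the urlopen call.

```python
def risky():
    """Stand-in for the failing urlopen call (hypothetical helper)."""
    raise ValueError("page not available")


def catch_with_undefined_name():
    """Show that an undefined class in an except clause raises NameError."""
    try:
        try:
            risky()
        except Error:  # 'Error' is not defined anywhere, as in the thread
            return "handled"
    except NameError as err:
        return "NameError: %s" % err


def catch_with_builtin_base():
    """Exception is a real built-in and covers ordinary errors."""
    try:
        risky()
    except Exception as err:
        return "caught %s: %s" % (type(err).__name__, err)


print(catch_with_undefined_name())
print(catch_with_builtin_base())
```

Catching Exception is a reasonable stopgap until the specific class is known; unlike a bare except, it does not also swallow KeyboardInterrupt, so Ctrl+C still stops a long download loop.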
leegeorg07

Sep 22nd, 2009

Re: urllib2 problem

Since you don't know the specific error class, simply use ...

    for i in range(1, 638):
        try:
            temp=start+str(i)
            permlist.append(str(url.urlopen(temp).readlines()[88]))
            textlist.append(str(url.urlopen(temp).readlines()[77]))
        except: # catch any exception and continue the for loop
            print "Error at index %d."%i
            pass
vegaseat (Moderator)

Sep 22nd, 2009

Re: urllib2 problem

OK thanks, trying it now. So that I can do better handling soon, how can I find the class?
leegeorg07

Sep 22nd, 2009

Re: urllib2 problem

Well you found it in your first post ...
HTTPError
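
When the traceback is not at hand, one way to discover the class is to catch Exception once and inspect the caught object. The sketch below is Python 3, where the class is urllib.error.HTTPError (urllib2.HTTPError in Python 2.6). The helper describe_exception is a hypothetical name, and the error is built by hand with a placeholder URL rather than fetched, so the example runs offline.

```python
import io
import urllib.error


def describe_exception(err):
    """Return the module-qualified class name and its base class names."""
    cls = type(err)
    bases = [base.__name__ for base in cls.__mro__[1:]]
    return "%s.%s" % (cls.__module__, cls.__name__), bases


try:
    # Hand-built 404 so no network access is needed.
    raise urllib.error.HTTPError(
        "http://example.com/missing", 404, "Not Found", None, io.BytesIO(b""))
except Exception as err:
    name, bases = describe_exception(err)
    print(name)
    print("URLError" in bases)
```

Once the class is known, the bare except in the Python 2.6 script can be narrowed to except url.HTTPError (given the import urllib2 as url alias), so that unrelated bugs, such as an IndexError from readlines()[88] on a short page, are not silently swallowed.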
vegaseat (Moderator)

Sep 23rd, 2009

Re: urllib2 problem

Oh OK, thanks. Whenever I run the zip part it uses the original text, not what I changed it to. I tried:

    for i, j in permlist, textlist:
        print i, ':', j

but it says that it is out of range. What can I do? I have googled it to no avail.
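
Two separate things are probably going on here, sketched below under stated assumptions. First, assigning to the loop variable (i = extract(...)) only rebinds the name i; the list still holds the original strings, so the cleaned text has to be collected into a new list. Second, for i, j in permlist, textlist loops over a two-item tuple of lists and tries to unpack each whole list into i and j, which fails unless that list happens to have exactly two items; zip pairs the items up instead. The sample strings are made up for illustration, and the code uses Python 3 syntax (print is a function).

```python
def extract(text, sub1, sub2):
    """Return the part of text between the first sub1 and the next sub2."""
    return text.split(sub1, 1)[-1].split(sub2, 1)[0]


# Made-up sample data standing in for the scraped lines.
permlist = ["<h3>Permanent link to this comic: http://xkcd.com/1/</h3>",
            "<h3>Permanent link to this comic: http://xkcd.com/2/</h3>"]
textlist = ['<img src="a.png" title="first" alt="">',
            '<img src="b.png" title="second" alt="">']

# Rebinding the loop variable leaves the list untouched ...
for item in permlist:
    item = extract(item, "<h3>Permanent link to this comic: ", "</h3>")
unchanged = permlist[0].startswith("<h3>")

# ... so build new lists from the extracted values instead.
links = [extract(p, "<h3>Permanent link to this comic: ", "</h3>")
         for p in permlist]
titles = [extract(t, 'title="', '"') for t in textlist]

# zip yields one (link, title) pair per comic.
for link, title in zip(links, titles):
    print(link, ":", title)
```

Splitting on title=" rather than on a complete img tag is an assumption about the page markup. The pattern in the original script names scribblenauts.png, so it could only ever match that one comic.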
leegeorg07

This thread is solved
