
How to set a timeout for reading from URLs in urllib

dilipkk

Hi,

I am downloading URLs using urllib2. The problem I am facing is that sometimes the server goes down and then read() blocks indefinitely. I don't want that; I want to raise an exception after 20 seconds in this case. There is a solution using signal.alarm, but it works only in one thread, so I am looking for a solution that does not use signal.alarm.
How do I do this?

I tried setting the default socket timeout with socket.setdefaulttimeout(20), but it only applies to opening the URL. It won't raise an exception while reading.

Thanks,
Dilip Kola

Gribouillis

This source code seems to contain other solutions: http://code.google.com/p/timeout-urllib2/source/browse/trunk/timeout_urllib2.py

Also, the urllib module in Python 3.0 (and maybe 2.6?) seems to accept a timeout parameter for urlopen.
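
For readers on Python 3, the timeout parameter mentioned above can be sketched as follows. This is a minimal illustration, not code from the thread: `fetch` is a hypothetical helper name, it uses `urllib.request` (the Python 3 successor of urllib2), and the timeout bounds each blocking socket operation rather than the download as a whole.

```python
import socket
import urllib.error
import urllib.request


def fetch(url, timeout=20.0):
    """Return the body of `url`, or None if a socket operation times out.

    `fetch` is a hypothetical helper. The timeout applies to each blocking
    socket operation (connect, each receive), not to the whole download.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except socket.timeout:
        # Raised directly when waiting for the response or the body times out.
        return None
    except urllib.error.URLError as exc:
        # A timeout while connecting or sending is wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return None
        raise
```

Because the limit is per socket operation, a server that keeps sending a few bytes at a time can still keep read() busy for much longer than the timeout value.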

dilipkk

Thanks for the reply!

We don't need Python 2.6 or 3.0:
import socket
socket.setdefaulttimeout(5)
import urllib2

Now when we call urlopen, it raises a timeout exception after 5 seconds. The problem is read() (which downloads the content): it won't stop after 5 seconds :( . signal.alarm works if we do the downloads sequentially, but I want to parallelize them. So how do I do this? Can we use signal.alarm in multiple threads?

Thanks
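
One way to cap the total download time without signal.alarm is to read in chunks and check a deadline between reads, which behaves the same in any thread. The sketch below is an illustration in Python 3 under stated assumptions, not code from the thread: `read_with_deadline` and `DownloadTimeout` are hypothetical names, and `response` is anything with a `read(n)` method, such as the object returned by urlopen. The deadline is only checked between read() calls, so it complements a socket-level timeout rather than replacing it.

```python
import time


class DownloadTimeout(Exception):
    """Raised when the whole download exceeds its time budget."""


def read_with_deadline(response, total_timeout=20.0, chunk_size=8192):
    """Read `response` in chunks, giving up once `total_timeout` seconds pass."""
    deadline = time.monotonic() + total_timeout
    chunks = []
    while True:
        # Check the overall budget before asking for more data.
        if time.monotonic() > deadline:
            raise DownloadTimeout("download exceeded %.1f s" % total_timeout)
        chunk = response.read(chunk_size)
        if not chunk:
            break  # an empty read means end of stream
        chunks.append(chunk)
    return b"".join(chunks)
```

Since no signals are involved, each worker thread can call this on its own response object with its own deadline.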

slate

Signals only work in the main thread of the process.

Basically, threading is not for "mortals", not even in Python. But you can be a hero :) I am not one.
For instance, killing a thread from the main thread is generally not possible. It is discouraged in Java, too.

import os
import threading
import time

websitelist = ["slashdot.org", "daniweb.com", "http://ubuntu.supp.name/releases/intrepid/ubuntu-8.10-desktop-i386.iso"]
threadlist = []
downloadbegin = {}  # start time of each download, keyed by site

def download(site):
    downloadbegin[site] = time.time()
    os.system('wget %s' % site)  # testing; download here using urllib instead

for site in websitelist:
    threadlist.append(threading.Thread(target=download, args=(site,)))

for thread in threadlist:
    thread.start()

for thread in threadlist:
    # you cannot terminate the thread safely here based on downloadbegin
    thread.join()

I would use subprocess and signal.
Killing a Python interpreter subprocess with os.kill (or its Windows counterpart) does not seem too hard.
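
The subprocess idea can be sketched in Python 3 with the timeout argument of subprocess.run, which kills the child when the time limit expires. This is an illustration rather than the code proposed above: `run_with_timeout` is a hypothetical helper, and the command to run (for instance a wget call or a small Python download script) is left to the caller.

```python
import subprocess
import sys


def run_with_timeout(args, timeout):
    """Run a command; return its exit code, or None if it was killed on timeout."""
    try:
        completed = subprocess.run(args, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before re-raising.
        return None
    return completed.returncode


if __name__ == "__main__":
    # A child that would sleep for 30 seconds is stopped after about 1 second.
    slow_child = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(slow_child, timeout=1.0))
```

Each download runs in its own process, so several can be started in parallel and a stuck one can be killed without affecting the others.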

Gribouillis

Couldn't you try to use a timer to close the object returned by urlopen after a certain time? For example:

from urllib2 import urlopen
from threading import Timer
url = "http://www.python.org"
fh = urlopen(url)
t = Timer(20.0, fh.close)
t.start()
data = fh.read()

I suppose that fh.read() should raise an exception when fh.close is called by the timer. I can't test it because I don't know a site slow enough.

dilipkk

Thanks for the solution!
The solution you gave closes fh immediately, so I changed it to the code below, and it is working :).

from urllib2 import urlopen
from threading import Timer
url = "http://www.python.org"
def handler(fh):
    fh.close()
fh = urlopen(url)
t = Timer(20.0, handler,[fh])
t.start()
data = fh.read()
t.cancel()
Thank you, Gribouillis.

Question Answered as of 5 Years Ago by slate and Gribouillis