I am downloading a URL using urllib2. The problem I am facing is that sometimes the server goes down, and then read() blocks indefinitely. I don't want that; I want to raise an exception after 20 seconds in this case. There is a solution using signal.alarm, but it works only in one thread, so I am looking for a solution that does not use signal.alarm.
How do I do this?

I tried setting the default socket timeout with socket.setdefaulttimeout(20), but it only applies to opening the URL. It won't raise an exception while reading.

Dilip Kola

Recommended Answers

This source code seems to contain other solutions: http://code.google.com/p/timeout-urllib2/source/browse/trunk/timeout_urllib2.py

Also, urlopen in the urllib module of Python 3.0 (and maybe 2.6?) seems to accept a timeout parameter.
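To make the timeout parameter concrete, here is a minimal sketch. It assumes Python 3, where urlopen lives in urllib.request rather than urllib2, and the helper name fetch is made up for illustration. Note that the timeout applies to each blocking socket operation, not to the download as a whole, so a server that keeps trickling data can still take longer than 20 seconds in total.

```python
import socket
import urllib.error
import urllib.request


def fetch(url, timeout=20.0):
    """Return the body of url, or None if a socket operation times out.

    'fetch' is a hypothetical helper name, not part of any library.
    """
    try:
        # timeout is applied to the connect and to every later recv()
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except socket.timeout:
        # raised directly when waiting for the response or reading it stalls
        return None
    except urllib.error.URLError as err:
        # a timeout while connecting or sending arrives wrapped in URLError
        if isinstance(err.reason, socket.timeout):
            return None
        raise
```

Something like data = fetch("http://www.python.org") would then return None instead of hanging when the server stops responding.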


All 6 Replies

Thanks for the reply!!

We don't need Python 2.6 or 3.0!!
import socket
import urllib2

socket.setdefaulttimeout(5)  # default timeout for every socket created from now on

Now when we call urlopen, it will raise a timeout exception after 5 seconds. But the problem is with read() (which downloads the content): it won't stop after 5 seconds :( . signal.alarm works if we do the downloads sequentially, but I want to parallelize the downloading. So how do I do this? Can we use signal.alarm in multiple threads?


signal only works in the main thread of the process.

Basically, threading is not for "mortals", not even in Python. But you can be a hero :) I am not one.
For instance, killing a thread from the main thread is generally not possible. It is discouraged in Java, too.

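import threading
import os
import time

websitelist = ["http://www.python.org"]  # placeholder list of URLs to download

def download(site):
    os.system('wget %s' % site)  # testing
    # download here using urllib

threadlist = []
for site in websitelist:
    threadlist.append(threading.Thread(target=download, args=(site,)))

downloadbegin = time.time()
for thread in threadlist:
    thread.start()

for thread in threadlist:
    # you cannot terminate the thread safely here based on downloadbegin;
    # all you can do is wait for it to finish
    thread.join()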

I would use subprocess and signal.
Killing a Python interpreter subprocess with os.kill (or the Windows equivalent) does not seem so hard.
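The subprocess idea could look roughly like the sketch below. It assumes Python 3.5 or later, where subprocess.run accepts a timeout and kills the child itself when the timeout expires, so no explicit os.kill call is needed. The helper name run_with_deadline is invented for this example.

```python
import subprocess
import sys


def run_with_deadline(code, seconds):
    """Run Python source 'code' in a child interpreter.

    Returns the child's stdout as bytes, or None if it ran longer than
    'seconds'. 'run_with_deadline' is a hypothetical helper name.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            stdout=subprocess.PIPE,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child here
        return None
    return completed.stdout
```

Since every download then runs in its own process, several such calls can be made from separate threads, and a stuck download only costs one child process that gets killed at the deadline.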

Couldn't you try to use a timer to close the object returned by urlopen after a certain time? For example:

from urllib2 import urlopen
from threading import Timer
url = "http://www.python.org"
fh = urlopen(url)
t = Timer(20.0, fh.close)  # call fh.close() in another thread after 20 seconds
t.start()                  # the timer does nothing until it is started
data = fh.read()

I suppose that fh.read() should raise an exception when fh.close is called by the timer. I can't test it because I don't know a site that is slow enough.

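Thanks for the solution!!
The solution you gave was closing fh immediately, so I changed it to the code below, and it is working :).

from urllib2 import urlopen
from threading import Timer

url = "http://www.python.org"

def handler(fh):
    fh.close()  # close the response so that a blocked read() stops

fh = urlopen(url)
t = Timer(20.0, handler, [fh])
t.start()   # start the 20 second countdown
data = fh.read()
t.cancel()  # the download finished in time, so the timer is no longer needed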
Thank you, Gribouillis.
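For anyone who would rather not close the handle from another thread, one possible variation is to read in chunks and check a deadline between chunks. This is only a sketch: the names read_with_deadline and DownloadTimeout are invented here, and it assumes Python 3 and a socket timeout already set (for example with socket.setdefaulttimeout), because a single read() call can still block until that timeout fires.

```python
import time


class DownloadTimeout(Exception):
    """Raised when the whole download takes longer than allowed."""


def read_with_deadline(fh, seconds, chunk_size=8192):
    """Read a file-like object to the end, giving up after 'seconds' in total.

    The deadline is only checked between chunks, so each individual
    read() still needs its own socket timeout to avoid blocking forever.
    """
    deadline = time.monotonic() + seconds
    chunks = []
    while True:
        if time.monotonic() > deadline:
            raise DownloadTimeout("download exceeded %s seconds" % seconds)
        chunk = fh.read(chunk_size)
        if not chunk:
            break  # an empty result means end of data
        chunks.append(chunk)
    return b"".join(chunks)
```

It would be called as data = read_with_deadline(fh, 20.0) in place of fh.read(). It uses no signals, so it can run in any number of threads at once.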
