
How to set timeout for reading from urls in urllib

 
Hi,

I am downloading URLs using urllib2. The problem I am facing is that sometimes the server goes down, and then read() takes an indefinite amount of time. I don't want that; I want to raise an exception after 20 seconds in this case. There is a solution using signal.alarm, but it works only in the main thread, so I am looking for a solution that does not use signal.alarm.
How do I do this?

I tried setting a default timeout for sockets with socket.setdefaulttimeout(20), but it only applies when opening the URL. It won't raise an exception while reading.

Thanks,
Dilip Kola

 

Thanks for the reply!!

We don't need Python 2.6 or 3.0!!

import socket
socket.setdefaulttimeout(5)
import urllib2

Now when we call urlopen it will raise a timeout exception after 5 seconds. But the problem is with read() (to download the content): it won't stop after 5 seconds :( . signal.alarm works if we do the downloads sequentially, but I want to parallelize the downloading. So how do I do this? Can we use signal.alarm in multiple threads?
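As an aside for readers who can use Python 2.6+ (or 3.x, where urllib2 became urllib.request): urlopen accepts a timeout argument that sets the socket timeout for that one request, and a socket timeout applies to each blocking socket operation, including the recv calls made during read(). A minimal sketch in Python 3 syntax (python.org is just a stand-in URL):

```python
import socket
from urllib.request import urlopen  # urllib2.urlopen on Python 2.6+

def fetch(url, timeout=20):
    """Download url; a socket.timeout is raised if any single socket
    operation (connect or recv) stalls longer than `timeout` seconds.
    Caveat: a server that drips data slowly can still take longer overall."""
    fh = urlopen(url, timeout=timeout)
    try:
        return fh.read()
    finally:
        fh.close()
```

Because the timeout is per socket operation, not per total download, a very slow but steadily responding server can still exceed 20 seconds in total; only a fully stalled connection is guaranteed to raise.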

Thanks

 

Signal only works in the main thread of the process.

Basically, threading is not for "mortals", not even in Python. But you can be a hero :) I am not one.
For instance, killing a thread from the main thread is generally not possible. It is discouraged in Java, too.

import threading
import os
import time

def download(site):
    downloadbegin[site] = time.time()
    os.system('wget %s' % site)  # for testing; do the real download here with urllib

websitelist = ["slashdot.org", "daniweb.com",
               "http://ubuntu.supp.name/releases/intrepid/ubuntu-8.10-desktop-i386.iso"]
threadlist = []
downloadbegin = {}

for site in websitelist:
    threadlist.append(threading.Thread(target=download, args=(site,)))

for thread in threadlist:
    thread.start()

for thread in threadlist:
    # you cannot terminate the thread safely here based on downloadbegin
    thread.join()

I would use subprocess and signal.
Killing a Python interpreter subprocess with os.kill (or its Windows equivalent) does not seem so hard.
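The subprocess-and-kill idea can be sketched like this (Python 3 shown, where Popen.wait accepts a timeout since 3.3; on Python 2 you would poll proc.poll() in a loop and call proc.kill() yourself):

```python
import subprocess

def run_with_deadline(cmd, timeout=20):
    """Run cmd in a child process; kill it if it exceeds `timeout` seconds.
    Returns the exit code, or None if the process was killed."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()   # killing a process is safe, unlike killing a thread
        proc.wait()   # reap the child so it doesn't become a zombie
        return None
```

Each download (e.g. the wget call above) runs in its own process, so the main program can kill a stalled one cleanly, sidestepping the thread-termination problem entirely.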

 

Couldn't you try to use a timer to close the object returned by urlopen after a certain time? For example

from urllib2 import urlopen
from threading import Timer

url = "http://www.python.org"
fh = urlopen(url)
t = Timer(20.0, fh.close)
t.start()
data = fh.read()

I suppose that fh.read() should raise an exception when fh.close is called by the timer. I can't test it because I don't know a site slow enough.

 

Thanks for the solution!!
The solution you gave was closing fh immediately, so I changed it to the code below, and it is working :).

from urllib2 import urlopen
from threading import Timer

url = "http://www.python.org"

def handler(fh):
    fh.close()

fh = urlopen(url)
t = Timer(20.0, handler, [fh])
t.start()
data = fh.read()
t.cancel()

Thank you, Gribouillis.
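The working pattern above can be wrapped into a small reusable helper. A sketch (read_with_timeout is a made-up name, not part of urllib); it works on any file-like object:

```python
from threading import Timer

def read_with_timeout(fh, timeout=20.0):
    """Read all data from a file-like object fh. If read() takes longer
    than `timeout` seconds, close fh from a Timer thread, which should
    make the blocked read() raise an exception."""
    t = Timer(timeout, fh.close)
    t.start()
    try:
        return fh.read()
    finally:
        t.cancel()  # don't close fh after a read that finished in time
```

Usage would then be a one-liner per download, e.g. data = read_with_timeout(urlopen(url), 20.0), which also makes it easy to call from each worker thread.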

Question Answered as of 5 Years Ago by slate and Gribouillis