I'm brand new to Python (and programming) and I'm trying to write a script that will download a .csv from the Internet and then let me plot data from it. I already have the plot working, and I can read the file locally, but I can't figure out how to download it from the script. I've been reading about httplib and urllib2 but I'm confused. Can anyone help?

Also, eventually I'll be downloading from an https link where I'll have to provide credentials. For now, however, I'm not worrying about that part. Just downloading any old file from an http link will get me further than where I'm currently at! Thanks in advance.

I think this is what you're looking for:

urllib.urlretrieve(url_of_file, destination_on_local_filesystem)

Note that this varies with your version of Python. The call above is from Python 2.5. In Python 3 the function moved into urllib.request, so after "import urllib.request" you would call:

urllib.request.urlretrieve(url, local_file_name)
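
For Python 3, the whole download step might look something like the sketch below. The helper name download_csv and the URL in the comment are made up for illustration; only urllib.request.urlretrieve is the real standard-library call.

```python
import urllib.request


def download_csv(url, destination):
    """Download url and save it as the file named by destination.

    destination must be a full file path (for example 'data.csv'),
    not just a drive or directory. Returns the path that was written.
    """
    saved_path, _headers = urllib.request.urlretrieve(url, destination)
    return saved_path


# Example call (hypothetical URL, replace with the real location):
# download_csv("http://example.com/data.csv", "data.csv")
```

The returned path can then be handed to whatever code already reads the local .csv for plotting.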

Thanks for the help...I'm getting the following error:

IOError                                   Traceback (most recent call last)

C:\Python25\lib\urllib.pyc in retrieve(self, url, filename, reporthook, data)
    223         headers = fp.info()
    224         if filename:
--> 225             tfp = open(filename, 'wb')
    226         else:
    227             import tempfile

IOError: [Errno 13] Permission denied: 'c:'

All users and groups have full permissions to the c: drive. Any ideas what the problem might be?

Can you post the relevant bit of code where you call urlretrieve?

import urllib
urllib.urlretrieve('url', 'c:')

That's it.

Let me get this straight... you want to replace your C: drive with a nonexistent file called 'url'? [/snark]

But seriously, urlretrieve needs to know what you want to save that URL as, not just where you want to put it. If you don't supply the filename parameter, it saves the download to a temporary file (you can see the tempfile branch in your traceback). Otherwise it needs the full path: where you want the file as well as what you want to call it.

import urllib
urllib.urlretrieve('url', 'c:\\url')
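
Since urlretrieve does not pick a file name from the URL for you, one way to build the full destination path is sketched below for Python 3. The helper name destination_for and the default name download.csv are assumptions made for this example, not part of urllib.

```python
import os
import posixpath
from urllib.parse import urlsplit


def destination_for(url, directory, default_name="download.csv"):
    """Return the full local file path to use when saving url in directory."""
    # Use the last segment of the URL path as the file name, falling
    # back to default_name when the URL ends in a slash.
    name = posixpath.basename(urlsplit(url).path) or default_name
    # Create the directory if it is missing, so that opening the
    # destination file does not fail for that reason.
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, name)
```

The result can be passed as the second argument to urlretrieve in place of a bare drive letter.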

Thank you...I had tried that but got a message "directory does not exist." I assumed this meant that it would automatically create the filename (using the url). Then I tried two backslashes (as you demonstrated above) and it worked.

There are obviously some basic principles I'm missing. Didn't know anything about the double slash.

Yes, in Python string literals the backslash introduces special characters:
\n is a newline character
\t is a tab
\\ is a backslash.

There are a number of ways to avoid that. One is to use raw strings, in which Python does not treat the backslash as an escape character:

>>> print 'C:\foo\bar\lipsum\delorean'
C:ooar\lipsum\delorean
>>> print 'C:\\foo\\bar\\lipsum\\delorean'
C:\foo\bar\lipsum\delorean
>>> print r'C:\foo\bar\lipsum\delorean'
C:\foo\bar\lipsum\delorean
>>>
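
The same comparison in Python 3 syntax, as a small sketch. The path C:\temp\new is made up, chosen because \t and \n are both real escape sequences.

```python
plain = 'C:\temp\new'        # \t becomes a tab, \n becomes a newline
doubled = 'C:\\temp\\new'    # each \\ is one literal backslash
raw = r'C:\temp\new'         # raw string: backslashes are kept as typed

print(doubled == raw)        # True
print(len(plain), len(raw))  # 9 11
```

The two extra characters in the raw string are the backslashes that the plain literal folded into a tab and a newline.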

Alternatively you could use the forward slash (which is a *nix path convention), like 'C:/foo/bar/lipsum/delorean'. I won't allow myself to use that as it's just too foreign on a Windows machine. I prefer os.path.join, which oddly enough still requires the doubled backslash after the drive letter, but aside from that no slashes are required:

>>> import os
>>> os.path.join('C:\\', 'foo','bar','lipsum','delorean')
'C:\\foo\\bar\\lipsum\\delorean'
>>>
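
Why the drive letter still needs its backslash can be seen in a short sketch. It uses ntpath, the Windows implementation behind os.path, so that it behaves the same on any operating system. On a Windows machine os.path.join gives the same results.

```python
import ntpath  # the Windows flavour of os.path, importable on any OS

path = ntpath.join('C:\\', 'foo', 'bar', 'lipsum', 'delorean')
print(path)      # C:\foo\bar\lipsum\delorean

# Without the separator after the drive letter, the result is relative
# to the current directory on drive C: rather than rooted at C:\
relative = ntpath.join('C:', 'foo')
print(relative)  # C:foo
```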

I see. Thanks a lot, that was very helpful!!

>I won't allow myself to use that as it's just too foreign on a windows machine.
I don't agree.
Python doesn't need to convert '/' at all: the Windows file APIs accept it as a path separator, so a path written with '/' opens fine from Python on Windows as well as on *nix.
C programs on Windows can use '/' in paths for the same reason.
So, use '/' and be portable.
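
If you ever need to display or store such a path in native Windows form, the standard library can rewrite the separators. A small Python 3 sketch, using ntpath and PureWindowsPath so that it runs on any operating system. The path is the made-up one from earlier in the thread.

```python
import ntpath
from pathlib import PureWindowsPath

forward = 'C:/foo/bar/lipsum/delorean'

# normpath rewrites the separators in the Windows style.
normalised = ntpath.normpath(forward)
print(normalised)  # C:\foo\bar\lipsum\delorean

# pathlib accepts either separator and prints backslashes for Windows paths.
as_path = str(PureWindowsPath(forward))
print(as_path)     # C:\foo\bar\lipsum\delorean
```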

Well that's good to know. I always used os.path.join for portability, but if '/' works everywhere then that's much simpler.
