list index out of range?

Question

fheppell 0 Newbie Poster

10 Years Ago

I'm writing a script that downloads files for web design, and one line produces this traceback:

Traceback (most recent call last):
  File "bootlace.py", line 60, in <module>
    download(data['jquery'], 'Downloading jquery. (File size %s)', 'js/')
  File "bootlace.py", line 11, in download
    file_size = int(meta.getheaders("Content-Length")[0])
IndexError: list index out of range

That line is repeated almost exactly further up the page. The JSON that is loaded into the data array is:

{
      ...
      "jquery": "http://cdnjs.cloudflare.com/ajax/libs/jquery/2.0.3/jquery.min.js",
      ...
}

And as you can see, jquery exists in that json. So what is happening?

python


snippsat 661 Master Poster

10 Years Ago

Here is how to produce a list index out of range error:

>>> lst = [1,2,3]
>>> lst[2]
3
>>> lst[3]
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
IndexError: list index out of range

So it should be easy to understand if you look at the list itself (with print or repr).
Then you see that the index you are trying to use is outside the range of the list.

In your case, this can mean that meta.getheaders("Content-Length") is returning an empty list,
which might happen if something went wrong in the urlopen call.
If you then index the empty list with [0], you get the list index out of range error.
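A server can also leave out Content-Length entirely, for example when it sends the body with chunked transfer encoding, so it may be safer not to assume the header is there. Below is a minimal sketch of a guarded lookup, written for Python 3 (where the header object is an email.message.Message subclass with a get() method) rather than the Python 2 urllib2 used in this thread. The helper name content_length and the header values are made up for illustration.

```python
from email.message import Message


def content_length(headers):
    """Return Content-Length as an int, or None if it is absent or not numeric."""
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


# Stand-ins for the object returned by response.info(); the values are invented.
with_length = Message()
with_length["Content-Length"] = "83612"

without_length = Message()
without_length["Transfer-Encoding"] = "chunked"

print(content_length(with_length))
print(content_length(without_length))
```

With a helper like this, the download function can print "unknown" for the size instead of crashing when the header is missing.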



fheppell 0 Newbie Poster · Answer 1 · 2014-01-01T09:27:02+00:00

Not sure why that would be. The file does exist and I'm pulling something else from cdnjs, so it's not an issue with access rights.

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 2 · 2014-01-02T19:23:10+00:00

This works ...

''' urllib2_info_getheaders.py

tested with Python275
'''

import urllib2
import os

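def download(url, download_dir):
    # use the last part of the URL as the local file name
    file_name = url.split('/')[-1]
    u = urllib2.urlopen(url)
    # opened for the future download step; nothing is written yet
    f = open(os.path.join(download_dir, file_name), 'wb')
    meta = u.info()
    # getheaders() returns a list, which is empty if the header is missing
    file_size = int(meta.getheaders("Content-Length")[0])
    print "Downloading: %s Bytes: %s" % (file_name, file_size)
    f.close()
    u.close()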


url = "http://www.cs.cmu.edu/~enron/enron_mail_20110402.tgz"
download_dir = "C:/Python27/Atest27/Bull" # future download option

download(url, download_dir)

''' result ...
Downloading: enron_mail_20110402.tgz Bytes: 443469519
'''

As a test, try printing meta and its type.

see:
http://nbviewer.ipython.org/github/ptwobrussell/Mining-the-Social-Web-2nd-Edition/blob/master/ipynb/Chapter%206%20-%20Mining%20Mailboxes.ipynb
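To follow the suggestion of printing meta and its type, something like the sketch below may help. It is written for Python 3, where response.info() from urllib.request returns an http.client.HTTPMessage (an email.message.Message subclass). The helper name describe_headers is hypothetical, and the headers are invented stand-ins, not what cdnjs actually sends.

```python
from email.message import Message


def describe_headers(meta):
    """Return a readable summary: the object's type, then one line per header."""
    lines = ["type: %s" % type(meta).__name__]
    for name, value in meta.items():
        lines.append("%s: %s" % (name, value))
    return "\n".join(lines)


# Stand-in for u.info(); these header values are invented for illustration.
meta = Message()
meta["Content-Type"] = "application/javascript"
meta["Transfer-Encoding"] = "chunked"

print(describe_headers(meta))
# Membership tests on a Message are case-insensitive.
print("Content-Length" in meta)
```

If Content-Length does not show up in the dump for the jquery URL, that would explain the empty list.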

snippsat 661 Master Poster · Answer 3 · 2014-01-03T10:56:41+00:00

Here is one with Requests.
It is so popular that it might be included in the standard library in the future, maybe in Python 3.4.

When the file is as big as it is here (over 400 MB), it is a good choice not to load it all into memory.
So here it is downloaded in chunks of 4096 bytes.

import requests

url = "http://www.cs.cmu.edu/~enron/enron_mail_20110402.tgz"
r = requests.get(url, stream=True)
file_name = url.split('/')[-1]
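# raises KeyError if the server sends no Content-Length header
file_size = r.headers["Content-Length"]

print "Downloading: {} Bytes: {}".format(file_name, file_size)
with open(file_name, "wb") as f_out:
    # stream the body in 4096-byte blocks instead of reading it all at once
    for block in r.iter_content(4096):
        if not block:
            break
        f_out.write(block)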
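The same chunked idea also works without Requests, and counting the bytes as they are written gives a size even when the server sends no Content-Length. This is a rough sketch in Python 3. The helper name copy_in_chunks is hypothetical, and io.BytesIO objects stand in for the response and the output file so the example runs without a network connection. Any object with read(n), such as the response from urllib.request.urlopen, should work as the source.

```python
import io


def copy_in_chunks(source, destination, chunk_size=4096):
    """Copy source to destination chunk by chunk and return the bytes written."""
    total = 0
    while True:
        block = source.read(chunk_size)
        if not block:
            # an empty read means the source is exhausted
            break
        destination.write(block)
        total += len(block)
    return total


# In-memory stand-ins for a response body and an output file.
source = io.BytesIO(b"x" * 10000)
destination = io.BytesIO()

written = copy_in_chunks(source, destination)
print(written)
```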