I am a beginner in Python, and in programming generally. I want to use Python in my thesis for reading and writing large data sets. I have a small sample of code which loops over the lines of a file and reads each element on a line.

def get_site_only(pat):
    newpat = ""
    for c in pat:
        if c.isalpha():
            newpat +=c
    return newpat

def read_rebase(filename):
    enz_dict={}
    infh= open("rebase.dat")
    for line in infh.xreadlines():
        fields = line.split()
        name = line.split()[0]
        pat = line.split()[2]
        enz_dict[name] = get_site_only(pat)

    infh.close()
    return enz_dict

print read_rebase("rebase.dat")

I get the following error message:

Traceback (most recent call last):
  File "D:\Downloads\Python2\trial3.py", line 20, in <module>
    print read_rebase("rebase.dat")
  File "D:\Downloads\Python2\trial3.py", line 13, in read_rebase
    name = line.split()[0]
IndexError: list index out of range

Would you please help me with this problem?

chebude

The problem is probably that your text file contains an empty line. Splitting an empty line gives you a list with nothing in it, so when you try to access the element at index 0, there is nothing there.
A "list index out of range" error means that the list is not long enough to have a value stored at that position.
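To see this outside the original script, here is a minimal sketch. The sample data line is made up for illustration and is not taken from the poster's rebase.dat; the code is written so that it should run under both Python 2 and Python 3.

```python
# A line holding only whitespace (or just a newline) splits to an empty list.
blank_fields = "   \n".split()

# Hypothetical data line with three whitespace-separated fields.
data_fields = "AaaI (XmaIII) C^GGCCG\n".split()

# Indexing the empty list raises the same IndexError seen in the traceback.
try:
    blank_fields[0]
    error_raised = False
except IndexError:
    error_raised = True

print(len(blank_fields))
print(len(data_fields))
print(error_raised)
```

Note that split() with no arguments also discards the trailing newline, so a line that looks empty in an editor really does produce a list of length zero.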

Hope that clears it up.

You are right, the problem was the empty lines below my data rows. After removing them it runs very well.

Thanks a lot. It was a big help!

To avoid this problem in the future you could do this:

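def read_rebase(filename):
    enz_dict = {}
    infh = open(filename)  # use the argument rather than a hard-coded name
    for line in infh:  # iterating the file directly replaces the deprecated xreadlines()
        fields = line.split()
        # Blank or short lines give fewer than 3 fields; skip them.
        if len(fields) > 2:
            name = fields[0]
            pat = fields[2]
            enz_dict[name] = get_site_only(pat)

    infh.close()
    return enz_dict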

This way you only index into the list when it has at least three elements, so indices 0 and 2 are always valid.
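As a further option, the same check can be combined with a "with" block so the file is closed even if something goes wrong part-way through. This is only a sketch under assumptions: the sample lines and the temporary file are invented so that it can be run on its own, and get_site_only is rewritten here in a shorter form that should behave the same as the original. It should run under Python 2.6+ and Python 3.

```python
import os
import tempfile


def get_site_only(pat):
    # Keep only the letters of a pattern, e.g. drop the "^" cut marker.
    return "".join(c for c in pat if c.isalpha())


def read_rebase(filename):
    enz_dict = {}
    # "with" closes the file automatically, even if an exception is raised.
    with open(filename) as infh:
        for line in infh:
            fields = line.split()
            if len(fields) < 3:
                continue  # blank or incomplete line: nothing safe to index
            enz_dict[fields[0]] = get_site_only(fields[2])
    return enz_dict


# Hypothetical sample data, including a blank line and a whitespace-only line.
sample = "AaaI (XmaIII) C^GGCCG\n\nAacI (BamHI) GGATCC\n   \n"

handle, path = tempfile.mkstemp()
os.close(handle)
with open(path, "w") as out:
    out.write(sample)

result = read_rebase(path)
os.remove(path)
print(sorted(result.items()))
```

Skipping short lines silently is a design choice. If a malformed line should never occur in your data, it may be better to print a warning or raise an error there instead.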
