for seq_record in SeqIO.parse(open("mm.fa"), "fasta") :

I am using the code above to read FASTA files. But when the number of sequence records exceeds about 30000, Python raises this error:

Traceback (most recent call last):
  File "C:\Python26\Neutral\AICMM.py", line 28, in <module>
    if (seq_record.seq[count] == "A") :
  File "C:\Python26\lib\site-packages\Bio\Seq.py", line 157, in __getitem__
    return self._data[index]
IndexError: string index out of range

Can anyone help me with this?
Thank you very much.


I haven't worked with FASTA files, nor do I know what they are.
But have you tried putting it in a try/except statement?

  for seq_record in SeqIO.parse(open("mm.fa"), "fasta"):
    try:
      do_something(seq_record)  # placeholder for the per-record work
    except IndexError:
      continue  # or break

Or maybe append the values you get from the file to a list, and work with the items of the list instead?
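A rough sketch of the list idea, in case it helps: collect each record's name and sequence, then report which ones are shorter than the index the loop tries to reach. The function name `find_short_records` and the sample data are made up for illustration. With Biopython, the pairs would presumably come from `SeqIO.parse` as in the question.

```python
def find_short_records(records, min_length):
    """Return (position, name, length) for every record shorter than min_length.

    `records` is any iterable of (name, sequence) pairs; positions are 1-based.
    """
    short = []
    for position, (name, seq) in enumerate(records, start=1):
        if len(seq) < min_length:
            short.append((position, name, len(seq)))
    return short


# Stand-in data. With Biopython the pairs could be built as (assumption):
#   [(rec.id, str(rec.seq)) for rec in SeqIO.parse(open("mm.fa"), "fasta")]
records = [("seq1", "ACGTAC"), ("seq2", "ACG"), ("seq3", "")]

for position, name, length in find_short_records(records, 5):
    print("record %d (%s) has only %d bases" % (position, name, length))
```

If a short or empty record turns up around the point where the script fails, that would explain the IndexError without any limit on the number of records.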


if (seq_record.seq[count] == "A") :

There is apparently a 32K limit on the number of records (short integer, signed). You will have to come up with a way to split the data up or use another method to read.
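For what it's worth, the traceback shows the IndexError being raised while indexing into one sequence (`seq_record.seq[count]`), so it may also be worth checking that `count` never goes past the length of the current record. Below is a minimal sketch that counts a base without a hand-maintained index. The helper names are hypothetical, and the Biopython usage in the comment is an assumption based on the call shown in the question.

```python
def count_base(seq, base="A"):
    """Count occurrences of `base` by iterating over the sequence itself,
    so no index can run past the end of a short record."""
    return sum(1 for ch in seq if ch == base)


def count_base_by_index(seq, base="A"):
    """Same result with explicit indexing, bounded by len(seq) for this record."""
    total = 0
    for i in range(len(seq)):
        if seq[i] == base:
            total += 1
    return total


# Hypothetical usage with Biopython (file name taken from the question):
#   from Bio import SeqIO
#   for seq_record in SeqIO.parse(open("mm.fa"), "fasta"):
#       print(seq_record.id, count_base(str(seq_record.seq)))
print(count_base("AACGA"), count_base_by_index("AACGA"))
```

Both helpers handle an empty sequence and simply return 0 for it.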
