Hey all,

I'm trying to read through some code a colleague wrote last year using the Biopython package. The package is no longer installed on my computer, and I have to wait for the systems admin to update it, so for now I can only read the code. The part I keep struggling with is this:

for seq_record in SeqIO.parse(handle, "fasta"):  # loop over each entry in the FASTA file
    # get the sequence into a string
    seq_string = seq_record.seq.tostring()
    # get the sequence length
    seq_length = len(seq_record.seq)  # why seq_record.seq, not seq_record?

Why exactly did he need to use "seq_record.seq" instead of just "seq_record"? Was that extra .seq necessary? If so, why?

seq_record refers to one entry from the FASTA file, not just the bare sequence.
Each entry has a definition line (defline) starting with a '>' followed by the sequence name and some info.
After that come the lines of text that hold the actual sequence data.

The '.seq' picks out just the sequence data, so tostring() and len work on the sequence itself, not on the whole record.
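
Since Biopython isn't installed on your machine right now, here is a rough standard-library sketch of the same idea that you can run as-is. FastaRecord and parse_fasta are names I made up for illustration; this is not Biopython's code, just a stand-in showing how a record bundles the defline with the sequence and why you reach into the 'seq' field to get the sequence alone.

```python
import io
from dataclasses import dataclass


@dataclass
class FastaRecord:
    """Hypothetical stand-in for one parsed FASTA entry (not Biopython's class)."""
    id: str           # first word of the defline, without the '>'
    description: str  # the full defline, without the '>'
    seq: str          # the sequence data only, joined across lines


def parse_fasta(handle):
    """Yield one FastaRecord per entry found in an open text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new defline closes the previous entry, if there was one
            if header is not None:
                yield FastaRecord(header.partition(" ")[0], header, "".join(chunks))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    # Emit the last entry in the file
    if header is not None:
        yield FastaRecord(header.partition(" ")[0], header, "".join(chunks))


# Made-up sample data standing in for the real FASTA file
handle = io.StringIO(">seq1 example entry\nACGTAC\nGTTA\n>seq2 another\nGGCC\n")
records = list(parse_fasta(handle))
for record in records:
    # record is the whole entry; record.seq is only the sequence data
    print(record.id, record.seq, len(record.seq))
```

In this sketch, record carries the name and description alongside the sequence, while record.seq is the sequence text on its own, which is the thing you want to measure or convert to a string.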
