Hi folks,

I am new to Python, and I would be grateful if someone could
point out the mistake in my program. Basically, I have a huge text
file in a format like the one below:

AAAAAGACTCGAGTGCGCGGA 0
AAAAAGATAAGCTAATTAAGCTACTGG 0
AAAAAGATAAGCTAATTAAGCTACTGGGTT 1
AAAAAGGGGGCTCACAGGGGAGGGGTAT 1
AAAAAGGTCGCCTGACGGCTGC 0

The text is nothing but DNA sequences, each with a number next to
it. What I have to do is ignore the lines that have a 0, and write
all the other lines (excluding the number) to a new text file
(in a particular format called FASTA). This is the program I
wrote for that:

seq1 = []
list1 = []
lister = []
listers = []
listers1 = []
a = []
d = []
i = 0
j = 0

file1 = open(sys.argv[1], 'r')
for line in file1:
   if not line.startswith('\n'):
       seq1 = line.split()
       if len(seq1) == 0:
           continue

       a = seq1[0]
       list1.append(a)

       d = seq1[1]
       lister.append(d)


b = len(lister)
for j in range(0, b):
   if lister[j] == 0:
       listers.append(j)
   else:
       listers1.append(j)


resultsfile = open("sequences1.txt", 'w')
for i in listers1:
   resultsfile.write('\n>seq' + str(i) + '\n' + list1[i] + '\n')

But this isn't working. I am not able to find the bug in this. I would
be thankful if someone could point it out. Thanks in advance!

Cheers!

See if this helps make your code shorter and clearer.
Ask if something is unclear.

l = []
for i in open('dna.txt'):
    fields = i.split()
    # keep only the sequence from lines flagged '1'; skip blank lines
    if len(fields) == 2 and fields[1] == '1':
        l.append(fields[0])
print(l)

'''Out
['AAAAAGATAAGCTAATTAAGCTACTGGGTT', 'AAAAAGGGGGCTCACAGGGGAGGGGTAT']
'''
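
As for the bug in the original program: split() returns strings, so the test lister[j] == 0 compares a string with an integer and is never true. Every line therefore ends up in listers1 and gets written out. If you also want the FASTA output, here is a rough sketch of one way to do it. The function name filter_to_fasta and the >seqN header numbering (position of the line among the non-blank lines, counted from 0, as in your program) are my own choices for illustration, not a fixed convention.

```python
def filter_to_fasta(lines):
    """Return FASTA text for every data line whose flag is not '0'.

    Hypothetical helper: the name and the '>seqN' header scheme are
    assumptions made for this example.
    """
    records = []
    index = 0  # position among the non-blank data lines, starting at 0
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        sequence, flag = fields[0], fields[1]
        # split() yields strings, so compare with '0', not the integer 0
        if flag != '0':
            records.append('>seq%d\n%s\n' % (index, sequence))
        index += 1
    return ''.join(records)


sample = [
    'AAAAAGACTCGAGTGCGCGGA 0',
    'AAAAAGATAAGCTAATTAAGCTACTGG 0',
    'AAAAAGATAAGCTAATTAAGCTACTGGGTT 1',
    'AAAAAGGGGGCTCACAGGGGAGGGGTAT 1',
    'AAAAAGGTCGCCTGACGGCTGC 0',
]
print(filter_to_fasta(sample))
```

To use it on a real file, pass the open file object instead of the sample list and write the returned string to sequences1.txt. A file object can be iterated line by line, so the whole file never has to be held in a list.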

Edited 6 Years Ago by snippsat: n/a
