So have a flag variable that, when set, makes the loop delete each line it encounters until it finds one starting with a >
Then just set this flag to True once you find a line you want to remove, and it'll remove the lines after it that don't start with > (which would be the lines with the sequence). Then set it back to False once it encounters a line starting with > again.
Sorry if that wasn't too clear :P This is what I meant:
from __future__ import with_statement
with open('dna.txt') as fil:
    f = fil.readlines()
delete_seq = False
for line in f:
    if line[0] == ">":
        delete_seq = False
    if "rs" in line:
        delete_seq = True
    elif not delete_seq:
        print line,
It sets delete_seq to True when it finds "rs" in a line. While delete_seq is True, it skips every following line until one starts with ">", which sets it back to False (unless that header also contains "rs", in which case it gets skipped as well). If you need me to clarify, just ask. Here's my output:
>1|100159271|ENSRNOSNP145|T/A||ENSEMBL:celera|T/A
TCTTATAATTAGTCATTGTGATAACTGCTACAAACAAAGTCACAGGATCTTGTGAGAGAA
>1|101456015|ENSRNOSNP1318|G/C||ENSEMBL:celera|G/C
AACTCTTAGAAGTTAGAACCTGGGGTGGAGAGATGGCTTGGTGGTTGAGAGCATTGACTG
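In case it helps, here is the same flag idea wrapped in a small function, as a rough sketch for Python 3 (the code above is Python 2). The name drop_marked_records, the marker parameter, and the sample records are all made up for illustration; in particular the "rs123" record is invented and not from the real file. One assumption that differs from the code above: this version only looks for the marker in header lines, not in sequence lines.

```python
def drop_marked_records(lines, marker="rs"):
    """Yield lines, skipping every record whose header contains `marker`.

    A record is a header line starting with ">" plus the sequence lines
    that follow it. The function name and `marker` are hypothetical.
    """
    skipping = False
    for line in lines:
        if line.startswith(">"):
            # Each new header decides whether its whole record is skipped.
            skipping = marker in line
        if not skipping:
            yield line


# Made-up sample input; the "rs123" record is invented for this example.
sample = [
    ">1|100159271|ENSRNOSNP145|T/A||ENSEMBL:celera|T/A\n",
    "TCTTATAATTAG\n",
    ">1|100200000|rs123|G/A||dbSNP|G/A\n",
    "ACGTACGTACGT\n",
    ">1|101456015|ENSRNOSNP1318|G/C||ENSEMBL:celera|G/C\n",
    "AACTCTTAGAAG\n",
]

kept = list(drop_marked_records(sample))
print("".join(kept), end="")
```

Since it's a generator, you can also pass the open file object straight in instead of calling readlines(), so the whole file never has to sit in memory.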