Gene Sequence Problem
I am fairly new to Python and trying to work on this problem. I want to split the file, which contains two sequences of letters, at the blank line that separates them, and then 'compare' them:
What you need to do: Text file genesequences.txt contains two gene sequences, separated from each other by an empty line. Write a program that will read the gene sequences in (make sure to discard the ‘\n’ characters when you read in the gene sequences), and find the longest region that is shared between the sequences that is also homozygous (has “A” and “B” but no “C”). You may assume that the shared regions will be in the same location in both gene sequences, so you will only have to check regions starting at the same location in both sequences.
(i) I think you will want to read and store both gene sequences in two variables, and discard any extraneous characters but A, B and C. (ii) Use a window that starts with a size of 1, but increases by 1 for each iteration, and goes up to the length of the entire string. In each iteration, check window-sized regions of the two sequences. If a match is found, and it does not contain a ‘C’, note the location and length of the match. (iii) Note that if you use an increasing window size, any subsequent match will be bigger than previous matches.
Any ideas? Thanks.
Last edited by Cali45; Nov 8th, 2008 at 6:34 pm.
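The windowing steps from the hints above could be sketched roughly like this. It is only one possible reading of the assignment: the function name `longest_shared_region` is made up for illustration, and "homozygous" is assumed to mean simply that the region contains no 'C'. Both sequences are assumed to be plain strings that have already been read in.

```python
def longest_shared_region(seq1, seq2):
    """Return (start, length) of the longest region that is identical in
    both sequences at the same position and contains no 'C'.
    Returns (None, 0) when no such region exists."""
    limit = min(len(seq1), len(seq2))
    best_start, best_length = None, 0

    # Grow the window one character at a time, as the hints suggest.
    for size in range(1, limit + 1):
        for start in range(0, limit - size + 1):
            window1 = seq1[start:start + size]
            window2 = seq2[start:start + size]
            if window1 == window2 and 'C' not in window1:
                # Any match at this size beats every smaller match.
                best_start, best_length = start, size
                break
        else:
            # No match at this size means no larger match can exist,
            # because a larger match would contain a match of this size.
            break

    return best_start, best_length
```

For example, `longest_shared_region('ABCABBAC', 'ABCABBBC')` gives `(3, 3)`, i.e. the region 'ABB' starting at index 3. When several regions tie for the longest, this sketch reports the leftmost one.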
Which part are you struggling with?
Here's how you open a file:
fh = open( 'file.txt' )
There are many ways to do it; however, if it were me, this is the approach I would take:
    fh = open('myfile.txt')
    lines = fh.readlines()
    fh.close()

    # Strip whitespace from each line and keep only the non-empty ones
    my_lines = [line.strip() for line in lines if line.strip()]

    if len(my_lines) == 3:
        var1, var2, var3 = my_lines
    else:
        # Show the raw file contents so they can be checked
        print('File contains incorrect data:')
        print(''.join(lines))
readlines() returns a list containing each line in a file as an element.
After closing the file handle, I used a "list comprehension" to iterate through each element of lines (each line), strip() off any leading/trailing whitespace (such as \n newline separators) and then store them if they contained any characters.
Finally, I just made sure that I ended up with three variables by checking the length of my_lines. If it contains more or fewer than three elements, I print it out so that I can check what the contents of the file were. The join() method simply joins each line back together into a string so that you can read it more easily when printing.
HTH
Last edited by jlm699; Nov 10th, 2008 at 10:10 am.
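Adapting the same readlines/strip idea to the file described in the question might look something like the sketch below. It assumes each sequence may span several lines, that a blank line marks the end of a sequence, and that anything other than A, B or C should be dropped; the function name `read_sequences` is hypothetical.

```python
def read_sequences(path):
    """Read blank-line-separated sequences from a text file and
    return them as a list of strings containing only A, B and C."""
    with open(path) as fh:
        text = fh.read()

    sequences = []
    current = []  # letters collected for the sequence being read
    for line in text.splitlines():
        if line.strip():
            # Keep only the letters we care about; this also drops '\n'.
            current.extend(ch for ch in line if ch in 'ABC')
        elif current:
            # A blank line ends the current sequence.
            sequences.append(''.join(current))
            current = []
    if current:
        sequences.append(''.join(current))
    return sequences
```

With the file from the assignment, `read_sequences('genesequences.txt')` should give back a list of two strings, which can be unpacked into two variables in the same way as `my_lines` above.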






