Gene Sequence Problem
I am fairly new to Python and am trying to work on this problem. I want to split the file, which contains two sequences of letters, at the blank line that separates them, and then 'compare' them:
What you need to do: Text file genesequences.txt contains two gene sequences, separated from each other by an empty line. Write a program that will read the gene sequences in (make sure to discard the ‘\n’ characters when you read in the gene sequences), and find the longest region that is shared between the sequences that is also homozygous (has “A” and “B” but no “C”). You may assume that the shared regions will be in the same location in both gene sequences, so you will only have to check regions starting at the same location in both sequences.
I think you will want to (i) read and store both gene sequences in two variables, and discard any extraneous characters but A, B and C. (ii) Use a window that starts with a size of 1, but increases by 1 for each iteration, and goes up to the length of the entire string. In each iteration, check window-sized regions of the two sequences. If a match is found, and it does not contain a ‘C’, note the location and length of the match. (iii) Note that if you use an increasing window size, any subsequent match will be bigger than previous matches.
Any ideas? Thanks.
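The windowing hint above can be sketched roughly as follows. This is only one possible reading of the assignment: it assumes both sequences are already cleaned strings, treats a region as a match when the two sequences are identical over it, and checks only that the region has no 'C' (it does not require both 'A' and 'B' to appear). The function name longest_shared_region is made up for illustration.

```python
def longest_shared_region(seq1, seq2):
    """Return (start, length) of the longest same-position match without 'C'.

    Returns (None, 0) when no such region exists.
    Assumption: a "match" means both sequences are identical over the window.
    """
    best_start, best_length = None, 0
    limit = min(len(seq1), len(seq2))

    # Window size grows by 1 each iteration, as in hint (ii).
    for size in range(1, limit + 1):
        found = False
        for start in range(limit - size + 1):
            window = seq1[start:start + size]
            if window == seq2[start:start + size] and 'C' not in window:
                # Any later match is longer than earlier ones, as in hint (iii).
                best_start, best_length = start, size
                found = True
                break  # one match of this size is enough
        if not found:
            # A longer match would contain a shorter one, so stop here.
            break

    return best_start, best_length


if __name__ == '__main__':
    start, length = longest_shared_region('ABCABBA', 'ABCABBC')
    print(start, length)
```

Because any piece of a matching region is itself a match, the loop can stop as soon as some window size yields nothing. For very long sequences this brute-force search may be slow, but it follows the hint step by step.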
Last edited by Cali45; Nov 8th, 2008 at 6:34 pm.
Which part are you struggling with?
Here's how you open a file:
    fh = open('file.txt')
There are many ways to do it; however if it were me this is the approach I would take:
    fh = open('myfile.txt')
    lines = fh.readlines()  # list with one element per line of the file
    fh.close()

    # Strip surrounding whitespace and keep only the non-empty lines
    my_lines = [line.strip() for line in lines if line.strip()]

    if len(my_lines) == 3:
        var1, var2, var3 = my_lines
    else:
        print('File contains incorrect data:')
        print(''.join(lines))
The file method readlines() returns a list containing each line in a file as an element.
After closing the file handle, I used a "list comprehension" to iterate through each element of lines (each line), strip() off any leading/trailing whitespace (such as \n newline separators) and then store them if they contained any characters.
Finally, I just made sure that I ended up with three variables by checking the length of the my_lines list. If it contains more or fewer than three elements, I print the original lines so that I can check what the contents of the file were. The join() method simply joins each line back together into a string so that it is easier to read when printing.
HTH
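For the two-sequence file in the question, the same idea might be adapted along these lines. This is just a sketch under some assumptions: each sequence may be wrapped over several lines, a blank line marks the boundary between sequences, and anything other than the uppercase letters A, B and C is thrown away. The names parse_sequences and read_sequences are made up for illustration.

```python
def parse_sequences(text):
    """Split text into sequences at blank lines, keeping only A, B and C."""
    sequences = []
    current = []  # cleaned pieces of the sequence being collected
    for line in text.splitlines():
        if line.strip():
            # Non-blank line: keep only the characters A, B and C.
            current.append(''.join(ch for ch in line if ch in 'ABC'))
        elif current:
            # Blank line: the sequence collected so far is complete.
            sequences.append(''.join(current))
            current = []
    if current:
        # The last sequence has no blank line after it.
        sequences.append(''.join(current))
    return sequences


def read_sequences(path):
    """Read the whole file and return its sequences as a list of strings."""
    with open(path) as fh:  # the file is closed automatically
        return parse_sequences(fh.read())


if __name__ == '__main__':
    print(parse_sequences('ABCA\nBB A\n\nAB x CABBC\n'))
```

With the file from the assignment you could then write something like seq1, seq2 = read_sequences('genesequences.txt'), after checking that the returned list really has two elements.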
Last edited by jlm699; Nov 10th, 2008 at 10:10 am.