Gene Sequence Problem

Cali45 (#1)
Nov 8th, 2008
I am fairly new to Python and trying to work on this problem. I want to split the file, which contains two sequences of letters, at the blank line that separates them, then 'compare' them:

What you need to do: Text file genesequences.txt contains two gene sequences, separated
from each other by an empty line. Write a program that will read the gene sequences in
(make sure to discard the ‘\n’ characters when you read in the gene sequences), and find
the longest region that is shared between the sequences that is also homozygous (has “A”
and “B” but no “C”). You may assume that the shared regions will be in the same location in
both gene sequences, so you will only have to check regions starting at the same location in
both sequences.

I think you will want to: (i) Read and store both gene sequences in two variables, and
discard any extraneous characters but A, B and C. (ii) Use a window that starts with a size of
1, but increases by 1 for each iteration, and goes up to the length of the entire string. In each
iteration, check window-sized regions of the two sequences. If a match is found, and it does
not contain a ‘C’, note the location and length of the match. (iii) Note that if you use an
increasing window size, any subsequent match will be bigger than previous matches.
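
The hints above could be sketched roughly as follows. This is only one possible reading of the assignment, written for Python 3: the function name longest_shared_region and the sample sequences are made up for illustration, and "homozygous" is taken to mean simply "contains no C" (it does not check that both A and B are present).

```python
def longest_shared_region(seq1, seq2):
    """Return (start, length) of the longest region that is identical in
    both sequences at the same position and contains no 'C'.
    Returns (None, 0) if there is no such region."""
    best_start, best_length = None, 0
    limit = min(len(seq1), len(seq2))
    # The window grows from 1 up to the length of the shorter sequence.
    for size in range(1, limit + 1):
        for start in range(limit - size + 1):
            region = seq1[start:start + size]
            if region == seq2[start:start + size] and 'C' not in region:
                # A larger window always beats earlier, smaller matches.
                best_start, best_length = start, size
                break  # the first match of this size is enough
        else:
            # No match of this size, so no larger window can match either.
            break
    return best_start, best_length


# Hypothetical sample data, not taken from the assignment file.
seq1 = 'ABCABBAC'
seq2 = 'ABCABBBC'
start, length = longest_shared_region(seq1, seq2)
print(start, length, seq1[start:start + length])  # prints: 3 3 ABB
```

Stopping as soon as one window size yields no match is safe because any longer C-free match would contain a shorter one.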


Any ideas? Thanks.
Last edited by Cali45; Nov 8th, 2008 at 6:34 pm.

jlm699 (#2)
Nov 9th, 2008
Which part are you struggling with?

Here's how you open a file:
    fh = open( 'file.txt' )
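
As a small sketch of a common alternative (assuming Python 2.6 or later), the with statement opens the file and closes it again automatically. The sample file written at the top exists only to make the example self-contained; its contents are made up.

```python
import os
import tempfile

# Create a small sample file so the example can run on its own.
# The contents are invented for illustration.
path = os.path.join(tempfile.mkdtemp(), 'genesequences.txt')
with open(path, 'w') as fh:
    fh.write('ABCAB\n\nABCBB\n')

# The with statement closes the file automatically, even if an error occurs.
with open(path) as fh:
    text = fh.read()

print(repr(text))
```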
1. Use Code Tags.
2. Homework? Show Effort.
3. Keep discussions on the forum: no PMs

Cali45 (#3)
Nov 9th, 2008
How do I split a file that has 3 separate sequences, all divided by blank lines? Essentially, I want each sequence in a different variable.

jlm699 (#4)
Nov 10th, 2008
There are many ways to do it; however, if it were me, this is the approach I would take:
    fh = open( 'myfile.txt' )
    lines = fh.readlines()
    fh.close()

    # Keep only the non-blank lines, stripped of surrounding whitespace
    my_lines = [ line.strip() for line in lines if line.strip() ]
    if len( my_lines ) == 3:
        var1, var2, var3 = my_lines
    else:
        # Show the raw file contents so the problem can be inspected
        print( 'File contains incorrect data:' )
        print( ''.join( lines ) )
The file method readlines() returns a list containing each line in a file as an element.

After closing the file handle, I used a "list comprehension" to iterate through each element of lines (each line), strip() off any leading/trailing whitespace (such as \n newline separators), and then store the lines that still contained any characters.

Finally, I just made sure that I ended up with three variables by checking the length of my_lines. If it contains more or fewer than three elements, I print the file's contents so that I can check what was in the file. The join() method simply joins the lines back together into a string so that it is easier to read when printed.
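
The code above assumes each sequence sits on a single line. If a sequence might wrap across several lines, something along these lines could work instead. It is only a sketch for Python 3: the name split_sequences and the sample text are hypothetical, and dropping every character other than A, B and C follows the hint in the assignment.

```python
def split_sequences(text):
    """Split text into sequences separated by blank lines, joining the
    lines of each sequence and keeping only the letters A, B and C."""
    sequences = []
    current = []
    for line in text.splitlines():
        if line.strip():
            # Non-blank line: keep its A/B/C letters as part of the
            # sequence currently being collected.
            current.append(''.join(ch for ch in line if ch in 'ABC'))
        elif current:
            # Blank line: the current sequence is complete.
            sequences.append(''.join(current))
            current = []
    if current:
        # The last sequence may not be followed by a blank line.
        sequences.append(''.join(current))
    return sequences


# Hypothetical sample: two sequences, each wrapped over two lines.
sample = 'ABCA\nBBAC\n\nABCA\nBBBC\n'
print(split_sequences(sample))  # prints: ['ABCABBAC', 'ABCABBBC']
```

With a file, the text could come from fh.read(), and the resulting list can be unpacked into separate variables once its length has been checked.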

HTH
Last edited by jlm699; Nov 10th, 2008 at 10:10 am.

©2003 - 2009 DaniWeb® LLC