Pairwise comparison of DNA sequences function: stuck and can't find why
The exercise says the following:
"""
Pairwise comparison of DNA sequences is a popular technique used in Bioinformatics. It usually involves some scoring scheme to express the degree of similarity. Write a function that compares two DNA sequences based on the following scoring scheme: +1 for a match, +3 for each consecutive match and -1 for each mismatch.
Examples
>>> print pairwiseScore("ATTCGT", "ATCTAT")
ATTCGT
||   |
ATCTAT
Score: 2
>>> print pairwiseScore("GATAAATCTGGTCT", "CATTCATCATGCAA")
GATAAATCTGGTCT
 ||  |||  |
CATTCATCATGCAA
Score: 4
>>>
"""
So I wrote the following, which was working until I started playing with it. Now I can't find what's wrong with it! Could you please point out where I went wrong?
def pariwiseScore(seqA, seqB):
    h = []
    c = []
    for x in range(0,len(seqA)):
        if seqA[x] == seqB[x]:
            h.append('|')
        else:
            h.append(' ')
        if len(h) > 1:
            if h[x] == '|':
                try:
                    if h[x] == h[x-1]:
                        c.append(3)
                except IndexError:
                    c.append(1)
            elif h[x] == ' ':
                c.append(-1)
        else:
            if h[x] == '|':
                c.append(1)
            elif h[x] == ' ':
                c.append(-1)
    a = "".join(h)
    return seqA, '\n', a, '\n', seqB, '\n', 'Score: %I' % sum(c)
Any help is much appreciated.
pwolf
Junior Poster in Training
71 posts since Dec 2011
Reputation Points: 10
Solved Threads: 0
P.S. I had it working before and it was returning the correct sum, but I copied it into Notepad and accidentally closed it. Now I can't get it working.
pwolf
Noticed my problem: I had renamed the function to pariwise but was still testing with pairwise. The format specifier was also wrong.
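For anyone hitting the same format specifier error, here is a minimal sketch of the difference. The variable names (total, good, bad, message) are made up for illustration; the point is only that %d is a valid integer conversion and %I is not.

```python
total = 2  # hypothetical score value, just for the demonstration

# '%d' (or '%i') is a valid conversion for integers.
good = 'Score: %d' % total

# '%I' is not a valid conversion, so the formatting itself raises ValueError.
bad = None
message = None
try:
    bad = 'Score: %I' % total
except ValueError as err:
    message = str(err)

print(good)
print(message)
```

The name typo is a separate issue: calling pairwiseScore when only pariwiseScore is defined raises a NameError before any formatting happens.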
pwolf
It's still not returning the right thing. Why is it that when I use print inside the function it works fine, but when I return the result and print it, it doesn't?
def pairwiseScore(seqA, seqB):
    h = []
    c = []
    for x in range(0,len(seqA)):
        if seqA[x] == seqB[x]:
            h.append('|')
        else:
            h.append(' ')
        if len(h) > 1:
            if h[x] == '|':
                if h[x] == h[x-1]:
                    c.append(3)
                else:
                    c.append(1)
            elif h[x] == ' ':
                c.append(-1)
        else:
            if h[x] == '|':
                c.append(1)
            elif h[x] == ' ':
                c.append(-1)
    a = "".join(h)
    return seqA, '\n', a, '\n', seqB, '\n', 'Score: %d' % sum(c)
That is my code, and it gives the following result:
>>> print pairwiseScore("ATTCGT", "ATCTAT")
('ATTCGT', '\n', '||   |', '\n', 'ATCTAT', '\n', 'Score: 2')
How can I remedy this?
pwolf
Try
return "".join((seqA, '\n', a, '\n', seqB, '\n', 'Score: %d' % sum(c)))
Edit: don't use tabs to indent Python code; use 4 spaces.
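To illustrate why the join helps, here is a small sketch. The helper names as_tuple and as_string are hypothetical and only exist for this comparison: a return statement with comma-separated values builds a tuple, and printing a tuple shows its repr, escaped newlines included.

```python
def as_tuple(seqA, a, seqB, score):
    # Comma-separated values after return are packed into a tuple.
    return seqA, '\n', a, '\n', seqB, '\n', 'Score: %d' % score

def as_string(seqA, a, seqB, score):
    # Joining the same parts gives one string with real newlines in it.
    return "".join((seqA, '\n', a, '\n', seqB, '\n', 'Score: %d' % score))

t = as_tuple("ATTCGT", "||   |", "ATCTAT", 2)
s = as_string("ATTCGT", "||   |", "ATCTAT", 2)

print(type(t).__name__)  # the first version hands back a tuple
print(s)                 # the second prints as four separate lines
```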
Gribouillis
Posting Maven
2,786 posts since Jul 2008
Reputation Points: 1,044
Solved Threads: 691
Thanks, very helpful! I wish I weren't so bad at this stuff. Usually I'm a fairly quick learner, but I guess I'm just not suited to programming!
pwolf
You seem to have spaces between the parts of the answer; the answer should probably be a direct concatenation of the parts.
pyTony
pyMod
5,359 posts since Apr 2010
Reputation Points: 782
Solved Threads: 852
You are not returning a value, so the value of your function will be None. Also, the result does not match the expected output when I run the code: the first test is wrong and the second one hangs.
ATTCGT
|| |
ATCTAT
Score: 2
None
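The trailing None comes from printing the result of a function that only prints and never returns. A minimal sketch, with hypothetical helper names show and build:

```python
def show(text):
    # Prints the text but has no return statement, so it returns None.
    print(text)

def build(text):
    # Returns the text and leaves the printing to the caller.
    return text

result_show = show("Score: 2")    # the text is printed here, inside the call
result_build = build("Score: 2")  # nothing is printed yet

print(result_show)   # this is where the extra None line comes from
print(result_build)
```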
Here is my slightly more advanced way:
def pairwiseScore(seqA, seqB):
    matches = ''.join('|' if a == b else ' ' for a, b in zip(seqA, seqB))
    # count() does not handle overlapping matches, so use zip to add
    # two extra points for each consecutive match
    score = (matches.count('|') - matches.count(' ')
             + 2 * sum(a == b == '|' for a, b in zip(matches, matches[1:])))
    return '\n'.join((seqA, matches, seqB, 'Score: %i' % score))

print(pairwiseScore("ATTCGT", "ATCTAT"))
print('')
print(pairwiseScore("GATAAATCTGGTCT", "CATTCATCATGCAA"))
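The remark about overlapping matches can be checked on the match line of the second example. This is only a sketch: the match line is typed in by hand, and the names non_overlapping, pairs and score are made up for the check.

```python
# Match line for "GATAAATCTGGTCT" vs "CATTCATCATGCAA", written out by hand.
matches = " ||  |||  |   "

# str.count only counts non-overlapping occurrences, so the run "|||"
# contributes one "||" here even though it holds two adjacent pairs.
non_overlapping = matches.count('||')

# Pairing each character with its right-hand neighbour counts every
# adjacent pair of matches, overlapping ones included.
pairs = sum(a == b == '|' for a, b in zip(matches, matches[1:]))

# +1 per match, -1 per mismatch, plus 2 extra for each consecutive match.
score = matches.count('|') - matches.count(' ') + 2 * pairs

print(non_overlapping, pairs, score)
```

With count('||') the run of three matches would be under-counted, which is why the zip version is used for the bonus points.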
pyTony