I am new to Python. I need to solve 5 exercises. Any help would be very much appreciated! Thank you!
1. Write a program dotplot.py
that takes as input a FASTA file with two sequences,
a scoring matrix and a sliding window size, such as:
dotplot.py fasta.in score.mat 11
and prints the dotplot on the standard output.
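A minimal sketch of the dotplot logic, working on strings rather than files. Several things here are assumptions not stated in the exercise: the function names (parse_fasta, window_score, dotplot, identity) are hypothetical, the window is centred on each cell and summed along the diagonal, a cell is printed as '*' when the window score reaches a threshold, and identity() stands in for a lookup in score.mat, whose file format is not given.

```python
def parse_fasta(text):
    """Return the sequences of a FASTA-formatted string, in file order."""
    seqs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            seqs.append("")            # a header line starts a new record
        elif seqs:
            seqs[-1] += line.upper()   # sequence lines may be wrapped
    return seqs


def window_score(a, b, i, j, window, score):
    """Sum the pair scores along the diagonal window centred on (i, j)."""
    half = window // 2
    total = 0
    for d in range(-half, half + 1):
        x, y = i + d, j + d
        if 0 <= x < len(a) and 0 <= y < len(b):   # ignore positions off the edge
            total += score(a[x], b[y])
    return total


def dotplot(a, b, window, score, threshold):
    """One text row per residue of a; '*' where the window score >= threshold."""
    rows = []
    for i in range(len(a)):
        rows.append("".join(
            "*" if window_score(a, b, i, j, window, score) >= threshold else "."
            for j in range(len(b))))
    return rows


def identity(x, y):
    # Assumption: stand-in for the scoring matrix (1 for a match, 0 otherwise).
    return 1 if x == y else 0


s1, s2 = parse_fasta(">seq1\nACGT\n>seq2\nACGT\n")
for row in dotplot(s1, s2, window=1, score=identity, threshold=1):
    print(row)
```

In a real dotplot.py the three command-line arguments would come from sys.argv, and the threshold would probably depend on the window size and the scoring matrix.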
2. Write your own substitution matrix:
Given a set of aligned sequences, compute
Pab = frequency of mutations between a->b (assume symmetry:
a->b also counts as b->a).
Pa = the marginal probability of Pab
Finally, derive the substitution matrix:
s(a,b) = log(Pab / (Pa * Pb))
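One possible way to compute these quantities, as a sketch. The assumptions here are mine: the aligned sequences are given as equal-length strings, pairs are counted column by column over every pair of sequences, pairs involving a gap character '-' are skipped, and pairs that are never observed are simply left out of the result (their log would be undefined). The function name substitution_matrix is hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def substitution_matrix(aligned):
    """Return {(a, b): log(Pab / (Pa * Pb))} for the observed residue pairs."""
    counts = defaultdict(float)
    for column in zip(*aligned):                 # walk the alignment column by column
        for x, y in combinations(column, 2):     # every pair of sequences
            if x == "-" or y == "-":
                continue                         # assumption: ignore gapped pairs
            counts[(x, y)] += 1                  # symmetry: a->b also counts as b->a
            counts[(y, x)] += 1
    total = sum(counts.values())
    p_ab = {pair: c / total for pair, c in counts.items()}

    p_a = defaultdict(float)                     # marginal: Pa = sum over b of Pab
    for (a, _), p in p_ab.items():
        p_a[a] += p

    return {(a, b): math.log(p / (p_a[a] * p_a[b]))
            for (a, b), p in p_ab.items()}


matrix = substitution_matrix(["AA", "AC"])
print(matrix[("A", "C")], matrix[("C", "A")])
```

With real data you would probably add pseudocounts so that rare pairs do not end up missing from the matrix.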
3. Suppose you only want to know the score of a global alignment.
=> Write a program that, given two input sequences (in a single file in FASTA
format), a gap cost and a similarity matrix, computes the score of the global
alignment in O(N*M) time and O(M) space,
where M and N are the lengths of the input sequences and M <= N.
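A sketch of the usual trick for this: fill the dynamic programming table row by row and keep only the previous and current rows, each as long as the shorter sequence. It assumes a linear gap cost (a positive number subtracted per gap position) and a symmetric similarity function, since the sequences may be swapped. The names global_score and simple are hypothetical, and simple() stands in for the similarity matrix lookup.

```python
def global_score(a, b, score, gap):
    """Global alignment score with a linear gap cost, keeping two rows only."""
    if len(a) < len(b):
        a, b = b, a                     # b is now the shorter one: rows hold M+1 cells
    prev = [-gap * j for j in range(len(b) + 1)]    # row 0: only gaps
    for i in range(1, len(a) + 1):
        curr = [-gap * i]                           # column 0: only gaps
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),   # align the two residues
                prev[j] - gap,                             # a[i-1] against a gap
                curr[j - 1] - gap))                        # b[j-1] against a gap
        prev = curr                     # the older row is no longer needed
    return prev[-1]


def simple(x, y):
    # Assumption: stand-in for the similarity matrix (+1 match, -1 mismatch).
    return 1 if x == y else -1


print(global_score("ACGT", "AGT", simple, gap=2))   # 1
```

Only the score survives this way; recovering the alignment itself in linear space needs a different technique (for example Hirschberg's algorithm).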
4. Write a program that, given two input sequences (in a single file in FASTA format) and
a choice of a general gap function and scoring matrix, computes the
alignments of the two sequences and returns one of the best-scoring ones.
Remember that when the best score is obtained using
max over k=0…i-1 of [F(k,j) - g(i-k)]
or
max over k=0…j-1 of [F(i,k) - g(j-k)]
you have to store this information in the corresponding pointer (back-trace) matrix.
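A sketch of how the recurrence and the pointer matrix could fit together. Each cell stores the coordinates of the cell it came from, so a gap of any length is a single jump in the back-trace. The assumptions are mine: F(0,0) = 0, the gap function g(k) returns a positive cost for a gap of length k, ties are broken arbitrarily, and the names align_general_gap and simple are hypothetical. It runs in roughly O(N*M*(N+M)) time.

```python
def align_general_gap(a, b, score, g):
    """Return (best score, aligned a, aligned b) for a general gap function g."""
    n, m = len(a), len(b)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]   # back-trace: previous cell
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            cands = []
            if i > 0 and j > 0:                       # align a[i-1] with b[j-1]
                cands.append((F[i - 1][j - 1] + score(a[i - 1], b[j - 1]),
                              (i - 1, j - 1)))
            for k in range(i):                        # gap of length i-k in b
                cands.append((F[k][j] - g(i - k), (k, j)))
            for k in range(j):                        # gap of length j-k in a
                cands.append((F[i][k] - g(j - k), (i, k)))
            F[i][j], ptr[i][j] = max(cands, key=lambda c: c[0])

    top, bottom = [], []                              # segments, collected backwards
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = ptr[i][j]
        if pi == i - 1 and pj == j - 1:               # diagonal step
            top.append(a[pi]); bottom.append(b[pj])
        elif pj == j:                                 # vertical jump: gap in b
            top.append(a[pi:i]); bottom.append("-" * (i - pi))
        else:                                         # horizontal jump: gap in a
            top.append("-" * (j - pj)); bottom.append(b[pj:j])
        i, j = pi, pj
    return F[n][m], "".join(reversed(top)), "".join(reversed(bottom))


def simple(x, y):
    # Assumption: stand-in for the scoring matrix (+1 match, -1 mismatch).
    return 1 if x == y else -1


print(align_general_gap("ACGGT", "ACT", simple, lambda k: 2 + k))
```

Here the example gap function g(k) = 2 + k makes one gap of length 2 (cost 4) cheaper than two separate gaps of length 1 (cost 6), which is the kind of behaviour a general gap function is meant to allow.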
5. Write a program that takes as input a FASTA file with two
sequences and a number N.
Compute the score of the global alignment of the two
sequences and the Z-score with respect to N shuffled
sequences (generated from the first sequence of the FASTA file)
aligned against the original second sequence of the FASTA file.
S = alignment score
<S> = average of the scores on a random set of alignments
s = standard deviation of the scores on a random set of