Hello,

I am new to Python and need to solve 5 exercises. Any help would be very much appreciated! Thank you!

1. Write a program dotplot.py

that takes as input a FASTA file with two sequences,

a scoring matrix, and a sliding window size, such as:

dotplot.py fasta.in score.mat 11

and prints the dotplot on the standard output.
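
A minimal sketch of how exercise 1 might be approached. The exercise does not say how score.mat is laid out or when a dot should be drawn, so this sketch makes two assumptions: the scoring matrix is replaced by a hypothetical identity function (1 for a match, 0 otherwise), and a '*' is printed when the summed window score reaches a hypothetical threshold. The names read_fasta, window_score, dotplot and identity are illustrative only.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a list of (header, sequence) pairs."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif records:
            records[-1][1] += line
    return [(header, seq) for header, seq in records]


def window_score(a, b, i, j, window, score):
    """Sum pair scores along the diagonal in a window centred on (i, j).

    Positions that fall outside either sequence are skipped.
    """
    half = window // 2
    total = 0
    for k in range(-half, half + 1):
        if 0 <= i + k < len(a) and 0 <= j + k < len(b):
            total += score(a[i + k], b[j + k])
    return total


def dotplot(a, b, window, score, threshold):
    """Return one string per residue of a; '*' marks window scores >= threshold."""
    rows = []
    for i in range(len(a)):
        rows.append("".join(
            "*" if window_score(a, b, i, j, window, score) >= threshold else "."
            for j in range(len(b))))
    return rows


def identity(x, y):
    """Assumed stand-in for the scoring matrix: 1 for a match, 0 otherwise."""
    return 1 if x == y else 0


if __name__ == "__main__":
    # Hypothetical in-memory input instead of fasta.in
    (_, seq1), (_, seq2) = read_fasta([">s1", "ACGTACGT", ">s2", "ACGTTCGT"])
    print("\n".join(dotplot(seq1, seq2, 3, identity, 3)))
```

In a real dotplot.py the two sequences would come from the file named on the command line (sys.argv), and score would look up values parsed from score.mat.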

2. Write your own substitution matrix:

Given a set of aligned sequences, compute

Pab = frequency of mutations a->b (assume symmetry:

a->b also counts as b->a).

Pa = the marginal probability of Pab

Finally, derive the substitution matrix:

s(a,b) = log(Pab/(Pa*Pb))
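
One possible reading of exercise 2, as a sketch. It assumes the aligned sequences are equal-length strings, that '-' marks a gap (gapped pairs are skipped), and that every pair of sequences in a column contributes one count to (a,b) and one to (b,a). Pairs that are never observed have Pab = 0 and are simply left out of the result, since their log is undefined; how to handle them (for example with pseudocounts) is not stated in the exercise. The function name substitution_matrix is hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def substitution_matrix(aligned, gap="-"):
    """Return {(a, b): log(Pab / (Pa * Pb))} for all observed residue pairs."""
    counts = defaultdict(float)
    for column in zip(*aligned):
        for a, b in combinations(column, 2):
            if a == gap or b == gap:
                continue
            # Symmetry: a->b also counts as b->a
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no aligned residue pairs found")
    p_ab = {pair: c / total for pair, c in counts.items()}
    # Pa is the marginal of Pab: sum over the second residue
    p_a = defaultdict(float)
    for (a, _), p in p_ab.items():
        p_a[a] += p
    return {(a, b): math.log(p / (p_a[a] * p_a[b]))
            for (a, b), p in p_ab.items()}


if __name__ == "__main__":
    # Hypothetical toy alignment
    for pair, value in sorted(substitution_matrix(["ACGT", "ACGA", "AGGT"]).items()):
        print(pair, round(value, 3))
```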

3. Suppose you only want to know the score of a global alignment.

=> Write a program that, given two input sequences (in a single file in FASTA

format), a gap cost and a similarity matrix, computes the score of the global

alignment in O(N*M) time and in O(M) space,

where M and N are the lengths of the input sequences and M<=N.
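
A sketch of the score-only recursion for exercise 3. It assumes a linear gap cost (each gap position costs the same positive amount) and a symmetric similarity function, here a hypothetical match/mismatch function instead of a matrix read from file. Only the previous and the current row are kept, each with M+1 cells, which is what gives O(M) space.

```python
def global_score(a, b, gap, score):
    """Global alignment score with a linear gap cost, using two rows of length M+1."""
    if len(a) < len(b):
        # Make b the shorter sequence so each row has M+1 cells (needs symmetric score)
        a, b = b, a
    # Row 0: aligning an empty prefix of a with the first j residues of b
    prev = [-gap * j for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [-gap * i]  # column 0: first i residues of a against nothing
        for j in range(1, len(b) + 1):
            curr.append(max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                prev[j] - gap,                            # gap in b
                curr[j - 1] - gap))                       # gap in a
        prev = curr
    return prev[-1]


def match_mismatch(x, y):
    """Assumed stand-in for the similarity matrix: +1 match, -1 mismatch."""
    return 1 if x == y else -1


if __name__ == "__main__":
    print(global_score("ACGT", "AGT", 2, match_mismatch))
```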

4. Write a program that, given two input sequences (in a single file in FASTA format), and

a choice of a general gap function and scoring matrix, computes the

alignments of the two sequences and returns one of the possible best

alignments.

Remember that when the best score is obtained using

max_{k=0…i-1} [F(k,j) - g(i-k)]

max_{k=0…j-1} [F(i,k) - g(j-k)]

you have to store this information in the corresponding pointer (back-trace) matrix.
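
A sketch of exercise 4 under some assumptions: g(k) is any function returning the positive cost of a gap of length k, the scoring matrix is passed in as a function score(x, y), and each pointer cell stores the coordinates (k, j), (i, k) or (i-1, j-1) of the cell the best score came from, so the back-trace knows how long each gap is. Ties are broken in favour of the diagonal move, which is just one arbitrary way to return "one of the possible best alignments". The name align_general_gap is hypothetical.

```python
def align_general_gap(a, b, score, g):
    """Global alignment with a general gap function g(length).

    Returns (best score, aligned a, aligned b).
    """
    n, m = len(a), len(b)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0], ptr[i][0] = -g(i), (0, 0)
    for j in range(1, m + 1):
        F[0][j], ptr[0][j] = -g(j), (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best, src = F[i - 1][j - 1] + score(a[i - 1], b[j - 1]), (i - 1, j - 1)
            for k in range(i):            # max over k=0..i-1 of F(k,j) - g(i-k)
                cand = F[k][j] - g(i - k)
                if cand > best:
                    best, src = cand, (k, j)
            for k in range(j):            # max over k=0..j-1 of F(i,k) - g(j-k)
                cand = F[i][k] - g(j - k)
                if cand > best:
                    best, src = cand, (i, k)
            F[i][j], ptr[i][j] = best, src
    # Back-trace: follow the stored coordinates from (n, m) to (0, 0)
    top, bottom = [], []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = ptr[i][j]
        if pi < i and pj < j:             # diagonal move: one aligned pair
            top.append(a[pi:i])
            bottom.append(b[pj:j])
        elif pj == j:                     # residues of a against a gap in b
            top.append(a[pi:i])
            bottom.append("-" * (i - pi))
        else:                             # residues of b against a gap in a
            top.append("-" * (j - pj))
            bottom.append(b[pj:j])
        i, j = pi, pj
    return F[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    # Hypothetical scoring: +1 match, -1 mismatch; gap of length k costs k + 1
    simple = lambda x, y: 1 if x == y else -1
    print(align_general_gap("ACGT", "AT", simple, lambda k: k + 1))
```

The two inner loops over k make this O(N*M*(N+M)) in time, which is the price of allowing an arbitrary gap function.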

5. Write a program that takes as input a FASTA file with two

sequences, and a number N.

Compute the score of the global alignment of the two

sequences and the Z-score with respect to N shuffled

sequences (generated from the first sequence in the FASTA file)

against the original second sequence in the FASTA file.

Z = (S - <S>) / s

S = alignment score

<S> = average of the scores on a random set of alignments

s = standard deviation of the scores on a random set of

alignments
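
A sketch of the Z-score computation in exercise 5. It assumes a linear gap cost and a hypothetical match/mismatch score for the global alignment, and it uses the population standard deviation (statistics.pstdev); the exercise does not say whether the sample or population form is wanted. The rng parameter is an assumption added so the shuffling can be seeded or replaced in tests.

```python
import random
import statistics


def global_score(a, b, gap, score):
    """Global alignment score with a linear gap cost (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep rows as short as possible; assumes a symmetric score
    prev = [-gap * j for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [-gap * i]
        for j in range(1, len(b) + 1):
            curr.append(max(prev[j - 1] + score(a[i - 1], b[j - 1]),
                            prev[j] - gap,
                            curr[j - 1] - gap))
        prev = curr
    return prev[-1]


def z_score(a, b, n, gap, score, rng=None):
    """Return Z = (S - <S>) / s using n shuffles of a aligned against b."""
    rng = rng or random.Random()
    s_real = global_score(a, b, gap, score)
    shuffled_scores = []
    for _ in range(n):
        letters = list(a)
        rng.shuffle(letters)  # shuffle a copy of the first sequence in place
        shuffled_scores.append(global_score("".join(letters), b, gap, score))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        raise ValueError("all shuffled scores are equal; Z-score is undefined")
    return (s_real - mean) / sd


def match_mismatch(x, y):
    """Assumed stand-in for the scoring matrix: +1 match, -1 mismatch."""
    return 1 if x == y else -1


if __name__ == "__main__":
    # Hypothetical sequences; a fixed seed makes the run repeatable
    print(z_score("ACGTTGCAAC", "ACGTTGCTAC", 100, 2, match_mismatch,
                  rng=random.Random(0)))
```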