I am working on Protein and DNA sequences. When you have a protein sequence it can be translated in to DNA sequence in many ways. A nice way to express this is using regular expression. I would like to create this long regular expression for a protein sequence, and than seach for a smaller string. So that is the other way around of what regular expression does. Is there a way to do this?

I see a way, but there is some work to do. First you convert your regular expression to a NFA (non deterministic finite automaton) with a start state and
an accept state. In this NFA, you add an epsilon-transition from the start state to every other state and an epsilon transition from any state to the accept state.
This NFA should be able to find your subsequences.

A second step could be to convert the NFA into a deterministic automaton (DFA) to produce a fast tool. You could use a program like this one for the conversion (again there is some work, because the NFA in this link doesn't seem to have epsilon transitions).

You can use an NDFA (NFA) where each transition character in your short string triggers a transition in the NFA, if possible. The google knows about how you might want to start doing that.

If you have only not optional alternatives without repeats in [], you could do something like this to avoid complicated transformations (you could improve the matching with more advanced string matching algorithm) :

dna = r'atata[atg]ca[atgc]atatagtagcctgt'
grouping = []
s = set()
reading = False
for acid in dna:
if acid == '[':
s = set()
reading = True
elif acid == ']':
reading = False
grouping.append(s)
elif reading:
s.add(acid)
else:
grouping.append(acid)
print grouping
target = 'tat'
matches = list()
for index in range(len(grouping) - len(target) + 1):
if all((c1 in c2) for c1, c2 in zip(target, grouping[index:])):
g = grouping[index:index+len(target)]
matches.append((index, g))
if matches:
print 'Match at:'
for index, g in matches:
print index,':',g
else:
print 'Match not found'

Need some help with this Array. I am trying to get the sum of the even numbers and the sum of the odd numbers using a for each loop. I know the answers to what I am trying to achive are sum of even = 84 and the sum of ...

When I execute this progammatically, I get a table with row heights much larger than when I do this manually.

Note : Sel is the Word.Selection object and the Clipboard contains an Excel Table.

public void AddClipboard()
{
Sel.PasteExcelTable(false,false, false);
var t = Sel.Tables[Sel.Tables.Count];
t.AutoFitBehavior(Word.WdAutoFitBehavior.wdAutoFitContent);
}

I have a 2d matrix with dimension (3, n) called A, I want to calculate the normalization and cross product of two arrays (b,z) (see the code please) for each column (for the first column, then the second one and so on).
the function that I created to find the ...