Hello, I would like to know if I need to use a regular expression to match the desired substring in order to print out 10 characters of the start codon ATG.
My DNA sequence is "CATAGAGATA".
Thanks for any advice.
I don't think I understand the question. Your DNA sequence consists of 10 characters, and you want to print out 10 characters starting with the substring 'ATG'? I don't see any occurrence of the substring 'ATG' in your sequence. Can we shuffle the DNA sequence until it contains (or starts with?) 'ATG'? Please tell us how you would determine the output without using a program, and then maybe we can advise how to write a program that does it.
For example, does the following do what you want?
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(shuffle);    # This module provides a function to shuffle lists.

my $str = "CATAGAGATA";
my @arr = $str =~ m/[AGCT]/g;  # Convert string into array of single letters

while (1) {
    @arr = shuffle(@arr);      # Shuffle the letters of the array randomly
    # Exit loop once the first 3 elements spell the start codon
    last if join('', @arr[0 .. 2]) eq 'ATG';
}

print "Shuffled sequence is:\n";
print join('', @arr), "\n";

This prints "Shuffled sequence is:" followed by a random rearrangement of the ten letters that begins with ATG. The remaining seven letters come out in a different order from run to run.
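If shuffling over and over until the start codon happens to land in front feels wasteful, the same kind of result can be built directly. This is only a sketch, and it rests on the same unconfirmed assumption that rearranging the letters is allowed at all: it takes one A, one T and one G out of the pool, shuffles what is left, and puts ATG in front. The helper name atg_first is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: return a rearrangement of $seq that starts with ATG,
# or undef if $seq does not contain at least one A, one T and one G.
sub atg_first {
    my ($seq) = @_;
    my @pool = split //, $seq;
    for my $base (qw(A T G)) {
        # Find the first remaining copy of this base and remove it from the pool
        my ($i) = grep { $pool[$_] eq $base } 0 .. $#pool;
        return unless defined $i;
        splice @pool, $i, 1;
    }
    # Put the codon in front of the shuffled leftovers
    return 'ATG' . join('', shuffle(@pool));
}

my $result = atg_first("CATAGAGATA");
print "Shuffled sequence is:\n$result\n";
```

The output still varies between runs, but the program never has to retry.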
d5e5
Practically a Posting Shark
810 posts since Sep 2009
Reputation Points: 159
Solved Threads: 159
The question confuses me too. I still see only 10 bases and I don't see any 'ATG' in the sequence. Whoever gave you this question may have made a mistake.
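As for the original question about regular expressions: if a sequence does contain ATG, a regex is not strictly necessary, because Perl's built-in index and substr can locate the codon and cut out the characters that follow it. The sketch below assumes the goal is to print up to 10 characters beginning at the first ATG. The helper name from_start_codon and the second test sequence are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return up to $len characters of $seq beginning at the
# first ATG, or undef if the sequence contains no ATG.
sub from_start_codon {
    my ($seq, $len) = @_;
    my $pos = index($seq, 'ATG');    # index() returns -1 when there is no match
    return if $pos < 0;
    return substr($seq, $pos, $len); # shorter than $len if the sequence ends early
}

# The first sequence is the one from the question; the second is invented.
for my $seq ("CATAGAGATA", "CCATGAGATACCGTA") {
    my $hit = from_start_codon($seq, 10);
    print "$seq: ", (defined($hit) ? $hit : "no ATG found"), "\n";
}
```

With the sequence from the question this reports that no ATG was found, which matches what we see by eye. A regex such as /(ATG.{0,7})/ would capture the same substring if you prefer that style.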