DaniWeb IT Discussion Community
MojoS

Help have some problems with pattern match hope you can help!!

  #1  
Apr 30th, 2007
Hi there,

I am working on a Perl script and hope someone can help me. Here is a quick description of my project:

My program is given a FASTA file, a signal description file and a deviation (a number) as input on the command line.
A FASTA file looks like this:
>U00659.CDS.1 product:"insulin GGCCC
CCGCAGAAGCGTGGCATCGTGGAGCAGTGCTGCGCCGGCGTCTGCTCTCTCTACCAGCTG
AAAGACCAGACGGAGATGATGGTAAAGAGAGGTATTGTAGA
>X13559.CDS.1 product:"preproinsulin " DNA org:"Oncorhynchus keta" (CDS extraction)
ATGGCCTTCTGGCTCCAAGCTGCATCTCTGCTGGTGTTGCTGGCGCTCTCCCCCGGGGTA
GATGCTGCAGCTGCCCAGCACCTGTGTGGCTCTCACCTGGTGGACGCCCTCTATCTGGTG
TGTGGAGAGAAAGGATT
>J02989.CDS.1 note:"preproinsulin " DNA org:"Aotus trivirgatus" (CDS extraction)
ATGGCCCTGTGGATGCACCTCCTGCCCCTGCTGGCGCTGCTGGCCCTCTGGGGACCCGAG
CCAGCCCCGGCCTTTGTGAACCAGCACCTGTGCGGCCCCCACCTGGTGGAAGCCCTCTAC
CTGGTGTGCGGGGAGCGAGGTTTC
The first line of each FASTA entry is a header and begins with '>'; this line should be ignored when matching.
The main thing is the sequence (e.g. ATCGCGCTATA), which is what I want to match against.
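
A minimal sketch of how the entries described above could be read, assuming each entry is one '>' header line followed by one or more sequence lines. The function name read_fasta and the sample data are made up for illustration; the sample is read through an in-memory filehandle so the snippet runs without a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: read FASTA entries from an open filehandle.
# Returns a list of [header, sequence] pairs, with the sequence lines joined.
sub read_fasta {
    my ($fh) = @_;
    my (@records, $header, $dna);
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;                 # tolerate Windows line endings
        next if $line eq '';
        if ($line =~ /^>(.*)/) {
            # A new header closes the previous entry, if there was one.
            push @records, [$header, $dna] if defined $header;
            ($header, $dna) = ($1, '');
        }
        elsif (defined $header) {
            $dna .= uc($line);            # append this sequence line
        }
    }
    # Do not forget the last entry in the file.
    push @records, [$header, $dna] if defined $header;
    return @records;
}

# Made-up sample data, read via an in-memory filehandle.
my $sample = ">seq1 demo\nACGT\nTTGA\n>seq2\nGGCC\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @records = read_fasta($fh);
close $fh;

printf "%s: %d bases\n", $_->[0], length($_->[1]) for @records;
```

Keeping the header with its sequence makes it possible to report which entry a match belongs to later on.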

A signal description file is a text file that looks like this:

# Shine-Delgarno
T 7
T 8
G 6
A 5
C 5
A 5
# intervening unimportant bases
* 15-21
# Pribnow box
T 8
A 8
T 6
A 6
AT 5
T 8

Each line of the signal description is one of the following:

1) One or more allowed letters at this position, followed by the penalty
for having a mismatch at that position.

2) The star character *, denoting unimportant characters in the sequence, followed by an interval giving how many of these
unimportant characters are allowed.

3) A line starting with the hash character #, meaning the line is a comment and should be ignored by the program.
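
The three line types above could be parsed roughly as follows. This is only a sketch under assumptions: the function name parse_signal and the hash layout ({allowed, penalty} for a position, {min, max} for a * interval) are invented for illustration, and a trailing # comment on a line is treated the same as a whole-line comment.

```perl
use strict;
use warnings;

# Hypothetical helper: parse a signal description from an open filehandle.
# Returns one hash per position ({allowed, penalty}) or per gap ({min, max}).
sub parse_signal {
    my ($fh) = @_;
    my @elements;
    while (my $line = <$fh>) {
        $line =~ s/#.*//;            # rule 3: drop comments
        $line =~ s/^\s+|\s+$//g;     # trim whitespace, including the newline
        next if $line eq '';
        if ($line =~ /^\*\s+(\d+)-(\d+)$/) {
            # rule 2: a stretch of unimportant characters, e.g. "* 15-21"
            push @elements, { min => $1, max => $2 };
        }
        elsif ($line =~ /^([A-Za-z]+)\s+(\d+)$/) {
            # rule 1: allowed letters and the mismatch penalty, e.g. "AT 5"
            push @elements, { allowed => uc($1), penalty => $2 };
        }
        else {
            die "Unrecognised signal line: $line\n";
        }
    }
    return @elements;
}

# Made-up sample, read via an in-memory filehandle.
my $text = "# Shine-Delgarno\nT 7\n* 15-21 # intervening bases\nAT 5\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @elements = parse_signal($fh);
close $fh;

for my $el (@elements) {
    if (exists $el->{min}) { print "gap of $el->{min} to $el->{max} characters\n" }
    else                   { print "one of [$el->{allowed}], penalty $el->{penalty}\n" }
}
```

Keeping the positions and the gaps in one ordered list, rather than in separate character and penalty arrays, preserves where the * interval sits in the signal.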


Okay, now to the main thing: the output should list all matches in each FASTA entry, clearly stating the location of each match.

The deviation is an important factor. If the deviation is set to 0, then the search for the signal
is reduced to a regular expression match. If the deviation is set to 16 in the above example,
then mismatches with a combined penalty of 16 or less are allowed.
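
One possible way to apply the deviation, sketched under assumptions rather than given as the required solution: walk through the signal one element at a time from every start position, add the penalty whenever the base is not among the allowed letters, try every permitted length for a * interval, and give up on a candidate as soon as its combined penalty exceeds the deviation. The names match_from and find_matches and the element layout ({allowed, penalty} or {min, max}) are hypothetical, and the tiny signal below is made up.

```perl
use strict;
use warnings;

# Continue matching signal element $idx at sequence offset $pos with $cost
# penalty already spent. Returns [end_offset, cost] for each way to finish
# with a combined penalty of at most $max.
sub match_from {
    my ($seq, $signal, $idx, $pos, $cost, $max) = @_;
    return () if $cost > $max;                      # too many mismatches
    return ([$pos, $cost]) if $idx == @$signal;     # whole signal consumed
    my $el = $signal->[$idx];
    if (exists $el->{min}) {                        # "*" interval: try each length
        my @hits;
        for my $len ($el->{min} .. $el->{max}) {
            last if $pos + $len > length($seq);
            push @hits, match_from($seq, $signal, $idx + 1, $pos + $len, $cost, $max);
        }
        return @hits;
    }
    return () if $pos >= length($seq);              # ran off the sequence
    my $base = substr($seq, $pos, 1);
    my $miss = index($el->{allowed}, $base) < 0 ? $el->{penalty} : 0;
    return match_from($seq, $signal, $idx + 1, $pos + 1, $cost + $miss, $max);
}

# Try every start position; report 1-based start and end plus the penalty.
sub find_matches {
    my ($seq, $signal, $max) = @_;
    my @found;
    for my $start (0 .. length($seq) - 1) {
        for my $hit (match_from($seq, $signal, 0, $start, 0, $max)) {
            push @found, { start => $start + 1, end => $hit->[0], penalty => $hit->[1] };
        }
    }
    return @found;
}

# Made-up signal: T (penalty 7), A (penalty 5), 1-2 unimportant bases, G (penalty 6).
my @signal = (
    { allowed => 'T', penalty => 7 },
    { allowed => 'A', penalty => 5 },
    { min => 1, max => 2 },
    { allowed => 'G', penalty => 6 },
);

for my $deviation (0, 6) {
    for my $m (find_matches('TACCG', \@signal, $deviation)) {
        print "deviation $deviation: match at $m->{start}-$m->{end}, penalty $m->{penalty}\n";
    }
}
```

With this approach a start position can be reported more than once when different interval lengths both stay within the deviation; whether such duplicates should be merged depends on what the assignment expects.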

This is what I have tried so far, but I can't figure out how to use the deviation number and set up the pattern match. I am pretty lost:


#!/usr/bin/perl -w

use strict;
#############
# Step 1 #
#############
#The program is given a FASTA file, a signal description file and a deviation number as input on the command line, and complains if there are errors:
#Errors: make sure that the deviation is a number


sub usage {
    my ($msg) = @_;
    print "$msg\n\n" if defined $msg;
    print "Usage: project.pl <fastafile.fsa> <signaldescriptionfile.txt> <deviation>\n";
    exit;
}
if (scalar @ARGV !=3){
    &usage("Wrong number of arguments");
}

my ($fastafile, $signaldescription, $deviation) = @ARGV ;

if ($deviation =~ m/^\d+$/){ #correct input
    print "Thanks!\n";
}else{
    &usage ("I want a number please!");
}

################
# Step 2 #
################
# Working with the signal description:
# read the file and make sure to put the penalties and characters in two separate arrays;
# the # lines should be ignored;
# the * marks an unimportant stretch of 15-21 characters that should be ignored (I haven't figured that out yet):

open(IN,'<',$signaldescription ) or die "Could not find file\n";
my @character = ();
my @penalty = ();
my $comment ='';
while (defined (my $line = <IN>)) {
    chomp ($line);
    if ($line =~ m/^#/) {
        if ($comment ne ''){
            my ($character, $penalty) = split (' ',$line);
            push @character, $character;
            push @penalty, $penalty;
        }
    }
}


close IN;


############
# Step 3 #
############
# Work with the FASTA file:
# use regular expressions to look at the FASTA file and ignore the first line:


# $fragment: the pattern to search for
# $fraglen: the length of $fragment
# $buffer: a buffer to hold the DNA from the input file
# $position: the position of the buffer in the total DNA

my($fragment, $fraglen, $buffer, $position) = (@karaktere, '', 0);

my ($headline, $line, $dna) = ('', '', '');

open(IN, '<', $fastafilename) or die "Could not read file ($fastafile)\n";

# The first line of a FASTA file is a header and begins with '>'

while (defined ($line = <IN>)) {
    if ($line =~ m/^>/) {
        if ($headline ne '') { # after the sequence has been read I want to look for the match

            # write data to file (the matches):
            chomp $headline;
            print OUT "$headline\n";
            for (my $i = 0; $i < length($reversecomplementdna); $i += 60) {
                print OUT substr($reversecomplementdna, $i, 60), "\n";
            }
            # Get ready for next turn in the loop
            $dna = '';+
        }
        $headline = $line;
    }
    else {
        # Read the DNA
        chomp $line;
        $dna .= $line;
    }
}
#########################

Thanks a lot for your time; I would really appreciate any help.

Thanks,
MojoS

KevinADC

  #2  
Apr 30th, 2007
Is this school work?

MojoS

  #3  
May 1st, 2007
Yes, it's one of the projects I am working on at my university, and I would appreciate it if someone could help me out with some comments and ideas on how to work it out, because I am pretty lost...

Thanks

KevinADC

  #4  
May 1st, 2007
First thing you have to do is fix the errors:

Global symbol "@karaktere" requires explicit package name at script line 68.
Global symbol "$fastafilename" requires explicit package name at script line 72.
Global symbol "$reversecomplementdna" requires explicit package name at script line 83.
Global symbol "$reversecomplementdna" requires explicit package name at script line 84.
syntax error at script line 88, near "}"
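
The "Global symbol ... requires explicit package name" messages are what `use strict` reports when a variable is used without having been declared. A small sketch of the effect, using a made-up snippet rather than the script itself (string eval is used here only so the compile error can be captured and printed):

```perl
use strict;
use warnings;

# A tiny made-up statement that uses a variable that was never declared.
# String eval captures the compile error instead of aborting this program.
my $broken = 'my $copy = $fastafilename; 1;';
my $ok     = eval $broken;
my $error  = $@;
print "Without declaration: $error" unless $ok;

# Declaring the variable with "my" first makes the same statement compile.
my $fixed    = 'my $fastafilename = "demo.fsa"; my $copy = $fastafilename; 1;';
my $ok_fixed = eval $fixed;
print "With declaration: compiles fine\n" if $ok_fixed;
```

In the script, the usual fix is either to declare the variable with `my` before its first use or to correct the name so it matches one that is already declared (for example, $fastafile is declared but $fastafilename is not).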

How come you are not getting help from a teacher or fellow student?

MojoS

  #5  
May 1st, 2007
Thanks for looking through it. Unfortunately my professor is not always available, and it would take some time before I could go on with this without any advice (because I'm stuck). As I'm doing this project on my own, I don't have any fellow students to ask or exchange views with.
But thanks anyway.

KevinADC

  #6  
May 1st, 2007
OK, well, fix those errors otherwise your code will not even compile.

MojoS

  #7  
May 3rd, 2007
Okay, I have fixed it, thanks!

MojoS

  #8  
May 3rd, 2007
Okay, but can you advise me on how to work with pattern matching if I want to ignore some characters at a specific position?

Signal description:
# Shine-Delgarno
T 7
T 8
G 6
A 5
C 5
A 5
* 15-21 # intervening unimportant bases
# Pribnow box
T 8
A 8
T 6
A 6
AT 5
T 8


open(IN,'<',$signaldescription ) or die "Could not find file\n";
my @character = ();
my @penalty = ();
my $comment ='';
while (defined (my $line = <IN>)) {
    chomp ($line);
    if ($line =~ m/^#/) {
        if ($comment ne ''){
            my ($character, $penalty) = split (' ',$line);
            push @character, $character;
            push @penalty, $penalty;
        }
    }
}


close IN;


Here I have tried to save the characters in one array and the penalties (the numbers) in another array, but I can't figure out how to write my Perl script so that it skips a certain stretch of positions when it meets the character *...

Thanks.
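
For the deviation 0 case, one way to skip the positions covered by * is to let the regular expression do it: each position line becomes a character class and the * line becomes a quantified wildcard such as .{15,21}. The sketch below assumes the signal has already been parsed into a list of hashes ({allowed} for a position, {min, max} for an interval); that layout, the function names and the short sample signal are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build one regex from parsed signal elements.
# Only valid for deviation 0, because a regex cannot count penalties.
sub signal_to_regex {
    my (@elements) = @_;
    my $pattern = '';
    for my $el (@elements) {
        if (exists $el->{min}) {
            # "* 15-21" becomes ".{15,21}": any characters, 15 to 21 of them
            $pattern .= sprintf '.{%d,%d}', $el->{min}, $el->{max};
        }
        else {
            # "AT 5" becomes "[AT]": one of the allowed letters
            $pattern .= '[' . quotemeta($el->{allowed}) . ']';
        }
    }
    return qr/$pattern/;
}

# Find matches at every start position, including overlapping ones, by
# wrapping the regex in a lookahead. Returns [1-based start, matched text].
sub find_exact {
    my ($seq, $re) = @_;
    my @hits;
    while ($seq =~ /(?=($re))/g) {
        push @hits, [ $-[0] + 1, $1 ];
    }
    return @hits;
}

# Made-up short signal: T, A, then 1-2 unimportant bases, then G.
my $re = signal_to_regex(
    { allowed => 'T' }, { allowed => 'A' }, { min => 1, max => 2 }, { allowed => 'G' },
);

print "match at $_->[0]: $_->[1]\n" for find_exact('TACCGTAGG', $re);
# prints:
# match at 1: TACCG
# match at 6: TAGG
```

The lookahead reports at most one match per start position (the longest interval that works, since the quantifier is greedy). For a deviation above 0 the penalties have to be added up position by position, so a plain regex is no longer enough.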

KevinADC

  #9  
May 3rd, 2007
If I have the time I might try to help. Your question, or questions, go beyond asking for general help and will take some time to assist you with.

KevinADC

  #10  
May 3rd, 2007
What is $comment being used for?

my $comment ='';

Further on you have:

if ($comment ne ''){

But $comment is blank (''), so that condition is always false and the statements that follow are never executed.
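
A minimal sketch of how the loop could look without $comment, assuming the intent is simply to skip the # lines and keep the two arrays. Note that the original outer test (m/^#/) selects the comment lines rather than skipping them, so it is inverted here. The input lines are a made-up sample held in an array instead of being read from the file.

```perl
use strict;
use warnings;

my @lines = ('# Shine-Delgarno', 'T 7', 'T 8', 'AT 5');   # made-up sample input
my (@character, @penalty);

for my $line (@lines) {
    next if $line =~ m/^#/;      # skip comment lines instead of selecting them
    my ($character, $penalty) = split ' ', $line;
    push @character, $character;
    push @penalty,   $penalty;
}

print "@character\n";   # prints: T T AT
print "@penalty\n";     # prints: 7 8 5
```

A line such as "* 15-21" would still need its own branch, because its second field is an interval rather than a penalty.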