Hi everybody! I think this is a very simple Perl problem, but I still need your help.
Here's my sample input file:


Please note that the number of lines could be more than these two, but the Perl script should skip the first line which starts with '>'.
Now the Perl script should take multiple lines as a single line and check if the line starts with ATG and ends with TAT. If this condition is true, then the output should be "gene". Else "not gene".
But my perl script is not taking the whole file. It is taking one line at a time. Here's my script:


print "Print your file name with location\n";
$dnafile = <STDIN>;
chomp $dnafile;

open (DNA, $dnafile) || die "Cannot open the file : $!";

while ($dna=<DNA>) {
	chomp ($dna);
	### Check Starting not equals to '>' letter
	if ($dna=~/^[^>]/) {
		@dna=split ('', $dna);
		print "$dna";
		if (($dna=~/^ATG/) && ($dna=~/TAT$/)) {
			print "gene";
		}
		else {
			print "Not gene\n";
		}
	}
}

Please let me know how I can improve it.

But my perl script is not taking the whole file.

# Option 1: switch to slurp mode so one read returns the whole file
undef $/;  # input record separator
open (FILEHANDLE, "$input_file") || die "Cannot Open the $input_file : $!";
my $file_content = <FILEHANDLE>;
print $file_content;


# Option 2: read the whole file in one call, using its size in bytes (-s)
open (FILEHANDLE, "$input_file") || die "Cannot Open the $input_file : $!";
read FILEHANDLE, my $file_content, -s FILEHANDLE;
print $file_content;
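
A small side note on the slurp approach: undef $/ changes the input record separator for the rest of the program, so any later line-by-line read would also return the whole file. A common way around that is to localize the change. Here is a minimal sketch; slurp_file is a hypothetical helper name, not something from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): return a file's whole
# content as a single string.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    # 'local $/' makes $/ undefined only inside this do-block, so other
    # reads in the program keep their normal line-by-line behaviour.
    my $content = do { local $/; <$fh> };
    close $fh;
    return $content;
}
```

It could be called as my $file_content = slurp_file($input_file); in place of either snippet above.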

Hi, thanks. So you mean that I have to add this piece of code before the while loop?

Hi Ghosh,

Read the links below and try the updated code.

File Handling
File Contents
Regular Expression
and more

open (FIN, "$input_file") || die "Cannot Open the $input_file : $!";
read FIN, my $file, -s FIN;
close (FIN);

if ($file =~ m{
		^	 # Match beginning of the file
		>	 # Match the '>' character
		[^\n]+\n  # Match the rest of the first (header) line
		ATG.*TAT  # Match 'ATG', then any characters, then 'TAT'
		$	 # Match end of the file
	}xs) {
	print "\nGene";
}
else {
	print "\nNot Gene";
}
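
One detail worth knowing for a pattern like this when the sequence spans several lines: by default '.' does not match a newline, so ATG.*TAT can only cross line breaks if the /s modifier is used. The sketch below shows the difference on a made-up record (the header text and sequence are invented for illustration).

```perl
use strict;
use warnings;

# Made-up sample record: a header line followed by two sequence lines.
my $record = ">sample header\nATGCCC\nGGGTAT\n";

# Without /s, '.' stops at the newline between the two sequence lines.
my $without_s = $record =~ /^>[^\n]+\nATG.*TAT$/  ? 'match' : 'no match';
# With /s, '.' also matches newlines, so the pattern can span lines.
my $with_s    = $record =~ /^>[^\n]+\nATG.*TAT$/s ? 'match' : 'no match';

print "without /s: $without_s\n";   # prints: without /s: no match
print "with /s:    $with_s\n";      # prints: with /s:    match
```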

k_manimuthu's answers should work fine. Here is a slightly different way to do the same thing.

#!/usr/bin/perl
use strict;
use warnings;

my $input_file = 'blast.txt';

open my $fh, '<', $input_file or die "Cannot Open the $input_file : $!";

my $sequence = '';
while (<$fh>){
    chomp;#Remove the newline so the lines join into one sequence
    $sequence .= $_ unless m/^>/;#Skip the line that starts with >
}

print $sequence, "\n";

if ($sequence =~ /^ATG.*TAT$/){
    print "The above sequence starts with ATG and ends with TAT, so it's a gene.";
}
else{
    print "The above sequence is not a gene.";
}
close $fh;

This gives the following output:

The above sequence starts with ATG and ends with TAT, so it's a gene.
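
If you want to try the same idea without creating a file first, Perl can open a string as an in-memory filehandle. The following is only a sketch: classify is a hypothetical helper name and both records are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read a record from any filehandle, skip '>' header
# lines, join the remaining lines, and test for an ATG start and a TAT end.
sub classify {
    my ($fh) = @_;
    my $sequence = '';
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;        # drop the trailing newline (and any \r)
        next if $line =~ /^>/;     # skip the header line
        $sequence .= $line;
    }
    return $sequence =~ /^ATG.*TAT$/ ? 'gene' : 'not gene';
}

# Made-up records, read through in-memory filehandles.
my @records = (">h1\nATGCCC\nGGGTAT\n", ">h2\nCCCATG\nTATGGG\n");
for my $text (@records) {
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    print classify($fh), "\n";     # prints "gene", then "not gene"
    close $fh;
}
```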

Hi, thanks. Would you please let me know the meaning of .= in line 12?

It is the concatenation assignment operator. It means:

$sequence = $sequence . $_;
## Another one example
$first_name = 'Mani';
$last_name  = 'Muthu';
$full_name  = "$first_name". " " . "$last_name";
print $full_name;
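
To connect this back to the sequence script: inside a loop, .= appends each new piece to what has been collected so far. A tiny sketch with made-up chunks standing in for the lines of a file:

```perl
use strict;
use warnings;

# Made-up chunks standing in for the lines of a sequence file.
my @chunks = ('ATG', 'CCC', 'TAT');

my $joined = '';
for my $chunk (@chunks) {
    $joined .= $chunk;    # same as: $joined = $joined . $chunk;
}

print "$joined\n";        # prints ATGCCCTAT
```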

Oh, great! Thanks.
