Hi,

I need help getting a Perl program to work. The program accepts an input (a reaction name), searches for it in a file, and then displays the reaction found on the same line.

Input file - file.txt (A large file with no header and in the following format):
A1_HTTT24 : GLUC_ext = GLUC .
C2_GLH3 : GLUC + ATP = GLUC6P + ADP .
B3_PGAI1 : GLUC6P = FRUC6P .

Search example: if I search for A1_HTTT24, I want to get the output: GLUC_ext = GLUC


Here's what I have done so far:

$database = 'reaction_database.txt';
  open(Dbase,"<$database") or die "can't open $database $!";

  while (my $line = <Dbase>){     
  chomp $line;
  my ($key, $value) = split(/\s+:\s+/, $line, 2);
  chomp $key;
  chomp $value;
  $key =~ /([^\s]+)/;
  $key = $1; $value =~ /(^\s+)(.+)(\s\.)/;
  $value = $2;
  } 
  close(Dbase);

  open (DATA,"+>data.txt") or die "Can't open data $!";
 # do { 

  print "reaction name for searching: \n";
  $input = <STDIN>;
  chomp $input; 
  while (defined<$line>){ 
       foreach $k (@value){ 
                if ($key eq $input){ 
                print DATA "$key,$value{$key}\n"; 
                   }else{ 
                      print "$input not found!\n";     
                 } 
      }  
   
  }
  close(DATA); 

 # }until ($input =~ /^\s*$/);
  exit;

Thanks,

Jamie

Did you mean to save the key and value pairs you read in so you can search them when you open the second file?

Yes.

I don't see where you're storing them.

Did you possibly mean to declare a hash?

# declaration
my %reacts;
# store one
$reacts{$key} = $value;
# test to see if one exists
if (exists $reacts{$testkey}) { ... }

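Declare the hash before the read loop.
Add the store between lines 11 and 12 of your original code (after $value = $2; and before the loop's closing brace).
Use the exists test in place of the while loop on line 21 of your original code.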

Try that and if it doesn't work, post the new code and we'll look at it.

Here's my test code for the read and store portions...you still need to write the retrieval code. If you have any questions about the code, please ask.

my %reacts;
$database = "reaction_database.txt";
open(Dbase,"<$database") or die "can't open $database $!";

# Simplified read loop
# Sample input data
#A1_HTTT24 : GLUC_ext = GLUC .
while (my $line = <Dbase>) {
	# Regex [whitespace]KEY[whitespace]:[whitespace]VALUE[whitespace].
	# optional whitespace, key, optional whitespace, required ':', optional whitespace, value, optional whitespace, required '.'
	$line =~ m/^\s*(\S+)\s*:\s*(.*)\s*\./;
	$reacts{$1} = $2;
}

close(Dbase);

#debug to make sure I read my data file correctly
#    I only had the three lines -- you may want to skip this if you have a lot of lines
while ( my ($kk, $vv) = each(%reacts) ) {
	print "$kk => $vv\n";
}

Hi Murtan,

Thanks for your help. I was able to follow the code snippet and the tester. What I have now (below) is still not working. I would be happy to receive further advice.

use strict; 
use warnings; 
 
my $database = 'file.txt'; 
my (%hash,$input); 
open(Dbase,"<$database") or die "can't open $database $!"; 
while (my $line = <Dbase>){ 
chomp $line;  
$line =~ m/^\s*(\S+)\s*:\s*(.*)\s*\./;
	#$reacts{$1} = $2;
my ($key,$val)=split('\s*:\s*',$line);
#$key =~ /([^\s]+)/;
$key = $1;
#$val =~ /(^\s+)(.+)(\s\.)/;
$val = $2;
$hash{$key}=$val; 
} 
close Dbase; 

open (DATA,"+>data.txt") or die "Can't open data"; 
do { 
	print "Enter reaction name for searching:"; 
	$input=<>; 
	chomp($input); 
	if(exists $hash{$input}) { 
               print DATA "$input,$hash{$input}\n"; 
	} 
	else { 
               print "Given Key isn't found in the file\n"; 
	} 
}until($input=~/^\s*$/); 
close(DATA);

I don't think you're adding what you think you are. I think the regex match inside the split is overwriting $1 and $2 before you save them.

I would tend to declare $key and $val outside the loop on line 5, and then comment out the split on line 11 to make sure $1 and $2 are preserved -- you don't use the results from the split anyway.

I think that will work.

For debug, you might print all of the key values you add to the hash or with a little more work, you could just print the first ten added or some other subset of the whole.
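
The debug idea above could be sketched roughly as follows. The helper name sample_entries and its limit argument are made up for illustration and are not part of the code posted in this thread; the hash contents are the three sample lines from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $limit "key => value" strings from a hash,
# in sorted key order so the debug output is repeatable between runs.
sub sample_entries {
    my ($hash_ref, $limit) = @_;
    my @keys = sort keys %$hash_ref;
    splice(@keys, $limit) if @keys > $limit;   # drop everything past the limit
    return map { "$_ => $hash_ref->{$_}" } @keys;
}

# Sample data taken from the question.
my %hash = (
    'A1_HTTT24' => 'GLUC_ext = GLUC',
    'C2_GLH3'   => 'GLUC + ATP = GLUC6P + ADP',
    'B3_PGAI1'  => 'GLUC6P = FRUC6P',
);

# Print at most the first ten entries instead of the whole hash.
print "$_\n" for sample_entries(\%hash, 10);
```

With a large input file this keeps the debug output short while still showing whether the keys and values look the way you expect.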

I have amended the code as suggested (below), but I received error messages. I was able to run the program, but it didn't give the expected output.

Error message:
Use of uninitialized value in hash element at efmReactions.pl line 18, <Dbase> line 1.
Use of uninitialized value in hash element at efmReactions.pl line 18, <Dbase> line 2.
Use of uninitialized value in hash element at efmReactions.pl line 18, <Dbase> line 3.
Enter reaction name for searching:R1_HXT124
Given Key isn't found in the file
Enter reaction name for searching:

use strict; 
use warnings; 
use Data::Dumper; 
 
my ($key,$val);
my $database = 'file.txt'; 
my (%hash,$input); 
open(Dbase,"<$database") or die "can't open $database $!"; 
while (my $line = <Dbase>){ 
chomp $line;  
#$line =~ m/^\s*(\S+)\s*:\s*(.*)\s*\./;
	#$reacts{$1} = $2;
my ($key,$val)=split('\s*:\s*',$line);
#$key =~ /([^\s]+)/;
$key = $1;
#$val =~ /(^\s+)(.+)(\s\.)/;
$val = $2;
$hash{$key}=$val; 
} 
close Dbase; 
#print Dumper \%hash; 
open (DATA,"+>data.txt") or die "Can't open data"; 
do { 
	print "Enter reaction name for searching:"; 
	$input=<>; 
	chomp($input); 
	if(exists $hash{$input}) { 
    print DATA "$input,$hash{$input}\n"; 
	} 
	else { 
    print "Given Key isn't found in the file\n"; 
	} 
}until($input=~/^\s*$/); 
close(DATA);

Did you try letting the dumper run?
What did you see?

The $1 and $2 in my code are from the regular expression.
They evaluate to the first and second 'groups' in the regex.

$line =~ m/^\s*(\S+)\s*:\s*(.*)\s*\./;
# The above regular expression is decoded as:
#    Starting at the beginning of the line    ^
#    Match zero or more white space           \s*
#    Match one or more non-white space        (\S+)
#         saving it as group 1
#    Match zero or more white space           \s*
#    Match a colon                            :
#    Match zero or more white space           \s*
#    Match zero or more of any character      (.*)
#         saving it as group 2
#    Match zero or more white space           \s*
#    Match a period                           \.
$reacts{$1} = $2;
# This then uses the first saved group as the key and the
# second saved group as the value

You left the split in, but you're using the $1 and $2 from the regex.

Try this (modified from your last post):

use strict; 
use warnings; 
use Data::Dumper; 
 
my ($key,$val);
my $database = 'file.txt'; 
my (%hash,$input); 
open(Dbase,"<$database") or die "can't open $database $!"; 
while (my $line = <Dbase>){ 
chomp $line;  
$line =~ m/^\s*(\S+)\s*:\s*(.*)\s*\./;
	#$reacts{$1} = $2;
#my ($key,$val)=split('\s*:\s*',$line);
#$key =~ /([^\s]+)/;
$key = $1;
#$val =~ /(^\s+)(.+)(\s\.)/;
$val = $2;
$hash{$key}=$val; 
} 
close Dbase; 
#print Dumper \%hash; 
open (DATA,"+>data.txt") or die "Can't open data"; 
do { 
	print "Enter reaction name for searching:"; 
	$input=<>; 
	chomp($input); 
	if(exists $hash{$input}) { 
    print DATA "$input,$hash{$input}\n"; 
	} 
	else { 
    print "Given Key isn't found in the file\n"; 
	} 
}until($input=~/^\s*$/); 
close(DATA);
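
As a side note to the regex walkthrough in this post: $1 and $2 are only meaningful after a successful match, so one option is to test the match before storing anything. The sketch below is not the code from this thread. The helper name parse_reaction is hypothetical, and the pattern is a slightly stricter variant (non-greedy value, anchored at the end of the line) chosen so that the space before the final period is not captured.

```perl
use strict;
use warnings;

# Hypothetical helper: split one "KEY : VALUE ." line into (key, value).
# Returns an empty list when the line does not have that shape.
sub parse_reaction {
    my ($line) = @_;
    if ($line =~ m/^\s*(\S+)\s*:\s*(.*?)\s*\.\s*$/) {
        return ($1, $2);   # only read $1/$2 after a successful match
    }
    return;
}

my %reacts;
for my $line ("A1_HTTT24 : GLUC_ext = GLUC .\n", "not a reaction line\n") {
    my ($key, $val) = parse_reaction($line);
    if (defined $key) {
        $reacts{$key} = $val;
    }
    else {
        warn "skipping unparsable line: $line";
    }
}

# Quotes around the value make stray trailing spaces visible.
print "$_ => '$reacts{$_}'\n" for sort keys %reacts;
```

Checking the match this way avoids the "Use of uninitialized value in hash element" warning when a line does not parse, because nothing is stored for that line.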

Hi,

When I let the dumper run, I got the following:
$VAR1 = {
'C2_GLH3' => 'GLUC + ATP = GLUC6P + ADP ',
'A1_HTTT24' => 'GLUC_ext = GLUC ',
'B3_PGAI1' => 'GLUC6P = FRUC6P '
};
Enter reaction name for searching:

But the output file, data.txt, was still empty.

Thanks

I don't get it...I modified your file only slightly (below) and the run looks like this:

$VAR1 = {
          'C2_GLH3' => 'GLUC + ATP = GLUC6P + ADP',
          'A1_HTTT24' => 'GLUC_ext = GLUC',
          'B3_PGAI1' => 'GLUC6P = FRUC6P'
        };
Enter reaction name for searching:C2_GLH
Can't find 'C2_GLH' in the file
Enter reaction name for searching:C2_GLH3
C2_GLH3,GLUC + ATP = GLUC6P + ADP
Enter reaction name for searching:APPLE
Can't find 'APPLE' in the file
Enter reaction name for searching:B3_PGAI1
B3_PGAI1,GLUC6P = FRUC6P
Enter reaction name for searching:

and the output file contains this:

C2_GLH3,GLUC + ATP = GLUC6P + ADP
B3_PGAI1,GLUC6P = FRUC6P

Source:

use strict; 
use warnings; 
use Data::Dumper; 

my %reacts;
my $database = "reaction_database.txt";
open(Dbase,"<$database") or die "can't open $database $!";

# Simplified read loop
# Sample input data
#A1_HTTT24 : GLUC_ext = GLUC .
while (my $line = <Dbase>) {
	# optional whitespace, KEY, optional whitespace, required ':', optional whitespace, VALUE, required whitespace, required '.'
	$line =~ m/^\s*(\S+)\s*:\s*(.*)\s+\./;
	$reacts{$1} = $2;
}

close(Dbase);

#debug to confirm file contents -- left in so I could 'see' something to type
print Dumper \%reacts;

my $input;
open (DATA,"+>react_out.txt") or die "Can't open data"; 

do { 
	print "Enter reaction name for searching:"; 
	$input=<>; 
	chomp($input); 
	if (not $input=~/^\s*$/) {
		if(exists $reacts{$input}) { 
			# feedback on what was matched
			print "$input,$reacts{$input}\n"; 
			print DATA "$input,$reacts{$input}\n"; 
		} 
		else { 
			# explicit feedback for what was not found
			print "Can't find '$input' in the file\n"; 
		} 
	}
}until($input=~/^\s*$/); 

close(DATA);

I tried Murtan's version (posted yesterday March 8) of the script. Here's how it looked when I ran it in Terminal:

david@david-laptop:~$ cd /home/david/Programming/Perl
david@david-laptop:~/Programming/Perl$ perl SearchFile.pl
Enter reaction name for searching:A1_HTTT24
Enter reaction name for searching:B3_PGAI1
Enter reaction name for searching:
Given Key isn't found in the file
david@david-laptop:~/Programming/Perl$

This created the data.txt file and here are the contents:

A1_HTTT24,GLUC_ext = GLUC 
B3_PGAI1,GLUC6P = FRUC6P

I don't know why the script wouldn't work for you, Perly. Are you sure you entered the keys correctly during the test? It has to be upper case because 'A1_HTTT24' does not equal 'a1_httt24'.
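
If typing the keys in the wrong case turns out to be a recurring problem, the input could be normalized before the lookup. This is only a sketch under the assumption that every key in the file is upper case, as in the sample data; the helper name lookup is made up for illustration.

```perl
use strict;
use warnings;

my %reacts = ('A1_HTTT24' => 'GLUC_ext = GLUC');

# Hypothetical helper: trim surrounding whitespace and upper-case the input
# before the lookup. Assumes all keys in the hash are upper case.
sub lookup {
    my ($hash_ref, $input) = @_;
    $input =~ s/^\s+|\s+$//g;
    my $key = uc $input;
    return exists $hash_ref->{$key} ? $hash_ref->{$key} : undef;
}

for my $try ('A1_HTTT24', 'a1_httt24', ' a1_httt24 ') {
    my $found = lookup(\%reacts, $try);
    print(defined($found) ? "'$try' -> $found\n" : "'$try' not found\n");
}
```

A plain exists $reacts{'a1_httt24'} would be false here, because hash keys are compared exactly, including case.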

Edited 6 Years Ago by d5e5: Murtan posted again while I was writing this. By 'latest' I meant 'as of yesterday'.

Many thanks, Murtan. The latest one is working as I got the correct output on CMD, but it's still a mystery that from my end it's not writing into the file.

Hi d5e5, it finally works for me, except that writing to the file is still a problem. I can't understand why, since for both of you it writes to the file as well as to the command line.
Thanks

Does it work if you remove the '+' from the statement opening the DATA file? Try

open (DATA,">react_out.txt") or die "Can't open data";

instead of

open (DATA,"+>react_out.txt") or die "Can't open data";

As I said, the original worked for me without removing the '+', but you may have a different version of Perl or a different operating system. I've never used +>filename before. It supposedly opens the file for both reading and writing instead of just writing, but your script does not read from the DATA file anyway, so you don't need the '+'. I don't know if this will help... just guessing.

Hi d5e5,

It doesn't work when I remove the '+', but for some reason it works when I comment out the do-until loop. Maybe it's an OS difference as you suggested, or even the Perl version; I use Perl 5.8.3.

Many thanks to you and Murtan for your help and useful suggestions!
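
The thread never establishes why the output file stayed empty on one machine. One possibility, which is purely an assumption and was not confirmed by anyone here, is output buffering: Perl normally holds file output in memory until the buffer fills or the handle is closed, so a script that is interrupted before close() can leave an empty file. The sketch below opens the file write-only with the three-argument form of open, uses a lexical filehandle (which also avoids reusing the built-in name DATA), and turns on autoflush. The helper name open_output is hypothetical, and a temporary file stands in for react_out.txt so that nothing real is overwritten.

```perl
use strict;
use warnings;
use IO::Handle;               # provides the autoflush method on filehandles
use File::Temp qw(tempfile);  # temporary stand-in for react_out.txt

# Hypothetical helper: open $path for writing only and disable buffering,
# so each print reaches the file immediately.
sub open_output {
    my ($path) = @_;
    open(my $out, '>', $path) or die "Can't open $path: $!";
    $out->autoflush(1);
    return $out;
}

my (undef, $path) = tempfile(UNLINK => 1);
my $out = open_output($path);
print {$out} "C2_GLH3,GLUC + ATP = GLUC6P + ADP\n";

# With autoflush on, the line is already in the file before close() runs.
print "bytes written so far: ", -s $path, "\n";
close($out) or die "Can't close $path: $!";
```

If buffering is the cause, the file should now fill up as each reaction is found, even if the script is stopped without entering an empty line.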
