Hi,

I have two files:

File1 (tab-delimited and two columns):
Ex_efxb 0.0023
MSeef 2.3000
F_ecjc 0.3338
MWEEI -0.111
DDAIij 17.777

File2:
MSeef 2.3000
F_ecjc 0.3338

I want to search the contents of File1 using the contents of File2 and then display the output as follows:

Date of search:
The following matches were found in File 1:

MSeef 2.3000
F_ecjc 0.3338

Here is my Perl script, which did not work for the above task:

#!C:\bin\perl.exe 
 
 
my $REPORT_FILE  = 'outFile.txt'; 
 
$F1 = 'File1.txt'; 
open(RF,"<$F1") || die "can't open $F1 $!"; 
 
 
$F2 = 'File2.txt'; 
open(RXNs,"<$F2") || die "can't open $F2 $!"; 
close F1; 
close F2; 
 
 
my $line = <RF>; 
@f1 = split /\t/, $line; 
 
my $line = <RXN>; 
@f2 = $line; 
 
  
 
open(DATA,"+>OutFile.txt") or die "Can't open data"; 
 
foreach $a (@f1){ 
        $flag = 0; 
	foreach $b (@f2){ 
			if ($a eq $b){ 
			    print DATA $line{1}."\t".$line{2}."\n" ; 
			    $flag = 1; 
			    last; 
		} 
		if ($flag==0){ 
			print DATA ""; 
		} 
	} 
	 
} 
close DATA;

Please help make this script work!

What do you mean, "did not work"?

The output file was empty.

Lines 11 & 19 - The file handles are different (RXNs is opened, but RXN is read).

I'm not sure what lines 12 & 13 are supposed to achieve.

Line 16 is just reading the first line from the file, as is line 20. If you want to read the complete file into an array then you need to use @array=<FH>; and don't forget to do chomp(@array) if you need to remove the trailing line feeds.

If you do these you should start the compare loop with both arrays containing the file contents, 1 line to each array element.

I'll leave it to you to check whether the comparison works. Let us know how you get on.
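
The advice about reading a complete file into an array could look something like the sketch below. It is only an illustration, not the poster's script: the helper name read_lines is made up here, and it assumes plain text files with one record per line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into an array, one line per element.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!";
    my @lines = <$fh>;   # list context reads every remaining line
    close($fh);
    chomp(@lines);       # strip the trailing newline from each element
    return @lines;
}

# Example use, assuming File1.txt and File2.txt are in the current directory:
# my @f1 = read_lines('File1.txt');
# my @f2 = read_lines('File2.txt');
```

With both arrays filled this way, the compare loop starts with one line of each file per array element.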

This is the latest script, with 2 errors:

#!C:\bin\perl.exe
use warnings; 
use strict;

my $REPORT_FILE  = 'outFile.txt';  
  
my $F1 = 'File1.txt';  
open(RF,"<$F1") || die "can't open $F1 $!";  
  

my $F2 = 'File2.txt';  
open(RXNs,"<$F2") || die "can't open $F2 $!";  


my %line;   
my %var1;
my %var2;

while (my $line = <RF>){
			$line = split('\t');
			$var1{$1} = {2};
}
close(RF);

while (my $line = <RXNs>){
			$line = split('\n');
			$var1{$2}={1};
}
close(RXNs);
			
  

open(DATA,"+>OutFile.txt") or die "Can't open data";  
  
 

if (exists $var1{$var2}){  
			    
			    print DATA $var1{1}."\t".$var2{2}."\n" ;  
			    
		}  
		else {
		
			print "$var2 not found in the file\n";  
}  
	
close DATA;

Errors:
Global symbol "$var2" requires explicit package name at testTwoFiles.pl line 37.

Global symbol "$var2" requires explicit package name at testTwoFiles.pl line 44.

testTwoFiles.pl had compilation errors.

First error
You've declared %var2 but not $var2.

Second error
$var2 is undeclared but is being interpolated into the string. If you want to use
the variable then declare it; otherwise use single quotes instead of double quotes.
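
A small sketch of why both errors appear: in Perl a hash and a scalar with the same name are separate variables, so declaring %var2 does not declare $var2. The values below are made up for illustration.

```perl
use strict;
use warnings;

my %var2 = (MSeef => 1);   # a hash named var2
my $var2 = 'MSeef';        # a separate scalar that happens to share the name

# $var2{MSeef} is an element of the hash %var2, not the scalar $var2.
print "hash element: $var2{MSeef}\n";
print "scalar: $var2\n";
```

Remove the `my $var2` line and `use strict` rejects the last print at compile time with the same "Global symbol" message.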

Apologies, File2 should be as follows:

File2:

MSeef
F_ecjc

I have amended the script and there is still no output. Sorry about the changes:

#!/usr/bin/perl

use strict;
use warnings;

#open File2.txt
open(Dbase, '<', 'File2.txt') or die "Can't File2.txt";

while (my $line = <Dbase>) {
 
     		my %seen; 
		while (my $line = <Dbase>) {  
   			chomp $line; 
   			$seen{$line} = 1; 
			}
		}
close(Dbase);

#open output file
open(DATA, '>', 'OutFile.txt') or die "Can't open data";

#Open File1.txt
open(RXN, '<', 'File1.txt') or die "can't open File1.txt";

#search File1 for the contents of File2
while (my $line = <RXN>) { 

    		my %line;
    		my %seen;
    		chomp $line; 
		$line{$1} = {2};
    		
    		if ( exists $seen{$line} ) { 
        		print "Found it! $line\n"; 
    } 
}

close(RXN); 
close(DATA);

In lines 11 and 28 the definitions of %seen are in different scopes - they are two distinct variables, and each one is re-initialized on every iteration of its surrounding loop.

When you enter the while loop at line 26 you should have a hash, %seen, containing all the keys that you are looking for.

In this loop you need to

read a line
split it to get the key
check if this key exists in the %seen and if so report it.
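
The three steps above might be sketched like this. The helper name find_matches is hypothetical, and the sketch assumes File2 holds one key per line and File1 is tab-delimited with the key in the first column.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $data_file whose first
# tab-separated field appears as a line in $keys_file.
sub find_matches {
    my ($keys_file, $data_file) = @_;

    # Build the lookup hash once, before the search loop starts.
    my %seen;
    open(my $keys, '<', $keys_file) or die "can't open $keys_file: $!";
    while (my $key = <$keys>) {
        chomp $key;
        $seen{$key} = 1;
    }
    close($keys);

    my @matches;
    open(my $data, '<', $data_file) or die "can't open $data_file: $!";
    while (my $line = <$data>) {
        chomp $line;
        my ($key) = split /\t/, $line;   # first column only
        push @matches, $line if defined $key && exists $seen{$key};
    }
    close($data);
    return @matches;
}

# Example use, assuming the file names from this thread:
# open(my $out, '>', 'OutFile.txt') or die "can't open OutFile.txt: $!";
# print $out "Date of search: ", scalar(localtime), "\n";
# print $out "The following matches were found in File 1:\n\n";
# print $out "$_\n" for find_matches('File2.txt', 'File1.txt');
# close($out);
```

The helper only returns the matching lines, with both columns intact. They still have to be printed to the output handle to end up in the report file, because a plain print goes to the screen.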

This is what I have now - though still no output:

#!/usr/bin/perl

use strict;
use warnings;

my %seen;

open(Dbase, '<', 'File2.txt') or die "Can't File2.txt";

while (my $line = <Dbase>) {
 
     		my %seen; 
		while (my $line = <Dbase>) {  
   			chomp $line; 
   			$seen{$line} = 1; 
			}
		}
close(Dbase);

open(DATA, '>', 'OutFile.txt') or die "Can't open data";


open(RXN, '<', 'File1.txt') or die "can't open File1.txt";


while (my $line = <RXN>) { 

    		my %line;
    		my %seen;
    		chomp $line; 
    		$line = split('\t');
		$line{$1} = {2};
    		
    		if ( exists $seen{$line} ) { 
        		print "Found it! $line\n"; 
    } 
}

close(RXN); 
close(DATA);

You don't need %seen to be declared at line 12 or at line 29.
Why have you got one while loop inside another at lines 10 to 17?
Line 31 sets $line to the number of fields found - not what you want - use split('\t', $line) instead. This will return an array containing all of the fields.
I don't understand what line 32 is supposed to be doing.

You're nearly there.
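
To illustrate the point about split: what it gives back depends on the context of the assignment. This is a small sketch with a made-up input line. Assigning to a plain scalar yields the field count, and assigning to a list yields the fields.

```perl
use strict;
use warnings;

my $line = "MSeef\t2.3000";

# Scalar context: split returns the number of fields, not the fields.
my $count = split /\t/, $line;

# List context: split returns the fields themselves.
my ($name, $value) = split /\t/, $line;

print "count=$count name=$name value=$value\n";
```

So the left-hand side has to be a list (an array, or scalars in parentheses) for the key to end up in a variable.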

I have adjusted the code according to your corrections and the output file is still empty.

With this line - $line{$1} = {2} - I meant to 'equate' the contents of column 1 to their corresponding values in column 2 of File1. I'm only searching for the File2 values in column 1 of File1, but the output must have both columns.

#!/usr/bin/perl

use strict;
use warnings;

my %seen;

open(Dbase, '<', 'File2.txt') or die "Can't File2.txt";

while (my $line = <Dbase>) {
 
     		chomp $line; 
   		$seen{$line} = 1; 
   		}
close(Dbase);

open(DATA, '>', 'OutFile.txt') or die "Can't open data";


open(RXN, '<', 'File1.txt') or die "can't open File1.txt";


while (my $line = <RXN>) { 

    		my %line;
    		chomp $line; 
    		$line = split('\t',$line);
		#$line{$1} = {2};
    		
    		if ( exists $seen{$line} ) { 
        		print "Found it! $line\n"; 
    } 
}

close(RXN); 
close(DATA);

What I hope is the final fix for you.

Change line 27 to

($line, undef) = split('\t',$line);

This splits $line into two fields: the first is assigned back to $line, and the second, which we don't want, is discarded.

Thanks for the line of code.

Somehow it still refuses to output anything.

Latest script:

#!/usr/bin/perl

use strict;
use warnings;

my %seen;

open(Dbase, '<', 'File2.txt') or die "Can't File2.txt";

while (my $line = <Dbase>) {
 
     		chomp $line; 
   		$seen{$line} = 1; 
   		}
close(Dbase);

open(DATA, '>', 'OutFile.txt') or die "Can't open data";


open(RXN, '<', 'File1.txt') or die "can't open File1.txt";


while (my $line = <RXN>) { 

    		my %line;
    		chomp $line; 
    		($line, undef) = split('\t',$line);
    		#$line{$1} = {2};
    		
    		if ( exists $seen{$line} ) { 
        		print "Found it! $line\n"; 
    } 
}

close(RXN); 
close(DATA);

This works fine for me. Are you sure that there are no trailing spaces in File2.txt?

Try running this -

#!/usr/bin/perl
use Data::Dumper;

use strict;
use warnings;

my %seen;

open(Dbase, '<', 'File2.txt') or die "Can't File2.txt";

while (my $line = <Dbase>) {

                chomp $line;
                $seen{$line} = 1;
                }
close(Dbase);

print Dumper(\%seen);

open(DATA, '>', 'OutFile.txt') or die "Can't open data";


open(RXN, '<', 'File1.txt') or die "can't open File1.txt";


while (my $line = <RXN>) {

                my %line;
                chomp $line;
                ($line, undef) = split('\t',$line);
                #$line{$1} = {2};

                if ( exists $seen{$line} ) {
                        print "Found it! $line\n";
    }
}

close(RXN);
close(DATA);

It should give this output -

$VAR1 = {
          'MSeef' => 1,
          'F_ecjc' => 1
        };
Found it! MSeef
Found it! F_ecjc

$VAR1 is the contents of %seen.
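
One way to rule out the trailing-space question is to strip all trailing whitespace from each key instead of relying on chomp alone, which removes only the line ending. This is a sketch with a hypothetical helper name, clean_key, and it assumes no key legitimately ends in whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the line ending and any other trailing whitespace.
sub clean_key {
    my ($text) = @_;
    $text =~ s/\s+\z//;   # strips newlines, carriage returns, spaces and tabs at the end
    return $text;
}

# Example use inside either read loop:
# my $key = clean_key($line);
```

If it is applied to the lines of both files, a stray space or carriage return at the end of a line can no longer stop the keys from comparing equal.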

I got the same output:

$VAR1 = {
          'MSeef' => 1,
          'F_ecjc' => 1
        };

I have also tried deleting the spaces at the end of File1 and File2, but there is still no output. This is strange! It's possibly due to my PC setup. I run the script this way: perl script.pl

What operating system are you using?

Windows XP

Ah, I was misled by your shebang - which Perl are you using? This is with Strawberry Perl on XP:

Volume Serial Number is 8CFD-0E63

 Directory of C:\Temp\perl

29/03/2011  00:09    <DIR>          .
29/03/2011  00:09    <DIR>          ..
26/03/2011  12:10                74 File1.txt
26/03/2011  12:10                29 File2.txt
27/03/2011  23:30                15 File3.txt
29/03/2011  00:09                 0 OutFile.txt
29/03/2011  01:08               658 What_5.pl
               5 File(s)            776 bytes
               2 Dir(s)   2,923,069,440 bytes free

C:\Temp\perl>What_5.pl
$VAR1 = {
          'MSeef' => 1,
          'F_ecjc' => 1
        };
Found it! MSeef
Found it! F_ecjc

C:\Temp\perl>perl What_5.pl
$VAR1 = {
          'MSeef' => 1,
          'F_ecjc' => 1
        };
Found it! MSeef
Found it! F_ecjc

C:\Temp\perl>

Here's the output:

>perl -version

This is perl 5, version 12, subversion 2 (v5.12.2) built for MSWin32-x86-multi-thread
(with 8 registered patches, see perl -V for more detail)

Copyright 1987-2010, Larry Wall

Binary build 1203 [294165] provided by ActiveState http://www.ActiveState.com
Built Dec  9 2010 04:03:28

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl".  If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
