Hi all,
I am new to Perl and I want to compare two files.
File 1 contains protein names like:

Biotin-[acetyl-CoA-carboxylase] ligase
Acetyl-/propionyl-coenzyme A carboxylase alpha chain
Acetyl-CoA carboxylase, carboxyl transferase, beta subunit
biotin carboxylase-like protein
mycolic acid synthase
long-chain-fatty-acid-CoA ligase
3-oxoacyl-[acyl-carrier-protein] synthase I
3-oxoacyl-[acyl-carrier-protein] synthase II
3-oxoacyl-[acyl-carrier-protein] synthase III

File 2 contains lines like:

Location Strand Length PID Gene Synonym Code COG Product
33..1529 + 498 118466558 dnaA MAV_0001 - COG0593L chromosomal replication initiation protein
2150..3298 + 382 118465865 dnaN MAV_0002 - COG0592L DNA polymerase III subunit beta
3299..4456 + 385 118465169 recF MAV_0003 - COG1195L recombination protein F
4453..4998 + 181 118464189 - MAV_0004 - COG5512R hypothetical protein
5257..7290 + 677 118463778 gyrB MAV_0005 - COG0187L DNA gyrase subunit B
7302..9821 + 839 118462773 gyrA MAV_0006 - COG0188L DNA gyrase subunit A

I want to compare the two files, that is, check whether each name in file 1 appears in file 2 or not.


This program prints out whether or not each line in the file of protein names occurs in the data file.

#!/usr/bin/perl -w
use strict;
my ($names, $data) = ("ProteinNames.txt", "ProteinData.txt");
open (FILE1, '<', $names) || die "Cannot open $names: $!"; #3-argument open, read-only
open (FILE2, '<', $data) || die "Cannot open $data: $!";
undef $/; #Enter "file-slurp mode" by emptying variable indicating end-of-record
my $string = <FILE2>; #Read entire file to be searched into a string variable
$/ = "\n"; #Restore default value to end-of-record variable

while (<FILE1>) {
    chomp; #remove new-line character from end of $_
    #Use quotemeta() to fix characters that could spoil syntax in search pattern
    my $qmname = quotemeta($_);
    if ($string =~ m/$qmname/i) { #case-insensitive substring search
        print "***FOUND*** $_ in $data\n";
    } else {
        print "***NOTFOUND*** $_ in $data\n";
    }
}
close(FILE1);
close(FILE2);
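
One thing to keep in mind: the program above searches the whole data file as one string, so a name also counts as found when it is only part of a longer product name. If you would rather have exact (case-insensitive) matches against the Product column only, something like the following might work. It is only a rough sketch: the sub name match_products is made up for illustration, and it assumes the first eight columns of file 2 never contain spaces, so that everything after them is the product name. The sample lists at the bottom stand in for the contents of the two files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping each name to 1 (found) or 0.
# Assumption: the first eight columns of a data line contain no spaces,
# so the ninth field is the whole Product name.
sub match_products {
    my ($names, $data_lines) = @_;
    my %product;
    for my $line (@$data_lines) {
        chomp(my $copy = $line);
        my @field = split ' ', $copy, 9;   # limit 9: last field keeps its spaces
        next unless @field == 9;           # skip blank or short lines
        next if $field[0] eq 'Location';   # skip the header row
        (my $name = lc $field[8]) =~ s/\s+\z//;   # lowercase, trim trailing space
        $product{$name} = 1;
    }
    my %found;
    for my $name (@$names) {
        (my $key = lc $name) =~ s/^\s+|\s+$//g;   # lowercase, trim both ends
        $found{$name} = exists $product{$key} ? 1 : 0;
    }
    return \%found;
}

# Small sample for illustration; the data rows are taken from the post.
my @names = ('DNA gyrase subunit B', 'mycolic acid synthase');
my @data  = (
    "Location Strand Length PID Gene Synonym Code COG Product\n",
    "5257..7290 + 677 118463778 gyrB MAV_0005 - COG0187L DNA gyrase subunit B\n",
);

my $found = match_products(\@names, \@data);
for my $name (@names) {
    printf "%s %s\n", ($found->{$name} ? '***FOUND***' : '***NOTFOUND***'), $name;
}
```

To use it on real files, read the lines of each file into @names (chomped) and @data instead of the hard-coded lists. The hash lookup also avoids rescanning the whole data file once per name, which may matter if the files are large.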