
File Comparison in Perl

 

I have two files called 1.txt and 2.txt.

I want to compare each line of 1.txt with all the lines in 2.txt. If any line of 1.txt is not present in 2.txt, that line needs to be printed on the console. It is ok even if 2.txt contains data which is not available in 1.txt.

 

Hi bharat, try this one

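use 5.012;    # also enables strict
use warnings;

my $file1 = $ARGV[0];
my $file2 = $ARGV[1];
my ( %file1, %file2 );

# Index file1 by line number: line number => line text.
open( my $in, '<', $file1 ) or die "cannot open the input file $file1: $!";
while (<$in>) {
    $file1{$.} = $_;
}
close $in;

# Index file2 by line text for fast lookups: line text => line number.
open( my $in1, '<', $file2 ) or die "cannot open the input file $file2: $!";
while (<$in1>) {
    $file2{$_} = $.;
}
close $in1;

# Print every line of file1 that never appears in file2.
# Note: values() returns the lines in no particular order.
foreach ( values %file1 ) {
    print "missing line  $_" unless ( exists $file2{$_} );
}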
 
foreach (values %file1) {
  print "missing line $_" unless (exists $file2{$_});
}

If there are duplicated lines in file2, I am not sure it would print all the duplicated line numbers. Even so, its functionality may already satisfy what the OP wants.
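
To illustrate the duplicate-line concern: with `$file2{$_} = $.`, a repeated line keeps only its last line number, because each assignment overwrites the previous one. A minimal sketch of one way to keep every line number is to store an array per line. The helper name `index_lines` and the in-memory sample data are assumptions made for the demo, not part of the posted script.

```perl
use strict;
use warnings;

# Hypothetical helper: map each line's text to every line number where it occurs.
sub index_lines {
    my ($fh) = @_;
    my %seen;
    while ( my $line = <$fh> ) {
        chomp $line;
        push @{ $seen{$line} }, $.;    # append instead of overwrite
    }
    return \%seen;
}

# Sample data held in memory instead of a real 2.txt (assumption for the demo).
my $data = "apple\nbanana\napple\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $index = index_lines($fh);
close $fh;

# List each distinct line with all of its line numbers.
print "$_ => @{ $index->{$_} }\n" for sort keys %$index;
```

This only matters if the line numbers in file2 are needed; a plain existence check works the same either way.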

 

Hi Taywin,

If there are duplicated lines in file2, I am not sure it would print all the duplicated line numbers. Even so, its functionality may already satisfy what the OP wants.

Nice observation, and that was my initial thought too, until I reread the OP's last comment: *It is ok even if 2.txt contains data which is not available in 1.txt.*

So, I think yuvanbala has solved the OP's question.

However, if I may offer some suggestions on yuvanbala's solution:

  1. Though the "open" calls are checked, the "close" calls are not. One can use autodie or write out the check explicitly, or put the open in a subroutine, where there is no need to close the filehandle by hand, since a lexically scoped filehandle is closed when it goes out of scope as the subroutine returns.

  2. Why not just write one subroutine that takes care of the open function for the two files instead of repeating oneself? In fact, one should stay DRY (Don't Repeat Yourself). You will be happy you did.

  3. Use different names for your variables. Similar names could confuse you or the people who work on the script later, though not perl.
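
As a rough sketch of points 1 and 2 together: `autodie` makes `open` and `close` die on failure without an explicit `or die`, and a single subroutine handles both files through a lexical filehandle. The helper names `read_lines` and `missing_lines` are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;
use autodie;    # open and close now die with a message on failure

# Hypothetical helper: return the lines of a file with newlines removed.
# $fh is lexical, so it is closed automatically when the sub returns.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path;
    chomp( my @lines = <$fh> );
    return @lines;
}

# Hypothetical helper: lines of the first file that the second file lacks,
# in the order they appear in the first file.
sub missing_lines {
    my ( $path1, $path2 ) = @_;
    my %in_second = map { $_ => 1 } read_lines($path2);
    return grep { !exists $in_second{$_} } read_lines($path1);
}

# Run as: perl script.pl 1.txt 2.txt
if ( @ARGV == 2 ) {
    print "Missing line: $_\n" for missing_lines(@ARGV);
}
```

Because the lines are chomped before the comparison, a final line without a trailing newline still matches its counterpart in the other file.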

If I may modify yuvanbala's solution:

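use 5.012;    # also enables strict
use warnings;

my ( $file1, $file2 ) = @ARGV;
my ( %file1_container, %file2_container );

# Dispatch table: file name => what to do with each line read from that file.
my %compare_file = (
    $file1 => sub { $file1_container{$.} = $_[0] },      # line number => text
    $file2 => sub { $file2_container{ $_[0] } = $. },    # text => line number
);

open_and_read( \%compare_file );

# Walk file1 in line-number order and report lines absent from file2.
foreach my $line_no ( sort { $a <=> $b } keys %file1_container ) {
    my $line = $file1_container{$line_no};
    print "Missing line: ", $line, $/ unless exists $file2_container{$line};
}

sub open_and_read {
    my ($file_n_op) = @_;

    for my $filename ( keys %{$file_n_op} ) {
        open my $fh, '<', $filename or die "can't open $filename: $!";
        while (<$fh>) {
            chomp;
            $file_n_op->{$filename}->($_);
        }
        # $fh is lexical, so it is closed at the end of each iteration.
    }
}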