Compare two text files & output in second file in Perl scripting

Question

debayanenator 0 Newbie Poster

9 Years Ago

Hello, can you please help me with the following scenario in Perl scripting? I want to compare two text files and save the output of this comparison in a third file with the flag PP. So basically, any word common to both text files should be printed with the flag |PP in the second text file.

File1.txt -
abc
efg
xyz

File2.txt
abc
efh
pqr

Expected output is - File2.txt
abc |PP
efg
xyz
efh
pqr


2teez 43 Posting Whiz

9 Years Ago

Hi debayanenator,

It would have helped to see the errors you are getting.
However, the approach you took in solving the problem will only give you hash entries that have one as the value.

What you could do is use a hash to hold the data from file 1, then open and read through file 2 and, while reading, check whether each line already exists as a key in the hash built from file 1. If it does, update the key of the hash; if not, make a new entry in the hash.

Lastly, sort your hash and print out your keys. You should then have your solution.

Below is an example using the values the OP gave. Please note that I use a module called Inline::Files instead of opening files. This code works, but the OP will have to use the open function and print the output as desired.

use warnings;
use strict;
use Inline::Files;
use Data::Dumper;

my $result = {};

# read the first file
while (<FILE1>) {
    chomp;
    next if /^$/;    # skip empty lines
    $result->{$_}++;
}

# read the second file
while (<FILE2>) {
    chomp;
    next if /^$/;

    # if the line matches a hash key from file 1
    if ($result->{$_}) {

        # delete the old entry
        delete $result->{$_};

        # then join the line with the flag PP
        # and create a new entry
        $result->{join " |" => $_, "PP"}++;
    }
    else {
        $result->{$_}++;
    }
}

# sort the hash keys from the two files
# and output the result
{
    $Data::Dumper::Sortkeys = 1;
    print Dumper $result;
}


__FILE1__
abc
efg
xyz

__FILE2__
abc
efh
pqr

**Output:**

$VAR1 = {
          'abc |PP' => 1,
          'efg' => 1,
          'efh' => 1,
          'pqr' => 1,
          'xyz' => 1
        };
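
Note that the sorted dump above lists the lines in a different order from the expected output in the question (abc |PP, efg, xyz, efh, pqr). If that order matters, one possible approach is sketched below. It is only an illustration: the sub name merge_in_order is hypothetical, and the sample lines are held in arrays rather than read from files.

```perl
use strict;
use warnings;

# Flag every line of the first list that also occurs in the second list,
# then append the lines that occur only in the second list.
# The original order of both lists is preserved.
sub merge_in_order {
    my ($first, $second) = @_;
    my %in_first  = map { $_ => 1 } @{$first};
    my %in_second = map { $_ => 1 } @{$second};

    my @out = map { $in_second{$_} ? "$_ |PP" : $_ } @{$first};
    push @out, grep { !$in_first{$_} } @{$second};
    return @out;
}

my @file1 = qw(abc efg xyz);    # sample lines from File1.txt
my @file2 = qw(abc efh pqr);    # sample lines from File2.txt

print "$_\n" for merge_in_order(\@file1, \@file2);
```

With the sample data this prints abc |PP, efg, xyz, efh and pqr, one per line.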

2teez 43 Posting Whiz

9 Years Ago

Hi,
To use the script as you have it, you have to install the module from CPAN. But really, you don't have to.
Just use the open function and read your files using a while loop, instead of using Inline::Files.
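
As a rough illustration of that suggestion, here is a minimal sketch that replaces Inline::Files with the built-in open function. The sub names read_lines and mark_common are hypothetical, and the script assumes the two file names are passed on the command line. It also keeps a separate lookup hash for file 1, a small change from the example above so that a line repeated in file 2 is still flagged.

```perl
use strict;
use warnings;

# Read a file and return its non-empty lines without trailing newlines.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';    # skip empty lines
        push @lines, $line;
    }
    close $fh;
    return @lines;
}

# Return the sorted lines of both files, with lines common to both
# flagged with " |PP".
sub mark_common {
    my ($file1, $file2) = @_;
    my %in_file1 = map { $_ => 1 } read_lines($file1);
    my %result   = %in_file1;

    for my $line (read_lines($file2)) {
        if ($in_file1{$line}) {
            delete $result{$line};         # drop the unflagged entry
            $result{"$line |PP"} = 1;      # add the flagged entry
        }
        else {
            $result{$line} = 1;
        }
    }
    return sort keys %result;
}

# Run only when two file names are given on the command line.
if (@ARGV == 2) {
    print "$_\n" for mark_common(@ARGV);
}
```

Assuming the script is saved as compare.pl (a made-up name), it could be run as: perl compare.pl File1.txt File2.txt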

Reply to this topic

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.

debayanenator 0 Newbie Poster · Answer 1 · 2016-06-07T06:32:28+00:00

I have tried using the following code and ended up getting a bunch of errors. Please advise.

use strict;
use warnings;
use autodie;

my $f1 = shift || "File 1.txt";
my $f2 = shift || "File 2.txt";
my %results;
open my $file1, '<', $f1;
while (my $line = <$file1>) { $results{$line} = 1 }
open my $file2, '<', $f2;
while (my $line = <$file2>) { $results{$line}++ }
foreach my $line (sort { $results{$b} <=> $results{$a} } keys %results) {
    print "$results{$line}: \t Match", $line if $results{$line} > 1;
}

debayanenator 0 Newbie Poster · Answer 2 · 2016-06-10T08:26:11+00:00

Hi 2teez ,

Thank you for your prompt reply. I really appreciate it.
I am getting an error while executing the code, and it says

"Can't locate Inline/Files.pm in @INC (@INC contains : C:/Perl/site/Lib)
compilation aborted".

Please advise.
Do I need to install the module separately?

debayanenator 0 Newbie Poster · Answer 3 · 2016-06-12T07:37:12+00:00

That worked!!! Thanks a lot. I just made a few modifications to get the desired output, and I am using the open function instead of Inline::Files.
However, I have one question. I really don't want my output to show the number of occurrences of each word. Is there a way to get rid of it?
The output that the program gives:

$VAR1 = {
          'abc |PP' => 1,
          'efg' => 1,
          'efh' => 1,
          'pqr' => 1,
          'xyz' => 1
        };

Desired Output :

$VAR1 = {
          'abc |PP'
          'efg' 
          'efh' 
          'pqr' 
          'xyz'
        };

I just don't need the ' => number of occurrences' part. Any ideas?

2teez 43 Posting Whiz · Answer 4 · 2016-06-14T18:29:32+00:00

Hi debayanenator,

Sorry, this is coming a bit late.

You can use a for loop over the keys of the hash, which contain the data you want, like so:

print $_, $/ for (sort{$a cmp $b} keys %{$result});

It should give you the result you wanted.
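
For reference, a self-contained sketch of the same idea follows. The hash contents are copied from the output shown earlier in the thread, the sub name print_keys is hypothetical, and the file name output.txt in the comments is only an assumption.

```perl
use strict;
use warnings;

# Sample data shaped like the $result hash reference from the earlier example.
my $result = {
    'abc |PP' => 1,
    'efg'     => 1,
    'efh'     => 1,
    'pqr'     => 1,
    'xyz'     => 1,
};

# Print only the sorted keys, one per line, to the given filehandle.
sub print_keys {
    my ($fh, $hashref) = @_;
    print {$fh} "$_\n" for sort keys %{$hashref};
    return;
}

print_keys(\*STDOUT, $result);

# To save the lines to a file instead (the name output.txt is an assumption):
# open my $out, '>', 'output.txt' or die "Cannot open output.txt: $!";
# print_keys($out, $result);
# close $out;
```

Passing the filehandle as an argument means the same sub can write either to the screen or to a file.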

debayanenator 0 Newbie Poster · Answer 5 · 2016-06-18T08:09:52+00:00


Thanks a ton :) You saved my day!!
