Remove specific whitespace from a text & compare 4 files and output result

Question

erezz 0 Newbie Poster

11 Years Ago

Hi,
I have 4 different files that contain the text below.
I need to remove specific whitespace from the text, find any number that appears in more than one file, and print the result to a new file that includes lines with the following output:
1- The number
2- The letters that appear next to the number
3- The name of the file that the output was taken from.

Example:
The text looks like this in the files:
file 1 named PT1.txt:
IBN         LG   25 1 03 08       077 437 1234      CPB         GWLPOT
IBN         LG   25 1 03 08       077 437 1111      IDL           GWLPOT
file 2 named PT2.txt:
IBN         LG   25 1 03 08       077 437 1113      PLO         GWLPOT
IBN         LG   25 1 03 08       077 437 2738      SB           GWLPOT
IBN         LG   25 1 03 08       077 411 2238      MB           GWLPOT
file 3 named YK1.txt:
IBN         LG   25 1 03 08       077 437 1113      SB           GWLPOT
IBN         LG   25 1 03 08       077 411 2738      SB           GWLPOT
IBN         LG   25 1 03 08       077 411 2338      MB           GWLPOT

file 4 named YK2.txt:
IBN         LG   25 1 03 08       077 437 1113      PLO         GWLPOT
IBN         LG   25 1 03 08       077 437 2738      SB           GWLPOT
IBN         LG   25 1 03 08       077 437 2738      MB           GWLPOT

The output that I need to get is:
0774371113      SB | YK1.txt
0774371113      PLO | YK2.txt
I wrote a script for one file that deletes some info from the lines, and the output looks like this:
077 437 2738      CPB
077 437 2738      CPB

The script looks like this:
#!/usr/bin/perl   
$file = "PT1.txt";   
open (IN, $file) || die "Cannot open file ".$file." for read";        
@lines=<IN>;     
open (OUT, ">", $file) || die "Cannot open file ".$file." for write"; 
foreach $line (@lines)   
{

$line =~ s/\s\d{2}\s | \s\d{2}\s{6}//ig;
$line =~ s/\s\d{2}\s | \s\d{2}\s{6}//ig;
$line =~ s/\s\d{1}\s | \s\d{2}\s{2}//ig;
$line =~ s/\s\d{2}\s//ig;
$line =~ s/IBN|LG|GWLPOT//ig;
$line =~ s/\s{11}//ig;


   print OUT $line;     
}     
close OUT; 

Please advise me on this issue.
Thanks in advance.

perl


All 3 Replies

2teez 43 Posting Whiz

11 Years Ago

I believe this does everything you want (please also take it as a guide in the right direction).
Note that the final output goes to a new file called new_file.txt, or any other name you give it.

#!/usr/bin/perl
use warnings;
use strict;

my @files = qw(PT1.txt PT2.txt YK1.txt YK2.txt);
my %has_data;
my %sorter;

foreach my $file (@files) {
    open my $fh, '<', $file or die "can't open $file:$!";
    while (<$fh>) {
        chomp;
        my @rec = split;
        my $num = join "", @rec[ 6 .. 8 ];    # 077 437 1113 -> 0774371113
        ++$sorter{$num};
        push @{ $has_data{$num} }, "$rec[9] | $file";
    }
    close $fh or die "can't close file:$!";
}

foreach ( keys %sorter ) {
    delete $has_data{$_} if $sorter{$_} == 1;
}

open my $fh, '>', 'new_file.txt' or die "can't open file:$!"; # Output 
foreach my $number ( keys %has_data ) {
    foreach ( @{ $has_data{$number} } ) {
        print $fh $number, "\t", $_, $/;
    }
}
close $fh or die "can't close file:$!";

Just run this script from your CLI. Note that all your files, namely PT1.txt, PT2.txt, YK1.txt and YK2.txt, must be in the same directory as your script (though this is not cast in stone; you can modify it as you wish).
Hope this helps!

MY OUTPUT (new_file.txt)

0774371113  PLO | PT2.txt
0774371113  SB | YK1.txt
0774371113  PLO | YK2.txt
0774372738  SB | PT2.txt
0774372738  SB | YK2.txt
0774372738  MB | YK2.txt
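
One thing to watch in the script above: %sorter counts lines, not files. A number that occurs twice inside a single file would therefore be kept even if no other file contained it. With the sample data this makes no difference, because 0774372738 occurs in PT2.txt as well as twice in YK2.txt. Below is a minimal sketch of counting distinct files instead. The sub name numbers_in_multiple_files and the in-memory sample data are assumptions made for illustration and are not part of the original script; real use would read the lines from the four files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes a hash ref of file name => array ref of lines
# and returns "NUMBER\tLETTERS | FILE" strings for every number that
# occurs in at least two different files.
sub numbers_in_multiple_files {
    my ($lines_by_file) = @_;
    my ( %files_seen, %entries );

    foreach my $file ( sort keys %$lines_by_file ) {
        foreach my $line ( @{ $lines_by_file->{$file} } ) {
            my @rec = split ' ', $line;
            next unless @rec >= 10;              # skip blank or short lines
            my $num = join '', @rec[ 6 .. 8 ];
            $files_seen{$num}{$file} = 1;        # record the file, not the line
            push @{ $entries{$num} }, "$num\t$rec[9] | $file";
        }
    }

    my @out;
    foreach my $num ( sort keys %entries ) {
        # keep the number only if it was seen in more than one file
        next unless scalar( keys %{ $files_seen{$num} } ) > 1;
        push @out, @{ $entries{$num} };
    }
    return @out;
}

# Sample data: 0774372738 appears twice, but only in YK2.txt
my %sample = (
    'YK1.txt' => ['IBN LG 25 1 03 08 077 437 1113 SB GWLPOT'],
    'YK2.txt' => [
        'IBN LG 25 1 03 08 077 437 1113 PLO GWLPOT',
        'IBN LG 25 1 03 08 077 437 2738 SB GWLPOT',
        'IBN LG 25 1 03 08 077 437 2738 MB GWLPOT',
    ],
);

print "$_\n" for numbers_in_multiple_files( \%sample );
```

With this sample, only the two 0774371113 rows are printed, since 0774372738 is confined to one file. Sorting the keys also makes the output order repeatable, which a plain keys loop does not guarantee.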

Edited 11 Years Ago by 2teez


d5e5 109 Master Poster · Answer 1 · 2012-05-07T15:41:55+00:00

To answer the first part of your question about removing spaces from the number and reading from four input files:

#!/usr/bin/perl
use strict;
use warnings;

@ARGV = qw(PT1.txt PT2.txt YK1.txt YK2.txt);

while (my $line = <>){
    my @fields = split /\s+/, $line;
    print @fields[6..8], "\t", $fields[9], ' | ', $ARGV, "\n";
}
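
To make the field indexes concrete, here is a small sketch that applies the same kind of split to one sample line from PT1.txt. The sub name parse_line is an assumption for illustration only. It uses split ' ' rather than split /\s+/, so a line that happened to start with whitespace would not produce an empty first field and shift the indexes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns (number, letters) for one input line,
# or an empty list if the line has fewer than 10 fields.
sub parse_line {
    my ($line) = @_;
    my @fields = split ' ', $line;    # awk-style split: ignores leading whitespace
    return unless @fields >= 10;
    # Fields 0-5 are IBN LG 25 1 03 08; 6-8 are the number parts; 9 is the letters
    return ( join( '', @fields[ 6 .. 8 ] ), $fields[9] );
}

my $sample = "IBN         LG   25 1 03 08       077 437 1234      CPB         GWLPOT\n";
my ( $number, $letters ) = parse_line($sample);
print "$number\t$letters\n";    # number 0774371234, then a tab, then CPB
```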

erezz 0 Newbie Poster · Answer 2 · 2012-05-09T02:44:05+00:00

Thank you both very much for your help and time, and for your quick and efficient answers.
