Hi,

I need to print the lines whose value appears more than 3 times in a file, looking only at the second column.

I have written the code below, but it compares complete lines rather than a single column.

#!/usr/bin/perl

use strict;
use warnings;

my $file = 'file.txt';
my %seen = ();
{
    local @ARGV = ($file);
    while (<>) {
        $seen{$_}++;
        next if $seen{$_} > 1;
        print;
    }
}
foreach my $keys ( sort { $seen{$b} <=> $seen{$a} } keys %seen ) {
    if ( $seen{$keys} >= 3 ) {
        print "$keys = $seen{$keys}\n";
    }
}

My file looks like the one below. I am looking for the field containing "PATTERN".

1362227457 TEST PATTERN none USER LOGIN1 1 INVALID USER
1362227691 TEST PATTERN2 none USER1 LOGIN2 1 INVALID USER
1362227784 TEST PATTERN none USER10 LOGIN3 1 INVALID USER
1362228006 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362228101 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362228328 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362229375 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362230359 TEST PATTERN none USER LOGIN2 1 LOGIN FAILED
1362230359 TEST PATTERN none USER LOGIN2 1 LOGIN FAILED
1362230359 TEST PATTERN4 none USER LOGIN2 1 LOGIN FAILED
1362230359 a
1362230359 a
1362230359 a

All 2 Replies

Hi vivek.vivek,

Using a hash in Perl takes care of all the issues in your post. Please check the solution below, and feel free to ask any questions.

#!/usr/bin/perl
use warnings;
use strict;

my %seen;    # third-column value => list of lines carrying it

while (<DATA>) {
    chomp;
    my @data_filter = split;

    # Skip lines that have no third column (e.g. "1362230359 a").
    next unless defined $data_filter[2];
    push @{ $seen{ $data_filter[2] } }, $_;
}

# Print every group whose key occurs more than 3 times.
for my $key ( sort keys %seen ) {
    next unless @{ $seen{$key} } > 3;
    print "$_\n" for @{ $seen{$key} };
}

__DATA__
1362227457 TEST PATTERN none USER LOGIN1 1 INVALID USER
1362227691 TEST PATTERN2 none USER1 LOGIN2 1 INVALID USER
1362227784 TEST PATTERN none USER10 LOGIN3 1 INVALID USER
1362228006 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362228101 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362228328 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362229375 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362230359 TEST PATTERN none USER LOGIN2 1 LOGIN FAILED
1362230359 TEST PATTERN none USER LOGIN2 1 LOGIN FAILED
1362230359 TEST PATTERN4 none USER LOGIN2 1 LOGIN FAILED
1362230359 a 
1362230359 a
1362230359 a

Your output then is:

1362227457 TEST PATTERN none USER LOGIN1 1 INVALID USER
1362227784 TEST PATTERN none USER10 LOGIN3 1 INVALID USER
1362228006 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362228101 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362228328 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362229375 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362230359 TEST PATTERN none USER LOGIN2 1 LOGIN FAILED
1362230359 TEST PATTERN none USER LOGIN2 1 LOGIN FAILED
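
A variation on the above, as a minimal sketch: it keeps the matching lines in their original input order and makes the column index and threshold explicit. The sub name lines_with_repeated_field, its parameters, and the shortened sample lines are assumptions for illustration. The threshold of "more than 3" follows the wording of the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the lines whose
# field at index $col (0-based, whitespace-separated) occurs more
# than $min times, in their original order.
sub lines_with_repeated_field {
    my ( $lines, $col, $min ) = @_;
    my ( %count, @kept );

    # First pass: count each key and remember (key, line) pairs.
    for my $line (@$lines) {
        my @fields = split ' ', $line;
        next unless defined $fields[$col];    # skip short lines
        $count{ $fields[$col] }++;
        push @kept, [ $fields[$col], $line ];
    }

    # Second pass: keep only lines whose key crossed the threshold.
    return map { $_->[1] } grep { $count{ $_->[0] } > $min } @kept;
}

# Shortened sample data, assumed for illustration only.
my @sample = (
    "1 TEST PATTERN none",
    "2 TEST PATTERN2 none",
    "3 TEST PATTERN none",
    "4 TEST PATTERN none",
    "5 TEST PATTERN none",
    "6 a",
);

print "$_\n" for lines_with_repeated_field( \@sample, 2, 3 );
```

With this sample, only the four lines whose third field is exactly PATTERN are printed. To use it on a real file, read the lines into an array with chomp and pass a reference to that array.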

Hello Again,

If you know your PATTERN, you can write your script like so:

#!/usr/bin/perl
use warnings;
use strict;


# Print every line that contains PATTERN as a whole word.
print grep { /\bPATTERN\b/ } <DATA>;

__DATA__
1362227457 TEST PATTERN none USER LOGIN1 1 INVALID USER
1362227691 TEST PATTERN2 none USER1 LOGIN2 1 INVALID USER
1362227784 TEST PATTERN none USER10 LOGIN3 1 INVALID USER
1362228006 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362228101 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362228328 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362229375 TEST PATTERN none USER LOGIN1 1 LOGIN FAILED
1362230359 TEST PATTERN none USER LOGIN2 1 LOGIN FAILED
1362230359 TEST PATTERN none USER LOGIN2 1 LOGIN FAILED
1362230359 TEST PATTERN4 none USER LOGIN2 1 LOGIN FAILED
1362230359 a 
1362230359 a
1362230359 a

Yes, just one line gives the same result for this data, using a regular expression with Perl's built-in grep function. Hope this helps.
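
One caveat: a regex applied to the whole line matches PATTERN in any column. If the match should be limited to the third whitespace-separated field, one possible sketch is to split each line first and compare that field exactly. The sub name lines_with_field and the sample lines are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): keep the lines whose
# field at index $col (0-based) equals $value exactly.
sub lines_with_field {
    my ( $lines, $col, $value ) = @_;
    return grep {
        my @fields = split ' ', $_;
        defined $fields[$col] && $fields[$col] eq $value;
    } @$lines;
}

# Sample data, assumed for illustration only.
my @sample = (
    "1 TEST PATTERN none USER",
    "2 TEST PATTERN2 none USER",
    "3 TEST other none PATTERN",    # PATTERN only in a later column
    "4 a",
);

print "$_\n" for lines_with_field( \@sample, 2, 'PATTERN' );
```

Here only the first sample line is printed, because the third sample line carries PATTERN in a different column and the whole-line regex would have matched it as well.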
