DaniWeb IT Discussion Community
Jan 19th, 2007
Views: 7,016
A bare-bones code snippet to remove duplicate lines from a file. There are a number of ways to accomplish this task, but this is a fast and dependable method that uses Perl's in-place editor and a simple hash to get the job done.

This probably should not be used for really big files, because the hash keeps every unique line in memory, but files with a few thousand lines or even a few tens of thousands of lines should be OK. The bigger the file, the longer it may take to run.
Last edited : Jan 19th, 2007.
perl Syntax
  #!/usr/bin/perl

  use strict;
  use warnings;

  my $file = '/path/to/file.txt';
  my %seen = ();    # lines already written, keyed by the full line text
  {
      local @ARGV = ($file);    # file for <> to read
      local $^I   = '.bac';     # edit in place; original kept as file.txt.bac
      while (<>) {
          $seen{$_}++;
          next if $seen{$_} > 1;    # skip every repeat of a line
          print;                    # first occurrence goes back into the file
      }
  }
  print "finished processing file.\n";
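For anyone who wants to try the technique without touching a real file, here is a minimal sketch that wraps the same loop in a sub and runs it on a scratch file. The sub name dedupe_in_place, the sample lines and the temporary file name are my own assumptions for illustration; they are not part of the original snippet.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not from the original post): removes duplicate lines
# from $path in place, keeping the first occurrence of each line and saving
# the untouched original as "$path$backup_ext".
sub dedupe_in_place {
    my ($path, $backup_ext) = @_;
    my %seen;
    local @ARGV = ($path);        # file for <> to read
    local $^I   = $backup_ext;    # turn on in-place editing with a backup
    while (<>) {
        print unless $seen{$_}++; # print writes into the rewritten file
    }
    return scalar keys %seen;     # number of distinct lines kept
}

# Demo on a scratch file in a temporary directory (removed on exit).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.txt";
open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} "apple\n", "banana\n", "apple\n", "cherry\n", "banana\n";
close $out or die "Cannot close $file: $!";

my $kept = dedupe_in_place($file, '.bac');
print "kept $kept unique lines; original saved as $file.bac\n";
```

The scratch file should end up holding apple, banana and cherry once each, in their original order, with the five-line original preserved next to it under the .bac extension. Lines are compared exactly, so a last line without a trailing newline counts as different from the same text with one.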
Comments (Newest First)
sultan6928 | Newbie Poster | Jan 21st, 2008
perl -i.bac -ne "next if ++$seen{$_}>1; print;" file.txt

In the name of God, the Most Gracious, the Most Merciful
In the name of God, the Most Gracious, the Most Merciful
In the name of God
the Most Gracious
sultan6928 | Newbie Poster | Jan 21st, 2008
#!/usr/bin/perl

use strict;
use warnings;

my $file = '/path/to/file.txt';
my %seen = ();
{
local @ARGV = ($file);
local $^I = '.bac';
while(<>){
$seen{$_}++;
next if $seen{$_} > 1;
print;
}
}
print "finished processing file.";



In the name of God, the Most Gracious, the Most Merciful
In the name of God
In the name of God, the Most Gracious, the Most Merciful
sultan6928 | Newbie Poster | Jan 21st, 2008
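In the name of God, the Most Gracious, the Most Merciful
In the name of God, the Most Gracious, the Most Merciful
In the name of God, the Most Gracious, the Most Merciful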
MattEvans | Posting Shark | Jan 21st, 2007
Perl is quite bad (or good depending on how you look at it) for crypticness.

I never use those superglobal variables as implicit parameters or targets; it scares me.

But I'd much rather be scared by something powerful at my potential disposal than irritated by the overhead and safety checks involved in doing a lot of conceptually simple things in Java...

I guess they certainly aren't languages for the same purpose. But hey, my college project involves string processing and could definitely make good use of untyped hashes, and it's gotta be done in Java. :mad:
KevinADC | Posting Pro | Jan 21st, 2007
It probably is a bit cryptic. But code is that way if you don't understand the syntax of a particular language. It could be written very cryptically as a one-liner. Something like (unchecked for accuracy):


perl -i.bac -ne "next if ++$seen{$_}>1; print;" file.txt 
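To unpack the one-liner a little: -i.bac sets $^I to '.bac' and -n wraps the code in a while (<>) loop, so it boils down to the same loop as the full script. With the double quotes as posted it suits the Windows command prompt; in a Unix-like shell the double quotes would let the shell expand $seen and $_ before Perl sees them, so single quotes around the code would be needed there. Here is a rough sketch of the expanded form run over two scratch files, which also shows that %seen is shared across all files named on the command line. The file names and the write_file/read_file helpers are hypothetical, made up for this demo.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch files for the demo (hypothetical names, not from the thread).
my $dir = tempdir(CLEANUP => 1);
my ($first, $second) = ("$dir/a.txt", "$dir/b.txt");
write_file($first,  "red\n", "green\n", "red\n");
write_file($second, "green\n", "blue\n");

# Expanded form of: perl -i.bac -ne 'next if ++$seen{$_}>1; print;' a.txt b.txt
my %seen;
{
    local @ARGV = ($first, $second);  # what the command line would supply
    local $^I   = '.bac';             # what -i.bac sets
    while (<>) {                      # what -n wraps around the code
        next if ++$seen{$_} > 1;      # one hash for every file in @ARGV
        print;
    }
}

print "a.txt now holds:\n", read_file($first);
print "b.txt now holds:\n", read_file($second);

# Helper: write the given lines to a new file.
sub write_file {
    my ($path, @lines) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} @lines;
    close $out or die "Cannot close $path: $!";
}

# Helper: return the whole file as one string.
sub read_file {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    local $/;                         # slurp mode
    my $text = <$in>;
    close $in;
    return $text;
}
```

In this run a.txt keeps red and green, while b.txt keeps only blue, because green was already seen in the first file. If each file should be de-duplicated on its own, the hash would have to be emptied between files.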
MattEvans | Posting Shark | Jan 20th, 2007
That's madly tiny and cryptic :cheesy:

I've been working in Java today for a college project; I used Java for years before I used Perl.

To do something similar to that in Java would be a mammoth task. There seems to be 'no such thing' as a useful Java hash, and reading files line by line isn't made easy either.

I certainly prefer the Perl way these days...