I have two lists, list1 and list2, each listing a set of files. For each file in list1, if a corresponding file exists in list2, I want the script to print the columns from both files on one line.

EX: list1............data1
1aw7_AB.fit => 1 2 3
1bjw_AB.fit => 9 4 7
1biq_AB.fit => 8 0 1


list2............data2
1aw7_AB.vcor => 4 5 6
1bjw_AB.vcor => 2 7 7

I want the output to be:


1aw7_AB 1 2 3 1aw7_AB 4 5 6
1bjw_AB 9 4 7 1bjw_AB 2 7 7
1biq_AB 8 0 1 1biq_AB 0 0 0 (or something else indicating no value)

This is my script. I'm still learning Perl, so it's really crude. It mostly works in its own crude way, but I'd like some help with the last part and suggestions to make it more efficient!
Help appreciated.

#!/usr/bin/perl -w

use warnings;

######## READING FILE ONE ##########
$[ = 1;
open(DATA, "< list1") ;
open(MYOUT1, "> out1") ;
while (<DATA>){
  @lines1 = <DATA>;
  foreach $line1 (@lines1){
  open (DATA1, "<$line1") or die "Failed to open $line1\n";
  while (<DATA1>){
   if ($. > 0){
    chomp;
   $ID1 = substr($line1, 1, 7);
   open (FILE, "$line1");
   @FId = split(' ', $_, -1);
   $cor = $FId[1];
   $icor = $FId[2];
   $ocor = $FId[3];

   }
  }
printf MYOUT1 "$ID1 $cor $icor $ocor\n" ;
  }
  }
  close (DATA);
  close (DATA1);
 close (FILE);
 close (MYOUT1);
  
####### READING FILE TWO ##########

open(DATA, "< list2" ) or die "Failed to open list1";
open(MYOUT2, "> out2") or die "Failed to open file2";
while (<DATA>){
  @lines2 = <DATA>;
  foreach $line2 (@lines2){
  open (DATA1, "<$line2") or die "Failed to open $line2\n";
  while (<DATA1>){
   if ($. > 0){
    chomp;
   $ID1 = substr($line2, 1, 7);
   open (FILE, "$line2");
   @FId = split(' ', $_, -1);
   $cor = $FId[1];
   $icor = $FId[2];
   $ocor = $FId[3];
   }
  }
printf MYOUT2 "$ID1 $cor $icor $ocor\n" ;
  }
}

  close (DATA);
  close (DATA1);
 close (FILE);
 close (MYOUT2);

######### PASTING THE TWO FILES TOGETHER IF IDs MATCH ######

open(VCOR, "< out1" ) or die "Failed to open vcor";
open(DAT, "< out2" ) or die "Failed to open data";
open(MYOUT, "> megafitfile") or die "Failed to open ";
while (<VCOR>){
 if ($.> 0){
   chomp;
   @FId = split(' ', $_, -1);
   $a=$FId[1];
   $b=$FId[2] ;
   $c=$FId[3] ;
   }
while (<DAT>){
if ($.> 0){
    chomp;
   @FId = split(' ', $_, -1);
   $d=$FId[1];
   $e=$FId[2] ;
   $f=$FId[3] ;
 
}
}
#### THIS PART DOES NOT WORK ######

if ($a eq $d){
print MYOUT  "$a $b $c $d $e $f\n" ;
}
else {print MYOUT "$a $b $c\n";}

}

close(VCOR);
close(DAT);
close(MYOUT);


#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

my %hash;

open my $fh, '<', '2.txt';
while (<$fh>){
    chomp;
    my @flds = split;
    my $fullname = $flds[0];
    my $name = fileparse($fullname,qw(.fit .vcor)); #Remove the file extension
    $hash{$name} = join ' ', @flds[2..$#flds];
}
close $fh;

open $fh, '<', '1.txt';
while (<$fh>){
    chomp;
    my @flds = split;
    my $fullname = $flds[0];
    my $name = fileparse($fullname,qw(.fit .vcor)); #Remove the file extension
    if (exists $hash{$name}){
        print "$name @flds[2..$#flds] $name $hash{$name}\n";
    }
    else{
        print "$name @flds[2..$#flds] in list1 matches nothing in list2\n";
    }
}

Gives the following output:

1aw7_AB 1 2 3 1aw7_AB 4 5 6
1bjw_AB 9 4 7 1bjw_AB 2 7 7
1biq_AB 8 0 1 in list1 matches nothing in list2
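
If you would rather have a filler value in the unmatched rows, as in the desired output from the original post, something along these lines might do it. This is only a sketch: `merge_lists`, its `$pad` argument and the default filler 'NA' are names I made up for illustration, and I assume every line looks like "name => v1 v2 v3". I left the default as 'NA' rather than 0 because 0 also appears as a real value in the data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

# Hypothetical helper (not part of the script above): takes two array refs of
# "name => v1 v2 v3" lines and returns the merged lines.
# $pad is the filler printed when list2 has no matching ID (assumed default 'NA').
sub merge_lists {
    my ($list1, $list2, $pad) = @_;
    $pad = 'NA' unless defined $pad;

    # Index list2 by ID, exactly as the hash in the script above does.
    my %seen;
    for my $line (@$list2) {
        my @flds = split ' ', $line;
        next unless @flds;                              # skip blank lines
        my $name = fileparse($flds[0], qw(.fit .vcor)); # remove the file extension
        $seen{$name} = join ' ', @flds[2..$#flds];
    }

    my @out;
    for my $line (@$list1) {
        my @flds = split ' ', $line;
        next unless @flds;
        my $name = fileparse($flds[0], qw(.fit .vcor));
        my @vals = @flds[2..$#flds];
        my $other = exists $seen{$name}
            ? $seen{$name}
            : join ' ', ($pad) x @vals;                 # one filler per list1 column
        push @out, "$name @vals $name $other";
    }
    return @out;
}

print "$_\n" for merge_lists(
    ['1aw7_AB.fit => 1 2 3', '1biq_AB.fit => 8 0 1'],
    ['1aw7_AB.vcor => 4 5 6'],
);
```

Passing '0' as the third argument gives the "0 0 0" form from the original post. In a real script the two array refs would be filled by reading list1 and list2.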


Thanks D5E5,

Your code helps me understand Perl better. However, when I run it I get the following error:

fileparse(): need a valid pathname at line 13,

I should have added error checking to test whether the files opened successfully. Here is an improved version. Make sure the files 1.txt and 2.txt (attached to this post) exist in your current working directory so the script can open them.

#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

my %hash;
my $file1 = '1.txt';
my $file2 = '2.txt';

open my $fh, '<', $file2 or die "Can't open $file2: $!";
while (<$fh>){
    chomp;
    my @flds = split;
    my $fullname = $flds[0];
    my $name = fileparse($fullname,qw(.fit .vcor)); #Remove the file extension
    $hash{$name} = join ' ', @flds[2..$#flds];
}
close $fh;

open $fh, '<', $file1 or die "Can't open $file1: $!";
while (<$fh>){
    chomp;
    my @flds = split;
    my $fullname = $flds[0];
    my $name = fileparse($fullname,qw(.fit .vcor)); #Remove the file extension
    if (exists $hash{$name}){
        print "$name @flds[2..$#flds] $name $hash{$name}\n";
    }
    else{
        print "$name @flds[2..$#flds] in list1 matches nothing in list2\n";
    }
}

Attachments

1.txt:
1aw7_AB.fit => 1 2 3
1bjw_AB.fit => 9 4 7
1biq_AB.fit => 8 0 1

2.txt:
1aw7_AB.vcor => 4 5 6
1bjw_AB.vcor => 2 7 7
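
One more thing worth ruling out, offered as a guess rather than a diagnosis. As far as I know, File::Basename raises "fileparse(): need a valid pathname" when the name it receives is undefined or empty, and that is what `$flds[0]` becomes on a blank line (for example, an empty line at the end of your own input file). A guard like the one below would skip such lines. The helper name `id_from_line` is made up for this sketch.

```perl
use strict;
use warnings;
use File::Basename;

# Hypothetical helper: returns the bare ID for one input line,
# or undef when the line has no fields (empty or whitespace only).
sub id_from_line {
    my ($line) = @_;
    my @flds = split ' ', $line;
    return undef unless @flds;        # nothing to hand to fileparse
    return scalar fileparse($flds[0], qw(.fit .vcor));
}

for my $line ("1aw7_AB.fit => 1 2 3\n", "\n") {
    my $id = id_from_line($line);
    print defined($id) ? "ID: $id\n" : "skipped a blank line\n";
}
```

In the scripts above, the same effect comes from putting `next unless @flds;` right after each `my @flds = split;`.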

That works perfectly!! Thanks a lot for your help!!

You're welcome. Please don't forget to mark this topic 'solved'.
