Compare two files and change the character at a position
Hello,
Could you please help me with the following scenario in Perl scripting?
I want to compare two text files and change the character at a given position. The output of this comparison goes to a third file, with a flag (C-CHANGE, N-SAME) at the end of each line.
INPUT 1:
Posi
3 ATG
2 ACT
1 ATC
........
INPUT 2:
Ref Mutant
G C
C A
A A
........
OUTPUT:
posi Ref Mul
3 ATG ATC CHANGE
2 ACT AAT CHANGE
1 ATC ATC SAME
.................
biojet
Junior Poster in Training
52 posts since Aug 2011
Reputation Points: 10
Solved Threads: 0
Skill Endorsements: 0
It looks to me like you are trying to compare DNA sequences. In that case, is it correct to assume that the character sets will all be of length 3? Also, I suspect these files would run into millions of rows?
You're right! I am trying to compare DNA sequences. I know the positions in the sequence and which nucleotides were changed. I used Excel to change the characters of the sequence, but I hope I can do it with Perl. Could you show me how to do it?
biojet
I'm trying, but I can't figure out how your output follows from your input. Can you explain the problem more clearly?
I can do it in Excel with the Replace command, and I hope I can do it with Perl. For example, the triplet ATG was changed at position 3: the character G was replaced by C. Output: ATG changed at position 3 --> ATC, with the label "CHANGE". I hope this helps you understand my problem.
biojet
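The single replacement described above (position 3 of ATG, G replaced by C) could be sketched in Perl roughly like this. The helper name replace_at is hypothetical and not from the thread; the only assumption is that positions are counted from 1, as in the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): replace the character at a
# 1-based position in a sequence and return the modified copy.
sub replace_at {
    my ($seq, $pos, $new) = @_;
    # substr counts from 0, so subtract 1 from the 1-based position.
    # The 4-argument form replaces the selected character in place.
    substr($seq, $pos - 1, 1, $new);
    return $seq;
}

print replace_at('ATG', 3, 'C'), "\n";   # prints ATC
```

Because Perl copies the arguments into $seq, the caller's original string is left untouched, so the old and new triplets can both be printed afterwards.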
input1.csv
3 ATG
2 ACT
1 ATC
input2.csv
G C
C A
A A
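#!/usr/bin/perl
use strict;
use warnings;

my ($filename1, $filename2) = ('input1.csv', 'input2.csv');
open my $fh1, '<', $filename1 or die "Failed to open $filename1: $!";
open my $fh2, '<', $filename2 or die "Failed to open $filename2: $!";

# Read both files in parallel, one record from each per iteration.
while (my $rec1 = <$fh1>) {
    defined(my $rec2 = <$fh2>) or last;   # stop when the second file runs out
    print compare($rec1, $rec2), "\n";
}

# Apply the ref -> mut substitution at the given position of the triplet.
sub compare {
    my ($str1, $str2) = @_;
    my ($pos, $triplet) = split(/\s+/, $str1);
    my ($ref, $mut)     = split(/\s+/, $str2);
    my $idx         = $pos - 1;   # positions are 1-based, substr indexes from 0
    my $origtriplet = $triplet;
    my $origchar    = substr($triplet, $idx, 1);

    # Substitute only if the character at that position is the reference one.
    if ($origchar eq $ref) {
        substr($triplet, $idx, 1) = $mut;
    }

    # Flag by whether the triplet actually changed, so a row whose character
    # matches neither ref nor mut is not reported as CHANGE.
    my $stat = $triplet eq $origtriplet ? 'SAME' : 'CHANGE';
    return "$origtriplet\t$triplet\t$stat";
}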
Outputs
ATG ATC CHANGE
ACT AAT CHANGE
ATC ATC SAME
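The original question also asked for the position column and for the result to go into a third file. A possible variation, offered only as a sketch: the names format_row and compare_files are hypothetical, the three file names are taken from the command line, and rows that look like headers, blank lines, or out-of-range positions are assumed to be safe to skip.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build one output row "pos, original, mutated, flag".
sub format_row {
    my ($pos, $triplet, $ref, $mut) = @_;
    my $mutated = $triplet;
    # Substitute only when the base at $pos really is the reference base.
    if (substr($triplet, $pos - 1, 1) eq $ref) {
        substr($mutated, $pos - 1, 1) = $mut;
    }
    my $flag = $mutated eq $triplet ? 'SAME' : 'CHANGE';
    return join("\t", $pos, $triplet, $mutated, $flag);
}

# Hypothetical driver: read both inputs in parallel, write the third file.
sub compare_files {
    my ($in1, $in2, $out) = @_;
    open my $fh1, '<', $in1 or die "Failed to open $in1: $!";
    open my $fh2, '<', $in2 or die "Failed to open $in2: $!";
    open my $ofh, '>', $out or die "Failed to open $out: $!";
    while (defined(my $rec1 = <$fh1>)) {
        defined(my $rec2 = <$fh2>) or last;
        my ($pos, $triplet) = split ' ', $rec1;
        my ($ref, $mut)     = split ' ', $rec2;
        # Assumption: skip header or blank lines and out-of-range positions.
        next unless defined $triplet && defined $mut;
        next unless $pos =~ /^\d+$/ && $pos >= 1 && $pos <= length($triplet);
        print {$ofh} format_row($pos, $triplet, $ref, $mut), "\n";
    }
    close $ofh or die "Failed to close $out: $!";
}

# Usage: perl script.pl input1.csv input2.csv output.txt
compare_files(@ARGV) if @ARGV == 3;
```

Since each pass through the loop holds only one line from each file, this approach should also cope with inputs that run to millions of rows.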
d5e5
Practically a Posting Shark
831 posts since Sep 2009
Reputation Points: 162
Solved Threads: 163
Skill Endorsements: 1
Thank you very much for your tutorial.
It is very helpful for me.
biojet
Question answered as of 1 year ago by Trentacle, d5e5, and voidyman