Hello,
Could you please help me in following scenario in Perl scripting?
I want to compare two text files & change the charter at the position. Output of this comparision in third file with flags C-CHANGE, N-SAME at the end of line.

IN PUT1: 
    Posi      
      3    ATG   
      2    ACT
      1    ATC
      ........
      IN PUT2:
     ref  Multant 
      G    C 
      C    A
      A    A
      ........
OUT PUT:
    posi  Ref  Mul
      3   ATG  ATC  CHANGE
      2   ACT  AAT  CHANGE
      1   ATC  ATC  SAME
      .................

Recommended Answers

All 7 Replies

Looks to me liek you are trying to compare DNA sequences.. in that case will it be correct to assume that the character sets will all be of length 3? Also i suspect these files would run into millions of rows then?

Looks to me liek you are trying to compare DNA sequences.. in that case will it be correct to assume that the character sets will all be of length 3? Also i suspect these files would run into millions of rows then?

you right! I am try to compare DNA sequence. I know the positions where sequence and what kind Nuleotid were changed. I used excel to change chacter of sequence but I hope I can do it with Perl. Could you show me to do it?

I'm trying, but I can't figure out how your output follows from your input. Can you explain the problem more clearly?

I'm trying, but I can't figure out how your output follows from your input. Can you explain the problem more clearly?

I can do it with excel by Replace comand. I hope I can do it with perl. For ex amino acid ATG was changed at position 3 of that amino acid and the charter was changed G by C. Out put: at postion 3 of ATG was changed --> ATC and label "Change". I hope it helps you understand my problem.

Ok, wait.

If your first file says what position you want to change, why does your second file say what character you want to change? If the "reference" character in file 2 is different from the character found at the position given in file 1, do you still change it to the "mutant" (not "multant") form?

I might also ask, how do you do it in Excel?

input1.csv

3    ATG   
2    ACT
1    ATC

input2.csv

G    C 
C    A
A    A
#!/usr/bin/perl;
use strict;
use warnings;

my ($filename1, $filename2) = ('input1.csv', 'input2.csv');

open my $fh1, '<', $filename1 or die "Failed to open $filename1: $!";
open my $fh2, '<', $filename2 or die "Failed to open $filename2: $!";

while (my $rec1 = <$fh1>){
    defined (my $rec2 = <$fh2>) or last;
    print compare($rec1, $rec2), "\n";
}

sub compare{
    my ($str1, $str2) = @_;
    my ($pos, $triplet) = split(/\s+/, $str1);
    my ($ref, $mut) = split(/\s+/, $str2);
    my $idx = $pos - 1;#index starts at 0
    my $origtriplet = $triplet;
    my $origchar = substr($triplet, $idx, 1);
    my $stat;
    
    if ($origchar eq $ref){
        substr($triplet, $idx, 1) = $mut;
    }
    
    if ($origchar eq $mut){
        $stat = 'SAME';
    }
    else {
        $stat = 'CHANGE';
    }
    
    return "$origtriplet\t$triplet\t$stat";
}

Outputs

ATG	ATC	CHANGE
ACT	AAT	CHANGE
ATC	ATC	SAME

Thank you very much for your tutoral.
It is so good for me.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.