Compare two files and change the character at a given position

Hello,
Could you please help me with the following scenario in Perl?
I want to compare two text files and change the character at the given position. The result of this comparison should go into a third file, with a flag at the end of each line: C (CHANGE) or N (SAME).

INPUT 1:
    Posi
      3    ATG
      2    ACT
      1    ATC
      ........
INPUT 2:
    ref  Multant
      G    C
      C    A
      A    A
      ........
OUTPUT:
    posi  Ref  Mul
      3   ATG  ATC  CHANGE
      2   ACT  AAT  CHANGE
      1   ATC  ATC  SAME
      .................
biojet
Junior Poster in Training
52 posts since Aug 2011
Reputation Points: 10
Solved Threads: 0
Skill Endorsements: 0

It looks to me like you are trying to compare DNA sequences. In that case, is it correct to assume that the character sets will all be of length 3? Also, I suspect these files could run into millions of rows?

voidyman
Newbie Poster
22 posts since Sep 2011
Reputation Points: 10
Solved Threads: 1
Skill Endorsements: 0

You're right! I am trying to compare DNA sequences. I know the positions where the sequence changed and which nucleotides were changed. I used Excel to change the characters in the sequence, but I hope I can do it with Perl. Could you show me how to do it?

biojet

I'm trying, but I can't figure out how your output follows from your input. Can you explain the problem more clearly?

Trentacle
Junior Poster
107 posts since Dec 2010
Reputation Points: 125
Solved Threads: 25
Skill Endorsements: 0

I can do it in Excel with the Replace command, and I hope I can do it with Perl. For example, the triplet ATG was changed at position 3, where the character G was replaced by C. Output: position 3 of ATG was changed, giving ATC, with the label "CHANGE". I hope this helps you understand my problem.
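
For readers following along, the single replacement described here can be sketched in Perl roughly as follows. This is only an illustration under stated assumptions: `mutate_at` is a hypothetical helper name that does not appear in the thread, and positions are assumed to count from 1.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): replace the character at a
# 1-based position in a triplet and report whether anything changed.
sub mutate_at {
    my ($triplet, $pos, $new_char) = @_;
    my $mutated = $triplet;
    # Four-argument substr replaces one character in $mutated;
    # $pos - 1 converts the 1-based position to a 0-based offset.
    substr($mutated, $pos - 1, 1, $new_char);
    my $label = $mutated eq $triplet ? 'SAME' : 'CHANGE';
    return ($mutated, $label);
}

my ($mutated, $label) = mutate_at('ATG', 3, 'C');
print "$mutated $label\n";   # prints "ATC CHANGE"
```

Working on a copy keeps the original triplet available, which is what makes the SAME/CHANGE comparison possible.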

biojet

Ok, wait.

If your first file says what position you want to change, why does your second file say what character you want to change? If the "reference" character in file 2 is different from the character found at the position given in file 1, do you still change it to the "mutant" (not "multant") form?

I might also ask, how do you do it in Excel?
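
To make the mismatch case concrete, here is a minimal sketch of one possible policy. It assumes a row should be left untouched when the reference character does not match what is actually at that position. Both the helper name `apply_mutation` and the `MISMATCH` label are hypothetical; nothing in the thread specifies them.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a ref -> mutant substitution only when the
# reference character matches the character found at the 1-based position.
# The 'MISMATCH' status is an assumption made for this sketch.
sub apply_mutation {
    my ($triplet, $pos, $ref, $mut) = @_;
    my $found = substr($triplet, $pos - 1, 1);
    return ($triplet, 'MISMATCH') if $found ne $ref;   # leave the triplet alone
    return ($triplet, 'SAME')     if $ref eq $mut;     # nothing to change
    substr($triplet, $pos - 1, 1, $mut);               # modifies the local copy
    return ($triplet, 'CHANGE');
}

for my $case (['ATG', 3, 'G', 'C'], ['ATG', 3, 'T', 'C'], ['ATC', 1, 'A', 'A']) {
    my ($result, $status) = apply_mutation(@$case);
    print "$case->[0]\t$result\t$status\n";
}
```

Whether a mismatch should be skipped, reported, or treated as fatal depends on the answer to the question above, so the third status is only a placeholder.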

Trentacle

input1.csv

3    ATG   
2    ACT
1    ATC

input2.csv

G    C 
C    A
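A    A

#!/usr/bin/perl
use strict;
use warnings;

my ($filename1, $filename2) = ('input1.csv', 'input2.csv');

open my $fh1, '<', $filename1 or die "Failed to open $filename1: $!";
open my $fh2, '<', $filename2 or die "Failed to open $filename2: $!";

# Read the two files in parallel, one line from each per iteration,
# and stop as soon as either file runs out of lines.
while (my $rec1 = <$fh1>){
    defined (my $rec2 = <$fh2>) or last;
    print compare($rec1, $rec2), "\n";
}

close $fh1;
close $fh2;

# Takes one line from each file and returns "original<TAB>result<TAB>status".
sub compare{
    my ($str1, $str2) = @_;
    my ($pos, $triplet) = split(' ', $str1); # split ' ' also ignores leading whitespace
    my ($ref, $mut) = split(' ', $str2);
    my $idx = $pos - 1; # positions in the file start at 1, substr offsets at 0
    my $origtriplet = $triplet;
    my $origchar = substr($triplet, $idx, 1);
    my $stat;

    # Substitute only if the triplet really has the reference character there.
    if ($origchar eq $ref){
        substr($triplet, $idx, 1) = $mut;
    }

    # Note: a character matching neither $ref nor $mut leaves the triplet
    # unchanged but is still reported as CHANGE.
    if ($origchar eq $mut){
        $stat = 'SAME';
    }
    else {
        $stat = 'CHANGE';
    }

    return "$origtriplet\t$triplet\t$stat";
}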

Outputs

ATG	ATC	CHANGE
ACT	AAT	CHANGE
ATC	ATC	SAME
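
The original question also asked for the position column and for C/N flags written to a third file. A variant along those lines might look like the sketch below. It rests on several assumptions: `merge_mutations` is a hypothetical name, header or malformed rows are skipped in both files at once, and the flag is derived from whether the triplet actually changed. It takes already-open filehandles, so the usage part runs on in-memory strings; with real files you would open `input1.csv`, `input2.csv` and an output file of your choice instead.

```perl
use strict;
use warnings;

# Sketch: read both inputs in parallel and write "pos, original, mutated, flag".
# Flag is C when the triplet changed and N when it stayed the same.
sub merge_mutations {
    my ($in1, $in2, $out) = @_;
    while (defined(my $rec1 = <$in1>)) {
        defined(my $rec2 = <$in2>) or last;
        my ($pos, $triplet) = split ' ', $rec1;
        my ($ref, $mut)     = split ' ', $rec2;
        # Skip header, blank, or malformed rows (assumes an integer position).
        next unless defined $triplet && defined $mut && $pos =~ /^\d+$/;
        next unless $pos >= 1 && $pos <= length $triplet;

        my $mutated = $triplet;
        substr($mutated, $pos - 1, 1, $mut)
            if substr($triplet, $pos - 1, 1) eq $ref;
        my $flag = $mutated eq $triplet ? 'N' : 'C';
        print {$out} join("\t", $pos, $triplet, $mutated, $flag), "\n";
    }
    return;
}

# Usage with in-memory filehandles standing in for the three files.
my $input1 = "3 ATG\n2 ACT\n1 ATC\n";
my $input2 = "G C\nC A\nA A\n";
my $report = '';
open my $in1, '<', \$input1 or die "Cannot open in-memory input 1: $!";
open my $in2, '<', \$input2 or die "Cannot open in-memory input 2: $!";
open my $out, '>', \$report or die "Cannot open in-memory output: $!";
merge_mutations($in1, $in2, $out);
close $out;
print $report;
```

Because the files are read line by line rather than loaded whole, this approach should also cope with the very large inputs mentioned earlier in the thread.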
d5e5
Practically a Posting Shark
831 posts since Sep 2009
Reputation Points: 162
Solved Threads: 163
Skill Endorsements: 1

Thank you very much for your tutorial.
It is very helpful for me.

biojet
Question Answered as of 1 Year Ago by Trentacle, d5e5 and voidyman

© 2013 DaniWeb® LLC