
Dear Experts,
Thanks for your time. I am stuck on the following problem. I have a file with five columns named Chr, POS, QUAL, GT1 and GT2. Column 4 (GT1) contains only the value 01. I would like to compare GT1 with GT2 and print a position as a start point when a 01 in GT2 is followed by 00 on the next row, and as an end point when a 00 is followed by 01. For example, in the sample below, row (2) should be printed as a start point because the next row has 00 in GT2, and row (4) as the matching end point. The other cases are shown in the expected output. In a few cases, such as rows (10) and (13), a 01 is both preceded and followed by 00. Such a position should be written first as an end point and then again as the start point on the next output row.

Row  Chr  POS   QUAL  GT1  GT2
1    1    2556  96    01   01
2    1    1685  125   01   01
3    1    1770  80    01   00
4    1    1785  90    01   01
5    1    1810  95    01   01
6    1    1825  77    01   00
7    1    1835  80    01   00
8    1    1845  120   01   00
9    1    1875  125   01   00
10   1    1888  80    01   01
11   1    1910  95    01   00
12   1    1914  110   01   00
13   1    1935  65    01   01
14   1    1985  78    01   00
15   1    2030  100   01   01
16   1    2050  90    01   01
16 1 2050 90 01 01

Expected Output

Start End
1685 1785
1810 1888
1888 1935
1935 2030

Edited by yksrmc

Attachments
Chr  POS   QUAL  GT1  GT2
1    2556  96    01   01
1    1685  125   01   01
1    1770  80    01   00
1    1785  90    01   01
1    1810  95    01   01
1    1825  77    01   00
1    1835  80    01   00
1    1845  120   01   00
1    1875  125   01   00
1    1888  80    01   01
1    1910  95    01   00
1    1914  110   01   00
1    1935  65    01   01
1    1985  78    01   00
1    2030  100   01   01
1    2050  90    01   01
Last Post by 2teez

Hello yksrmc,

This can easily be done by going through your data one line at a time. Since only the last column (GT2) changes, there is no need to compare it with the column before it.
Just take the GT2 value and check the condition you specified: if it is met, raise a flag, and if not, clear the flag, like a flip-flop switch. The position from the previous line is remembered so it can be printed as the start point.

The code below solves the problem as you described it.

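use warnings;
use strict;

use constant {
    START_IT => '01',
    END_IT   => '00',
};

<DATA>;    # read and discard the heading line

my $flag = 0;    # non-zero while inside a run of '00' rows
my $avant_pt;    # POS of the previous row

printf "%s\t%s\n", "START", "END";

while (<DATA>) {
    next unless /\S/;    # skip blank lines
    my ( $pos, $gt2 ) = (split)[ 1, 4 ];

    # GT2 holds the strings '00' and '01', so compare with eq, not ==
    if ( $gt2 eq END_IT && ++$flag == 1 ) {
        # first '00' of a run: the previous row is the start point
        print $avant_pt, "\t";
    }
    elsif ( $gt2 eq START_IT && $flag != 0 ) {
        # first '01' after a run: this row is the end point
        print $pos, "\n";
        $flag = 0;
    }
    $avant_pt = $pos;
}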

__DATA__
Chr     POS   QUAL    GT1      GT2
1       2556      96      01    01
1       1685      125     01    01
1       1770      80      01    00
1       1785      90      01    01
1        1810     95      01    01
1        1825     77      01    00
1        1835     80      01    00
1        1845     120     01    00
1        1875     125     01    00
1        1888      80     01    01
1        1910      95     01    00
1        1914     110     01    00
1        1935      65     01    01
1        1985      78     01    00
1        2030     100     01    01
1        2050      90      01    01

This gives the following output:

START   END
1685    1785
1810    1888
1888    1935
1935    2030
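
Since the original data lives in a file rather than in a __DATA__ section, the same flip-flop idea could be wrapped in a subroutine and fed from a file handle. What follows is only a rough sketch under some assumptions: the name find_intervals is made up for illustration, the file name is taken from the first command-line argument, and a run of 00 rows with no 01 row before it (or with no 01 row after it) is silently dropped.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): takes the data lines
# (heading already removed) and returns a list of [start, end] pairs.
sub find_intervals {
    my @lines = @_;
    my ( @pairs, $start, $prev_pos );
    my $in_run = 0;    # true while inside a run of '00' rows

    for my $line (@lines) {
        next unless $line =~ /\S/;    # skip blank lines
        my ( $pos, $gt2 ) = ( split ' ', $line )[ 1, 4 ];

        if ( $gt2 eq '00' ) {
            unless ($in_run) {
                # first '00' of a run: the previous row is the start point
                $start  = $prev_pos;
                $in_run = 1;
            }
        }
        elsif ( $gt2 eq '01' && $in_run ) {
            # first '01' after a run: this row is the end point
            # (assumption: drop the pair if no row preceded the run)
            push @pairs, [ $start, $pos ] if defined $start;
            $in_run = 0;
        }
        $prev_pos = $pos;
    }
    return @pairs;
}

# Only runs when a file name is given on the command line.
my $file = shift @ARGV;
if ( defined $file ) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $heading = <$fh>;    # discard the heading line
    my @lines   = <$fh>;
    close $fh;

    print "START\tEND\n";
    print join( "\t", @$_ ), "\n" for find_intervals(@lines);
}
```

Saved under a name of your choosing and run with the data file as its only argument, this should print the same four START/END rows for the sample data. Returning the pairs instead of printing them directly makes it easier to reuse the result, for example to compute interval lengths.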

I can only hope that this helps you.
