ENSTRUG00000000009      ENSTRUT00000000011      1026    509     5896
ENSTRUG00000000011      ENSTRUT00000000014      420     63      482
ENSTRUG00000000012      ENSTRUT00000000015      10902   15313   93157
ENSTRUG00000000012      ENSTRUT00000000016      2844    23243   60985

This is my input file. It has five columns. For each unique entry in the first column, I have to keep the line that has the highest value in the third column, like this:

ENSTRUG00000000009      ENSTRUT00000000012      1026    1503    6379
ENSTRUG00000000011      ENSTRUT00000000014      420     63      482
ENSTRUG00000000012      ENSTRUT00000000015      10902   15313   93157

My code is:

from sys import *
import operator
file = open(argv[1],'r')
outfile = open(argv[2],'w')
buffer = []
gene = ''
cds = {}
rec = file.readlines()
for line in rec :
        field = line.split()
        if (gene != field[0]):
                header = field[0]#header is the variable caries the values
                print header,
                #outfile.writelines(header+"\t")
                gene = field[0]
                transcript = field[1]
                #print transcript
        cds[field[1]]=field[2]
        #print cds
        protein = max(cds.iteritems(), key=operator.itemgetter(1))[0]
        print protein
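
A note on the code above, offered as a likely cause rather than a confirmed diagnosis: `cds` stores the third column as text, so `max` compares strings character by character instead of comparing numbers. In addition, `cds` is never cleared when the gene changes, and `print protein` runs on every input line. The small sketch below uses two values from the sample input to show the string comparison effect; it is written for Python 3, and the variable names are only illustrative.

```python
import operator

# Third-column values kept as strings, as in the code above
# (the two transcripts of ENSTRUG00000000012 from the sample input).
cds_as_text = {'ENSTRUT00000000015': '10902', 'ENSTRUT00000000016': '2844'}

# Strings compare character by character, so '2844' beats '10902'.
picked_by_text = max(cds_as_text.items(), key=operator.itemgetter(1))[0]

# Converting to int first gives the numeric maximum.
cds_as_int = {name: int(value) for name, value in cds_as_text.items()}
picked_by_int = max(cds_as_int.items(), key=operator.itemgetter(1))[0]

print(picked_by_text)  # ENSTRUT00000000016
print(picked_by_int)   # ENSTRUT00000000015
```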

Edited by parijat24: n/a

I do not understand the first line of your result:

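import sys
# Fall back to test files when the arguments are not given.
with open(sys.argv[1] if sys.argv[1:] else 'test.txt','r') as infile:
    with open(sys.argv[2] if sys.argv[2:] else 'test_out.txt','w') as outfile:
        # Sort by the third column (ascending), so that dict() keeps the
        # line with the highest value for each entry of the first column.
        rec = (line.split(None, 1) for line in sorted(infile, key=lambda x:int(x.split()[2])))
        result = dict(rec)
        for key,item in sorted(result.items()):
            line = "%s      %s" % (key, item)
            print line,
            outfile.write(line)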
"""Output:
ENSTRUG00000000009      ENSTRUT00000000011      1026    509     5896
ENSTRUG00000000011      ENSTRUT00000000014      420     63      482
ENSTRUG00000000012      ENSTRUT00000000015      10902   15313   93157
"""
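
As an alternative to sorting, the same selection can probably be done in a single pass with a dictionary that remembers the best line seen so far for each gene. This is only a sketch: `best_transcripts` is a made-up name, the sample lines are copied from the question, and it assumes Python 3 and whitespace-separated columns with an integer in the third one. On a tie, the first line seen is kept.

```python
def best_transcripts(lines):
    """Return {gene: line}, keeping per gene the line with the largest third column."""
    best = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        gene, length = fields[0], int(fields[2])
        # Keep this line only if the gene is new or the value is strictly larger.
        if gene not in best or length > best[gene][0]:
            best[gene] = (length, line.rstrip('\n'))
    return {gene: line for gene, (length, line) in best.items()}


sample = [
    "ENSTRUG00000000009      ENSTRUT00000000011      1026    509     5896\n",
    "ENSTRUG00000000011      ENSTRUT00000000014      420     63      482\n",
    "ENSTRUG00000000012      ENSTRUT00000000015      10902   15313   93157\n",
    "ENSTRUG00000000012      ENSTRUT00000000016      2844    23243   60985\n",
]

result = best_transcripts(sample)
for gene in sorted(result):
    print(result[gene])
```

This version does not need the input to be grouped or sorted by gene, and it holds only one line per gene in memory.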

Edited by pyTony: n/a
