Mapping the data in Python

Question

parijat24 0 Newbie Poster

14 Years Ago

Hi,

My problem is that I have two files with this format.

File 1 has two columns:

protein id geneid

qqqq yyyy

tttt pppp

oooo llll

Now I have another file, the cluster file, which looks like this:

cluster 1 : yyyy,pppp
cluster 2 : llll, yyyy,
.
.
.
.
cluster n : pppp,yyyy,llll

I want to map the ids and find out which clusters belong to qqqq instead of yyyy, tttt instead of pppp, ... and oooo instead of llll.

python


All 3 Replies

Beat_Slayer 17 Posting Pro in Training

14 Years Ago

And how are those clusters defined? I see that you want to look up the value in the first column of the first file; I just don't know what would count as a valid match in the second one.

A cluster that doesn't have some values, that has some values in some order...?

What?


TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 1 · 2010-07-07T00:11:26+00:00

Can you give the exact math formula and/or an example with a few values and the right result for that limited example?

Beat_Slayer 17 Posting Pro in Training · Answer 2 · 2010-07-07T03:41:00+00:00

This should do it. There could be some tweaks needed depending on the data input.

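# Read both files up front; the file names are placeholders for your own.
with open('protein.txt') as f:
    proteins = f.readlines()
with open('cluster.txt') as f:
    clusters = f.readlines()

for protein in proteins:
    fields = protein.split()
    if len(fields) != 2:
        continue  # skip blank lines and the "protein id geneid" header
    proteinid, geneid = fields
    for cluster in clusters:
        if ':' not in cluster:
            continue  # skip filler lines such as "."
        clusterid, genes = cluster.split(':', 1)
        # Split on commas, trim whitespace, and drop empty entries
        # left behind by a trailing comma.
        geneids = [g.strip() for g in genes.split(',') if g.strip()]
        if geneid in geneids:
            print(protein.rstrip('\n'))
            print(cluster.rstrip('\n'))
            print()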
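
The loop above prints each protein next to every cluster that contains its gene id. If the goal is instead to rewrite each cluster with protein ids in place of gene ids, which is how I read "qqqq instead of yyyy" in the question, a dictionary lookup might be simpler. Here is a minimal sketch in Python 3 under that assumption. The function names parse_protein_map and map_clusters are my own, the sample lines are copied from the question rather than read from files, and keeping unknown gene ids unchanged is an arbitrary choice.

```python
def parse_protein_map(lines):
    """Build {gene_id: protein_id} from two-column lines, skipping anything else."""
    gene_to_protein = {}
    for line in lines:
        fields = line.split()
        if len(fields) == 2:  # ignores blank lines and the three-word header
            protein_id, gene_id = fields
            gene_to_protein[gene_id] = protein_id
    return gene_to_protein


def map_clusters(cluster_lines, gene_to_protein):
    """Return {cluster_name: [ids]} with each known gene id replaced by its protein id."""
    mapped = {}
    for line in cluster_lines:
        if ':' not in line:
            continue  # skip filler lines such as "."
        name, genes = line.split(':', 1)
        gene_ids = [g.strip() for g in genes.split(',') if g.strip()]
        # Assumption: a gene id with no known protein is kept as is.
        mapped[name.strip()] = [gene_to_protein.get(g, g) for g in gene_ids]
    return mapped


# Sample data taken from the question.
protein_lines = ["protein id geneid", "qqqq yyyy", "tttt pppp", "oooo llll"]
cluster_lines = ["cluster 1 : yyyy,pppp", "cluster 2 : llll, yyyy,", "."]

gene_to_protein = parse_protein_map(protein_lines)
for name, proteins in map_clusters(cluster_lines, gene_to_protein).items():
    print(name, ":", ",".join(proteins))
```

With the sample lines this prints "cluster 1 : qqqq,tttt" and "cluster 2 : oooo,qqqq". To run it on real files, pass the result of readlines() for each file in place of the two sample lists.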
