Hi, I am writing a program to find the keys in a big dictionary that have the same value. Please see the example:

[B]cluster1[/B] ENSTRUP00000000001 ENSTRUP00000000001 ENSTRUP00000001433 ENSTRUP00000030987 ENSTRUP00000031348 ENSTRUP00000033778 ENSTRUP00000034939 ENSTRUP00000036445 ENSTRUP00000041507

[B]cluster2[/B] ENSTRUP00000000004 ENSTRUP00000000004 ENSTRUP00000000270 ENSTRUP00000004241 ENSTRUP00000010453 ENSTRUP00000012064 ENSTRUP00000015898 ENSTRUP00000019830 ENSTRUP00000024116 ENSTRUP00000026101 ENSTRUP00000027201 ENSTRUP00000028303 ENSTRUP00000028313 ENSTRUP00000029002 ENSTRUP00000031498 ENSTRUP00000032796 ENSTRUP00000032823 ENSTRUP00000033498 ENSTRUP00000036274 ENSTRUP00000037164 ENSTRUP00000037740 ENSTRUP00000038449 ENSTRUP00000038458 ENSTRUP00000039372 ENSTRUP00000040845 ENSTRUP00000043870 ENSTRUP00000043871 ENSTRUP00000043941 ENSTRUP00000044783 ENSTRUP00000046928 ENSTRUP00000047250 

[B]cluster3[/B] ENSTRUP00000000002 ENSTRUP00000000002 ENSTRUP00000000259 ENSTRUP00000000266 ENSTRUP00000000809 ENSTRUP00000001667 ENSTRUP00000003516 ENSTRUP00000004481 ENSTRUP00000007344 ENSTRUP00000008273 ENSTRUP00000012199 ENSTRUP00000012728 ENSTRUP00000013079 ENSTRUP00000013908 ENSTRUP00000016807 ENSTRUP00000019556 ENSTRUP00000020596 ENSTRUP00000023613 ENSTRUP00000030731 ENSTRUP00000031316 ENSTRUP00000032100 ENSTRUP00000033719 ENSTRUP00000035956 ENSTRUP00000036135 ENSTRUP00000036227 ENSTRUP00000037747 

[B]cluster4[/B] ENSTRUP00000000008 ENSTRUP00000000008 ENSTRUP00000000519 ENSTRUP00000002133 ENSTRUP00000008148 ENSTRUP00000008461 ENSTRUP00000008884 ENSTRUP00000010545 ENSTRUP00000018544 ENSTRUP00000022207 ENSTRUP00000022524 ENSTRUP00000023538 ENSTRUP00000024044 ENSTRUP00000026713 ENSTRUP00000027065 ENSTRUP00000032463 ENSTRUP00000034934 ENSTRUP00000038083 ENSTRUP00000038476 

[B]cluster5[/B] ENSTRUP00000000015 ENSTRUP00000000015 

[B]cluster6[/B] ENSTRUP00000000031 ENSTRUP00000000031 ENSTRUP00000003599 ENSTRUP00000016290 ENSTRUP00000025619 ENSTRUP00000028901 ENSTRUP00000033999 ENSTRUP00000034531

[B]cluster7[/B] ENSTRUP00000000004 ENSTRUP00000000004 ENSTRUP00000000270 ENSTRUP00000004241 ENSTRUP00000010453 ENSTRUP00000012064 ENSTRUP00000015898 ENSTRUP00000019830 ENSTRUP00000024116 ENSTRUP00000026101 ENSTRUP00000027201 ENSTRUP00000028303 ENSTRUP00000028313 ENSTRUP00000029002 ENSTRUP00000031498 ENSTRUP00000032796 ENSTRUP00000032823 ENSTRUP00000033498 ENSTRUP00000036274 ENSTRUP00000037164 ENSTRUP00000037740 ENSTRUP00000038449 ENSTRUP00000038458 ENSTRUP00000039372 ENSTRUP00000040845 ENSTRUP00000043870 ENSTRUP00000043871 ENSTRUP00000043941 ENSTRUP00000044783 ENSTRUP00000046928 ENSTRUP00000047250

[B]cluster8[/B] ENSTRUP00000000002 ENSTRUP00000000002 ENSTRUP00000000259 ENSTRUP00000000266 ENSTRUP00000000809 ENSTRUP00000001667 ENSTRUP00000003516 ENSTRUP00000004481 ENSTRUP00000007344 ENSTRUP00000008273 ENSTRUP00000012199 ENSTRUP00000012728 ENSTRUP00000013079 ENSTRUP00000013908 ENSTRUP00000016807 ENSTRUP00000019556 ENSTRUP00000020596 ENSTRUP00000023613 ENSTRUP00000030731 ENSTRUP00000031316 ENSTRUP00000032100 ENSTRUP00000033719 ENSTRUP00000035956 ENSTRUP00000036135 ENSTRUP00000036227 ENSTRUP00000037747 

[B]cluster9[/B] ENSTRUP00000000008 ENSTRUP00000000008 ENSTRUP00000000519 ENSTRUP00000002133 ENSTRUP00000008148 ENSTRUP00000008461 ENSTRUP00000008884 ENSTRUP00000010545 ENSTRUP00000018544 ENSTRUP00000022207 ENSTRUP00000022524 ENSTRUP00000023538 ENSTRUP00000024044 ENSTRUP00000026713 ENSTRUP00000027065 ENSTRUP00000032463 ENSTRUP00000034934 ENSTRUP00000038083 ENSTRUP00000038476 

[B]cluster10[/B] ENSTRUP00000000015 ENSTRUP00000000015 

[B]cluster11[/B] ENSTRUP00000000031 ENSTRUP00000000031 ENSTRUP00000003599 ENSTRUP00000016290 ENSTRUP00000025619 ENSTRUP00000028901 ENSTRUP00000033999 ENSTRUP00000034531    

So I created a dictionary where the cluster name is the key and the terms starting with ENSTRUP... are the values. I want to find out which clusters have the same ENSTRUP... values.

I have written a program, but there is some problem and it does not work:

from sys import *
from collections import defaultdict
infile1 = open(argv[1],'r')
clusterlines = infile1.readlines()
a = []
b = []
my_dict = {}
reverse_dict = {}
for line in clusterlines:
        cluster = line.split()
        my_dict[cluster[0]] = tuple(cluster[1:])
        #print my_dict
for value in my_dict.values():
        reverse_dict[value]= []
        for key in my_dict.keys():
                if my_dict[key] == value:
                        if key not in reverse_dict[value]:reverse_dict[value].append(key)
        print reverse_dict

import sys
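from collections import defaultdict
my_dict = {}
reverse_dict = defaultdict(list)    # ENSTRUP id -> names of the clusters containing it
with open(sys.argv[1], 'r') as infile:
    for line in infile:
        cluster = line.split()      # split() already strips surrounding whitespace
        if not cluster:             # skip empty lines
            continue
        my_dict[cluster[0]] = tuple(cluster[1:])
        for cl in set(cluster[1:]): # set(): the first id is repeated on each line
            reverse_dict[cl].append(cluster[0])
print reverse_dict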

I have tested it on the input you provided, after removing the empty lines.
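
If the goal is to list clusters whose ENSTRUP members are exactly the same, one possible approach is to use the members themselves as the dictionary key. The following is only a sketch under some assumptions: it is written for Python 3 (the code in this thread is Python 2), the function name group_identical_clusters and the short ids in sample are made up for illustration, and it uses a frozenset, so the order of the ids and repeated ids on a line are ignored.

```python
from collections import defaultdict

def group_identical_clusters(lines):
    """Return lists of cluster names that share exactly the same set of ids."""
    groups = defaultdict(list)          # frozenset of ids -> cluster names
    for line in lines:
        fields = line.split()
        if not fields:                  # skip empty lines
            continue
        name, ids = fields[0], frozenset(fields[1:])
        groups[ids].append(name)
    # keep only the id sets that belong to more than one cluster
    return [names for names in groups.values() if len(names) > 1]

# made-up sample data in the same "name id id ..." layout as the question
sample = [
    "cluster1 A A B",
    "",
    "cluster2 C C",
    "cluster3 B A",
    "cluster4 C",
]
duplicates = group_identical_clusters(sample)
print(duplicates)
```

With a real file, the lines could be passed in as open(filename), since iterating over a file object yields one line at a time.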

Edited 5 Years Ago by slate: n/a

You might want to consider using two sets, as the logic may be easier to understand.
key_set = set(my_dict.keys())      ## your current keys
value_set = set()                  ## your current values, each one added as an individual item
for values in my_dict.values():
    value_set.update(values)

for key in key_set:     ## there are fewer keys than values
    if key in value_set:
        print key, "found"
#
# or
print key_set.intersection(value_set)
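
A related way to use sets, in case it helps: keep one set of ids per cluster and intersect the sets pairwise to see which ids two clusters have in common. This is a rough sketch in Python 3 (not the Python 2 used above); the clusters dictionary and the shared_ids function are hypothetical names with made-up data, and comparing every pair gets slow when there are many clusters.

```python
from itertools import combinations

# made-up data: cluster name -> set of ids
clusters = {
    "cluster1": {"A", "B", "C"},
    "cluster2": {"C", "D"},
    "cluster3": {"E"},
}

def shared_ids(clusters):
    """Return {(name1, name2): common ids} for every pair of clusters that overlap."""
    overlaps = {}
    for n1, n2 in combinations(sorted(clusters), 2):
        common = clusters[n1] & clusters[n2]    # set intersection
        if common:
            overlaps[(n1, n2)] = common
    return overlaps

result = shared_ids(clusters)
print(result)
```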

Edited 5 Years Ago by woooee: n/a
