Hi all

I am a newbie in Python. I have a CSV file with two columns: the first is the source and the second is the target. Multiple values are assigned to the same key in successive rows, as follows:

1 -->4
1 -->5
1 -->8
1 -->12
2 -->4
2 -->17
2 -->14
2 -->46


I want to convert this into a format like a dictionary or set, as below:
1 --> 4,5,8,12
2 --> 4,17,14,46


Can anyone help me please!


Thanks.

Use a statement like:

if this_key in my_dict:
    my_dict[this_key].add(this_value)
else:
    my_dict[this_key] = {this_value}
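
To show that pattern in context, here is a minimal sketch that applies it to the pairs from the question. It is written for Python 3, and the hard-coded `pairs` list is only a stand-in for rows read from the file. Note that a set drops duplicate values and does not keep insertion order, so the values are sorted before printing.

```python
# Pairs copied from the question; in practice they would come from the CSV file.
pairs = [(1, 4), (1, 5), (1, 8), (1, 12), (2, 4), (2, 17), (2, 14), (2, 46)]

my_dict = {}
for this_key, this_value in pairs:
    if this_key in my_dict:
        # Key already seen: add the value to its existing set.
        my_dict[this_key].add(this_value)
    else:
        # First time this key appears: start a new set.
        my_dict[this_key] = {this_value}

for key in sorted(my_dict):
    values = ",".join(str(v) for v in sorted(my_dict[key]))
    print(key, "-->", values)
```

If the original order of the values matters, a list with `append` may be a better fit than a set.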

I do not have any special code!

Please refer to the attached sample file.

Thanks!

Attachments
a2	e45
a2	ghr55
a2	er456
a2	fgt9
p3	e45
p3	fgt9
p3	ft678
p3	lk89
j67	st90
j67 	hk90h
j67	er456
j67 	fgt9
import csv

d = {}

for row in csv.reader(open('sample.csv', "rb")):
    first, second = row[0].split('\t')
    if d.has_key(first.strip()):
        d[first.strip()] += [second.strip()]
    else:
        d[first.strip()] = [second.strip()] 

for k, v in d.iteritems():
    print k, v

Cheers, and Happy coding.

Hi, it's giving the following error:

first, second = row[0].split('\t')
ValueError: need more than 1 value to unpack

Could you please advise me on this!

Thanks for reply!
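
This error means that `split('\t')` returned a single piece, so the line in question contained no tab (for example a blank line, or a line separated by something other than a tab). The cause is not confirmed in this thread, so the following is only a diagnostic sketch in Python 3. The helper name `split_pair` and the sample lines are made up for illustration.

```python
def split_pair(line, sep="\t"):
    """Return (first, second), or None if the line does not have exactly two fields."""
    fields = [field.strip() for field in line.split(sep)]
    if len(fields) != 2:
        return None
    return fields[0], fields[1]


# Made-up lines: two good ones, one comma-separated, one blank.
lines = ["a2\te45\n", "a2,ghr55\n", "\n", "p3\tfgt9\n"]

for number, line in enumerate(lines, start=1):
    if split_pair(line) is None:
        # Show the raw line so the actual separator is visible.
        print("line %d does not split into two fields: %r" % (number, line))
```

Printing the offending line with `%r` makes hidden characters such as tabs, commas, or stray newlines visible.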

That code worked for me, but here is my version of it anyway:

d = {}

for row in open('sample.csv'):
    first, second = [ value.strip() for value in row.split('\t')]
    if first in d:
        d[first].append(second)
    else:
        d[first] = [second] 

for k, v in d.iteritems():
    print k, v
""" Output:
p3 ['e45', 'fgt9', 'ft678', 'lk89']
a2 ['e45', 'ghr55', 'er456', 'fgt9']
j67 ['st90', 'hk90h', 'er456', 'fgt9']
"""

Maybe it's a header row problem?

ValueError problem solved. I just replaced row[0].split('\t') with row[0:].

Please find the code below. Please suggest a better option for writing to the file, if any.

import csv
f2 = open('sample.txt', 'w')


d = {}

for row in csv.reader(open('sample.csv',"rb")):
    first, second = row[0:]
    if d.has_key(first.strip()):
        d[first.strip()] += [second.strip()]
    else:
        d[first.strip()] = [second.strip()]

for k,v in d.iteritems():
    print "%s\t%s\t\n" % (k, v)
    f2.write("%s\t%s\t\n"%(k,v))
f2.close()


Cheers! thanks for the reply.
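
On the question of a better way to write the file, one option is a `with` block, which closes the file even if an error occurs, combined with joining the values into a comma-separated string instead of writing the list's repr. This is only a sketch in Python 3. The function name `write_groups`, the sample dictionary, and the temporary output path are assumptions made for illustration.

```python
import os
import tempfile


def write_groups(groups, path):
    """Write one line per key: key, a tab, then the values joined by commas."""
    with open(path, "w") as out:
        for key, values in groups.items():
            out.write("%s\t%s\n" % (key, ",".join(values)))


# Small made-up dictionary in the same shape as d in the code above.
groups = {"a2": ["e45", "ghr55"], "p3": ["e45", "fgt9"]}

# A temporary directory keeps the example from overwriting real files.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
write_groups(groups, path)

with open(path) as result:
    print(result.read())
```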

Hi tonyjv, thanks for the code. It is giving the same ValueError even after removing the header row.

Can you post a real sample CSV?

Hi
Here is a real CSV file.

Cheers!

Attachments
name1	name2
AAF	GBP
ACE2	HMOX1
AFP1	AFP
AFP1	ALB
C/EBPbeta(p35)	CRP
C/EBPbeta(p35)	FOS
C/EBPbeta(p35)	IL12B
C/EBPbeta(p35)	IL6
AhR	HRAS
AIC2	apoAI
AIC3	apoAI
AIC4	apoAI
AIC5	apoAI
AID2	apoAI
AP-1	ALDOA
AP-1	BDKRB1
AP-1	HBB
AP-1	C3AR1
AP-1	CCL2
AP-1	CCL2
AP-1	CCL4
AP-1	Ccnd1
AP-1	CD226
AP-1	CD38
AP-1	CD38
AP-1	CHIT1
AP-1	c-myc
AP-1	COL1A1
AP-1	CYP3A4
AP-1	DBH
AP-1	EDN1
AP-1	ELN
AP-1	ELN
AP-1	F3
AP-1	F3
AP-1	FOS
AP-1	FOS
AP-1	Gba
AP-1	GCLC
AP-1	GFAP
AP-1	GJA1
AP-1	CSF2
AP-1	CSF2
AP-1	CSF2
AP-1	CSF2
AP-1	CSF2
AP-1	CSF2
AP-1	HBA1
AP-1	HBA1
AP-1	HBA1
AP-1	HMGA1
AP-1	HMGA1
AP-1	ICAM1
AP-1	ICAM1
AP-1	ICAM1
AP-1	IFNG
AP-1	IL2
AP-1	IL2
AP-1	IL-3
AP-1	IL6
AP-1	IL8
AP-1	IL8
AP-1	ITGAX
AP-1	ITGAX
AP-1	IVL
AP-1	IVL
AP-1	JUN
AP-1	JUN
AP-1	KRT16
AP-1	LOR
AP-1	CSF1R
AP-1	MMP1
AP-1	MMP1
AP-1	MMP13
AP-1	MMP3
AP-1	MMP-9
AP-1	MT-IIA
AP-1	MT-IIA
AP-1	MYLK
AP-1	MYLK
AP-1	MYLK
AP-1	MYLK
AP-1	NAT1
AP-1	NOS2
AP-1	NOS2
AP-1	NPY
AP-1	OXTR
AP-1	TP53
AP-1	HMBS
AP-1	PDGFA
AP-1	ENK
AP-1	prl
AP-1	prl
AP-1	prl
AP-1	prl
AP-1	prl
AP-1	POLA1
AP-1	Rbp
AP-1	SERPINA3
AP-1	SERPINB2
AP-1	SFTPD
AP-1	SMAD7
AP-1	SP3
AP-1	SPRR1B
AP-1	SPRR1B
AP-1	SPRR2A
AP-1	SPRR3
AP-1	TBXAS1
AP-1	TCR-beta
AP-1	TFF1
AP-1	TFF1
AP-1	TGFB1
AP-1	TGFB1
AP-1	TH
AP-1	TH
AP-1	TH
AP-1	TIMP-1
AP-1	TNFRSF10A
AP-1	TNFRSF10A
AP-1	TSG-6
AP-1	uPA
AP-1	uPA
AP-1	VCAM1
AP-1	VEGFA
AP-1	MMP13
AP-1	MMP3
AP-1	prl
AP-1	ALDOA
AP-1	CYP11A1
AP-1	CYP11A1
AP-1	F3
AP-1	FOS
AP-1	HBA1
AP-1	IL2
AP-1	IL6
AP-1	MMP12
AP-1	MT-IIA
AP-1	HMBS
AP-1	TNF
AP-1	VIM
AP-2alpha	MT-IIA
AP-2alpha	TFAP2A
AP-2	FN1
AP-2	GH1
AP-2	ODC1
AP-2alphaA	ADM
AP-2alphaA	apoB
AP-2alphaA	apoB
AP-2alphaA	ATF2
AP-2alphaA	CEACAM1
AP-2alphaA	C3
AP-2alphaA	CCNB1
AP-2alphaA	CEBPA
AP-2alphaA	CGB
AP-2alphaA	CGB
AP-2alphaA	CGB
AP-2alphaA	CGB
AP-2alphaA	c-myc
AP-2alphaA	c-myc
AP-2alphaA	HSD17B1
AP-2alphaA	GFAP
AP-2alphaA	GFAP
AP-2alphaA	GH1
AP-2alphaA	GH1
AP-2alphaA	CGA
AP-2alphaA	HSPB1
AP-2alphaA	IFNG
AP-2alphaA	JUN
AP-2alphaA	JUN
AP-2alphaA	KIT
AP-2alphaA	KRT1
AP-2alphaA	KRT14
AP-2alphaA	KRT6B
AP-2alphaA	MIP
AP-2alphaA	MIP
AP-2alphaA	MMP2
AP-2alphaA	MMP2
AP-2alphaA	MT-IIA
AP-2alphaA	MT-IIA
AP-2alphaA	MT-IIA
AP-2alphaA	MT-IIA
AP-2alphaA	MT-IIA
AP-2alphaA	MT-IIA
AP-2alphaA	ODC1
AP-2alphaA	ENK
AP-2alphaA	PRKCA
AP-2alphaA	PRKCA
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	REL
AP-2alphaA	TCR-alpha
AP-4	APH1A
AP-4	MT-IIA
AP-4	ENK
AP-4	TARBP2
AR	CDKN1A
AR	factor
AR	IGFBP3
AR	KLK3
AR	KLK3
AR	MMP2
AR	MMP2
AR	PIGR
ARG80	FOS
COUP-TF2	apoAI
COUP-TF2	apoB
COUP-TF2	apoB
COUP-TF2	apoB
COUP-TF2	apoC-II
COUP-TF2	apoCIII
COUP-TF2	CETP
COUP-TF2	CYP11B2
COUP-TF2	CYP11B2
COUP-TF2	CYP7A1
COUP-TF2	factor
COUP-TF2	factor
COUP-TF2	HNF4A
COUP-TF2	NR3C1
ATBF1-B	AFP
ATF	CGA
ATF	FOS
ATF	CGA
ATF	RB1
ATF	RN7SL
ATF	VIP
BP1	HBB
BP1	HBB
BP2	HBB
Pax-5	CD19
Pax-5	CD19
Pax-5	FCER2
Pax-5	FCER2
gammaCAAT	GLOB-AG
gammaCAC1	GLOB-AG
gammaCAC2	GLOB-AG
CAC-binding	HBB
CAC-binding	HBB
CAC-binding	HBB
CAC-binding	HBA1
CAC-binding	HMBS
CAC-binding	HMBS
CAC-binding	HMBS
CACCC-binding	apoB
CACCC-binding	apoB
CACCC-binding	HBB
CACCC-binding	GLOB-AG
CACCC-binding	HBA1
CACCC-binding	ITGA2B
alpha-CBF	CGA
alpha-CBF	HAND1
CBF(2)	apoAI
CBF	ACTC1
NF-YA	CDC2
NF-YA	SOX9
NF-YA	SOX9
NFI/CTF	HSPA1A
NFI/CTF	LHX3
NFI/CTF	NPY
NFI/CTF	P2RX1
NFI/CTF	PIGR
NFI/CTF	SLC25A5
NFI/CTF	SLC25A5
CDP	CYBB
CDP	CYBB
CDP	CYBB
CDP	CYBB
CDP	CYBB
CDP	CYBB
CDP	GLOB-AG
CD28RC	CSF2
CD28RC	IL2
CD28RC	IL2
CD28RC	IL-3
C/EBPalpha	apoB
C/EBPalpha	apoB
C/EBPalpha	IL12B
C/EBPalpha	IL8
C/EBPalpha	SERPINC1
C/EBPalpha	ALOX5AP
C/EBPalpha	ALOX5AP
C/EBPalpha	APOA2
C/EBPalpha	apoB
C/EBPalpha	BCL2
C/EBPalpha	BCL2
C/EBPalpha	CD14
C/EBPalpha	CES1
C/EBPalpha	CHI3L1
C/EBPalpha	CSF1
C/EBPalpha	CYP19A1
C/EBPalpha	CYP2A13
C/EBPalpha	CYP2A13
C/EBPalpha	CYP3A4
C/EBPalpha	FGA
C/EBPalpha	factor
C/EBPalpha	G-CSF
C/EBPalpha	CSF2RA
C/EBPalpha	ICAM1
C/EBPalpha	IL10
C/EBPalpha	IL10
C/EBPalpha	IL10
C/EBPalpha	IVL
C/EBPalpha	LTF
C/EBPalpha	PTGS2
C/EBPalpha	PTGS2
C/EBPalpha	S100A9
C/EBPalpha	SERPINC1
C/EBPalpha	SERPINE1
C/EBPalpha	StAR
C/EBPalpha	TF
C/EBPalpha	TLR9
C/EBPalpha	ADH1
C/EBPalpha	ADH1
C/EBPalpha	ADH1
C/EBPalpha	ADH1
C/EBPalpha	ADH1B
C/EBPalpha	ADH1B
C/EBPalpha	ADH1B
C/EBPalpha	ADH1B
C/EBPalpha	ADH1B
C/EBPalpha	ADH3
C/EBPalpha	ADH3
C/EBPalpha	ADH3
C/EBPalpha	ADH3
C/EBPalpha	ADH3
C/EBPalpha	ALB
C/EBPalpha	apoB
C/EBPalpha	C3
C/EBPalpha	DDIT3
C/EBPalpha	F8
C/EBPalpha	F8
C/EBPalpha	factor
C/EBPalpha	HP
C/EBPalpha	HP
C/EBPalpha	Hpx
C/EBPalpha	INSR
C/EBPalpha	INSR
C/EBPalpha	PPARG
C/EBPalpha	SFTPD
C/EBPalpha	SFTPD
C/EBPalpha	SFTPD
C/EBPalpha	TF
c-Ets-1	Ccnd1
c-Ets-1	IL12B
c-Ets-1	TFRC
c-Ets-1	TFRC
c-Ets-1	TNF
c-Ets-1	TNF
c-Ets-1	TNF
c-Ets-1	CD53
c-Ets-1	CD8A
c-Ets-1	CSNK2B
c-Ets-1	ECE1
c-Ets-1	ECE1
c-Ets-1	FOSL1
c-Ets-1	IL2RB
c-Ets-1	ITGAX
c-Ets-1	ITGAX
c-Ets-1	MMP1
c-Ets-1	NFKB1
c-Ets-1	PTHLH
c-Ets-1	TCR-alpha
c-Ets-1	TCR-Vbeta
c-Ets-1	TFAP2A
c-Ets-1	TFRC
c-Ets-1	TIMP-1
c-Ets-1	TNF
c-Ets-1	TNF
c-Ets-1	TNF
c-Ets-1	TNF
c-Ets-2	Ccnd1
c-Ets-2	CDC2
c-Ets-2	CDC2
c-Ets-2	CDC2
c-Ets-2	FOSL1
c-Ets-2	IL12B
c-Ets-2	IL12B
c-Ets-2	KIT
c-Ets-2	KIT
c-Ets-2	KIT
c-Ets-2	MGAT2
c-Ets-2	PSEN1
c-Ets-2	SURF-1/SURF-2
c-Ets-2	TCR-beta
c-Ets-2	uPA
c-Ets-1	C3AR1
c-Ets-1	ITGA2B
c-Ets-1	TCR-beta
c-Ets-1	TCR-beta
c-Fos	ACP5
c-Fos	IL2
c-Fos	MMP1
c-Fos	MT-IIA
c-Fos	NQO1
c-Fos	VIP
c-Fos	HBB
c-Fos	C3AR1
c-Fos	Ccnd1
c-Fos	CD38
c-Fos	CD44
c-Fos	CYP19A1
c-Fos	CYP2J2
c-Fos	DBH
c-Fos	EDN1
c-Fos	FGFBP1
c-Fos	FOS
c-Fos	FOSL1
c-Fos	GNAI2
c-Fos	GNRHR
c-Fos	ICAM1
c-Fos	IL11
c-Fos	IL2
c-Fos	KRT16
c-Fos	HLA-DPB
c-Fos	HLA-DRA
c-Fos	MMP1
c-Fos	MMP1
c-Fos	MMP1
c-Fos	MMP13
c-Fos	MMP2
c-Fos	MT-IIA
c-Fos	MUC2
c-Fos	NAT1
c-Fos	OPRM1
c-Fos	OPRM1
c-Fos	TP53
c-Fos	ENK
c-Fos	PTGS2
c-Fos	SERPINB9
c-Fos	uPA
c-Fos	VIM
c-Fos	WEE1
c-Fos	CGA
c-Fos	MMP3
c-Jun	MT-IIA
c-Jun	NQO1
c-Jun	PTGS2
c-Jun	TFRC
c-Jun	TNF
c-Jun	TNF
c-Jun	VIM
c-Jun	CGA
c-Jun	IL2
c-Jun	MMP3
c-Jun	MT-IIA
c-Jun	SSTR2
c-Jun	TGFB1
c-Jun	TGM1
c-Jun	TGM1
c-Jun	TNF
c-Jun	TSHB
c-Jun	APP
c-Jun	ATF3
c-Jun	BCL2A1
c-Jun	HBB
c-Jun	BRCA1
c-Jun	C3AR1
c-Jun	CCL5
c-Jun	Ccnd1
c-Jun	CD44
c-Jun	CD82
c-Jun	CSPG2
c-Jun	CYP19A1
c-Jun	CYP2J2
c-Jun	EDN1
c-Jun	FGF2
c-Jun	GJA1
c-Jun	GNRHR
c-Jun	GSS
c-Jun	GSS
c-Jun	GSTP1
c-Jun	HK1
c-Jun	ICAM1
c-Jun	IFNG
c-Jun	IL11
c-Jun	IL2
c-Jun	IL-5
c-Jun	IL8
c-Jun	IVL
c-Jun	IVL
c-Jun	JUN
c-Jun	JUN
c-Jun	KRT16
c-Jun	MMP1
c-Jun	MMP1
c-Jun	MMP1
c-Jun	MMP1
c-Jun	MMP13
c-Jun	MMP13
c-Jun	MMP2
c-Jun	MSH2
c-Jun	MSH2
c-Jun	MT-IIA
c-Jun	MUC2
c-Jun	NAT1
c-Jun	TP53
c-Jun	ENK
c-Jun	PTGS2
c-Jun	SELE
c-Jun	SERPINB9
c-Jun	SLC3A2
c-Jun	STAT4
c-Jun	TERT
c-Jun	TERT
c-Jun	TGM1
c-Jun	TGM1
c-Jun	TNF
c-Jun	TNF
c-Jun	TNFRSF10A
c-Jun	uPA
c-Jun	uPA
c-Jun	VIM
c-Jun	VIP
c-Myb	ADA
c-Myb	c-myb
c-Myb	GSTP1
c-Myb	MIRN15A
c-Myb	MIRN15A
c-Myb	NR3C1
c-Myb	NR3C1
c-Myb	SIM2
c-Myb	SP3
c-Myb	TRHR
c-Myb	TRHR
c-Myb	TRHR
c-Myb	TRHR
c-Myb	TRHR
c-Myb	TRHR
c-Myc	cdc25A
c-Myc	cdc25A
c-Myc	cdc25A
c-Myc	CDK4
c-Myc	CDK4
c-Myc	CDK4
c-Myc	c-myc
c-Myc	CXCR4
c-Myc	EIF4E
c-Myc	EIF4E
c-Myc	MIRN17
c-Myc	MrDb
c-Myc	neu
c-Myc	ODC1
c-Myc	ODC1
c-Myc	TERT
c-Myc	TERT
c-Myc	TERT
c-Myc	TERT
c-Myc	TERT
c-Myc	YBX1
c-Myc	EIF4E
c-Myc	EIF4E
c-Myc	PIGR
COUP	apoA-IV
COUP-TF1	apoAI
COUP-TF1	CYP11B2
COUP-TF1	CYP11B2
COUP-TF1	CYP19A1
COUP-TF1	factor
COUP-TF1	factor
COUP-TF1	LIPC
COUP-TF1	LIPC
NF-Y	COL1A2
NF-Y	CAT
NF-Y	CBS
NF-Y	CDKN1B
NF-Y	COL1A1
NF-Y	CYBB
NF-Y	CYBB
NF-Y	CYBB
NF-Y	F10
NF-Y	FAS
NF-Y	FN1
NF-Y	FTH1
NF-Y	FXR2
NF-Y	GLOB-AG
NF-Y	GLOB-AG
NF-Y	GLOB-AG
NF-Y	GLOB-AG
NF-Y	GLOB-AG
NF-Y	HBA1
NF-Y	HBE1
NF-Y	HBE1
NF-Y	HOXB7
NF-Y	HSPA1A
NF-Y	HSPA1A
NF-Y	HLA-DRA
NF-Y	HLA-DRB
NF-Y	ABCB1
NF-Y	ABCB1
NF-Y	MYBL1
NF-Y	TP53
NF-Y	POLA1
NF-Y	PTTG1
NF-Y	SCD
NF-Y	SOX14
NF-Y	SP1
NF-Y	SP1
NF-Y	STMN1
NF-Y	TARBP2
NF-Y	TGFBR2
NF-Y	tk1
NF-Y	TPH1
CP2-isoform1	HBB
CP2-isoform1	HBZ
CP2-isoform1	PAX6
CP2	IL-4
NF-YB	ATP1A3
NF-YB	ACTB
NF-YB	CBS
NF-YB	CDC2
NF-YB	CDKN1B
NF-YB	COL5A3
NF-YB	CTSL
NF-YB	EDF1
NF-YB	EPHX1
NF-YB	FCGR2A
NF-YB	FPGS
NF-YB	GPC3
NF-YB	GPC3
NF-YB	HBA1
NF-YB	ABCB1
NF-YB	MJD
NF-YB	MYBL1
NF-YB	PNRC1
NF-YB	PTPN6
NF-YB	SCGB2A1
NF-YB	SP3
NF-YB	SP3
NF-YB	TGFBR2
NF-YB	tk1
CREB	ADRB2
CREB	BDNF
CREB	CCL5
CREB	Ccnd1
CREB	Ccnd1
CREB	CFTR
CREB	CRH
CREB	CCNA2
CREB	DBH
CREB	EGR1
CREB	EGR1
CREB	FN1
CREB	FN1
CREB	FN1
CREB	FOS
CREB	CGA
CREB	IGFBP1
CREB	INS
CREB	INS
CREB	LOR
CREB	HLA-DPB
CREB	HLA-DRA
CREB	HLA-DRB
CREB	MIRN132
CREB	MITF
CREB	PSEN1
CREB	PTH
CREB	QM
CREB	RNU4C
CREB	SLC25A3
CREB	TCR-Vbeta
CREB	TCR-Vbeta
CREB	tPA
CREB	TRH
CREB	VIP
CREB1	POLB
CREB1	FOS
CREB1	CGA
CREB1	IL1B
CREB1	ENK
CREB1	PTH
CREB1	tPA
CREB1	TXN
CREB1	VIP
ATF-2-xbb4	ATF2
ATF-2-xbb4	ATF2
ATF-2-xbb4	ATF3
ATF-2-xbb4	CCL5
ATF-2-xbb4	CCNA2
ATF-2-xbb4	POLB
ATF-2-xbb4	FN1
ATF-2-xbb4	FOS
ATF-2-xbb4	CGA
ATF-2-xbb4	IFNB1
ATF-2-xbb4	ENK
ATF-2-xbb4	RB1
ATF-2-xbb4	SELE
ATF-2-xbb4	TGFB2
ATF-2-xbb4	TNF
ATF-2-xbb4	tPA
ATF-2-xbb4	uPA
ATF-2-xbb4	VIP
c-Rel	BCL2A1
c-Rel	IFNB1
c-Rel	IL12B
c-Rel	IL12B
c-Rel	IL-2Ralpha
c-Rel	IL-2Ralpha
c-Rel	IRF4
c-Rel	IRF4
c-Rel	IRF4
c-Rel	Tap1
c-Rel	TNF
c-Rel	IgkB
c-Rel	IL12B
c-Rel	IL12B
cebpe	F7
NF-1C	ACTC1
NF-1C	HBB
NF-1C	C4A
NF-1C	CBS
NF-1C	CDKN1A
NF-1C	COL1A1
NF-1C	CSH1
NF-1C	CTLA4
NF-1C	CYB5
NF-1C	CYP17A1
NF-1C	CYP17A1
NF-1C	CYP3A4
NF-1C	CYP3A7
NF-1C	F13A1
NF-1C	FABP7
NF-1C	FABP7
NF-1C	HBA1
NF-1C	HMGB1
NF-1C	HMGB1
NF-1C	HSPA1A
NF-1C	HSPA1A
NF-1C	HSPA1A
NF-1C	JUN
NF-1C	POLA1
NF-1C	SCGB2A1
NF-1C	SERPINE1
NF-1C	SLC25A5
NF-1C	SLC25A5
NF-1C	SP3
NF-1C	TF
CTF-1	HBA1
CTF-1	HRAS
CTF-1	TP53
CTF-1	TP53
CTF-2	CYP17A1
CTF-2	CYP17A1
CTF-2	HBA1
CTF-2	HMGB1
CTF-2	HMGB1
CTF-2	HRAS
CTF-3	HBA1

I presume you're using Python 3, because the sample file works.

You can try it like this below, or go for an approach without the module, as tonyjv said.

import csv

d = {}

for row in csv.reader(open('sample.csv')):
    first, second = [value.strip() for value in row[0].split('\t')]
    if first in d:
        d[first].append(second)
    else:
        d[first] = [second] 

f = open('output.txt', 'w')
for k, v in d.iteritems():
    print k, v
    f.write('%s %s\n' %(k, v))
f.close()

Cheers.

Happy coding!
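
For Python 3, a variant that lets the csv module split on tabs directly might look like the sketch below. It assumes the file is tab-separated with a header row, as in the attachment above. The function name `group_rows` is made up for illustration, and an in-memory `io.StringIO` stands in for the real file so the example is self-contained.

```python
import csv
import io
from collections import defaultdict


def group_rows(lines, has_header=True):
    """Group second-column values under their first-column key."""
    reader = csv.reader(lines, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row if there is one
    groups = defaultdict(list)
    for row in reader:
        if len(row) != 2:
            continue  # ignore blank or malformed lines instead of raising ValueError
        first, second = (value.strip() for value in row)
        groups[first].append(second)
    return dict(groups)


# A few rows copied from the attachment, held in memory instead of a file.
sample = io.StringIO("name1\tname2\nAAF\tGBP\nAFP1\tAFP\nAFP1\tALB\n")
result = group_rows(sample)
print(result)
```

With a real file, the same function could be called as `group_rows(open('sample.csv', newline=''))`, ideally inside a `with` block.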
