Python parse

Question

abhik1368 0 Newbie Poster

12 Years Ago

I have a CSV-style file like this:

DB01967 ZIPA
DB01967 PFAZ
DB01992 YVBK
DB01992 ZAP70
DB02191 ZIPA
DB02319 YQHD
DB02552 ZFPP

I want to write out a CSV file in this format:

DB01967 ZIPA PFAZ
DB01992 YVBK ZAP70
DB02191 ZIPA
DB02319 YQHD
DB02552 ZFPP

I am totally new to Python and am having a problem with the parsing.

python

All 3 Replies

TrustyTony 888 pyMod

12 Years Ago

And here is a version with itertools 'magic', since this seems to be the time to present solutions:

from itertools import groupby
# space separated data
raw_data = '''DB01967 ZIPA
DB01967 PFAZ
DB01992 YVBK
DB01992 ZAP70
DB02191 ZIPA
DB02319 YQHD
DB02552 ZFPP'''.splitlines()

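# groupby only merges adjacent lines, so the input must already be ordered by key
data_groups = groupby(raw_data, key=lambda x: x.split()[0])
for group, items in data_groups:
    # print() as a function works in Python 3 (and gives the same output)
    print(group, ' '.join(item.split()[1] for item in items))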
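One caveat with the groupby snippet above: itertools.groupby only merges lines that sit next to each other, so it relies on the file already being ordered by key, as the sample data is. If that can't be guaranteed, a rough sketch like the following sorts first. The helper name merge_by_key and the shuffled sample lines are made up for illustration, and it assumes every line holds exactly one key and one value.

```python
from itertools import groupby


def merge_by_key(lines):
    """Merge 'KEY VALUE' lines by KEY, even if equal keys are not adjacent.

    Hypothetical helper; assumes each non-blank line has exactly two fields.
    """
    pairs = [line.split() for line in lines if line.strip()]
    # list.sort() is stable, so values keep their original order within a key
    pairs.sort(key=lambda pair: pair[0])
    merged = []
    for key, group in groupby(pairs, key=lambda pair: pair[0]):
        merged.append(' '.join([key] + [pair[1] for pair in group]))
    return merged


# made-up sample with the keys deliberately out of order
unsorted_lines = [
    'DB01992 YVBK',
    'DB01967 ZIPA',
    'DB01992 ZAP70',
    'DB01967 PFAZ',
]

for row in merge_by_key(unsorted_lines):
    print(row)
```

Sorting changes the order of the keys in the output; if the original key order matters, the dictionary approaches in the other replies may be a better fit.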

Edited 12 Years Ago by TrustyTony

lrh9 95 Posting Whiz in Training · Answer 1 · 2012-04-09T03:59:28+00:00

The Python standard library features a csv module for CSV file reading and writing. Read the module documentation to learn how to use it.

I'd use a dictionary or collections.OrderedDict to associate each unique key with a list of values.
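As a minimal sketch of that suggestion (not the only way to do it), the csv module can read the space-separated lines and an OrderedDict can collect the values per key. The helper name merge_rows is made up, and io.StringIO stands in for real files here; with actual files you would pass objects from open(..., newline='') instead.

```python
import csv
import io
from collections import OrderedDict


def merge_rows(rows):
    """Collect the values of each row under its first field (hypothetical helper)."""
    merged = OrderedDict()
    for row in rows:
        if not row:  # csv.reader yields [] for blank lines
            continue
        key, values = row[0], row[1:]
        merged.setdefault(key, []).extend(values)
    return merged


# in-memory stand-in for the input file (assumption: single-space delimiter)
source = io.StringIO(
    "DB01967 ZIPA\n"
    "DB01967 PFAZ\n"
    "DB01992 YVBK\n"
    "DB01992 ZAP70\n"
    "DB02191 ZIPA\n"
    "DB02319 YQHD\n"
    "DB02552 ZFPP\n"
)
merged = merge_rows(csv.reader(source, delimiter=' '))

# write one line per key: the key followed by all of its values
out = io.StringIO()
writer = csv.writer(out, delimiter=' ', lineterminator='\n')
for key, values in merged.items():
    writer.writerow([key] + values)

print(out.getvalue(), end='')
```

The OrderedDict keeps keys in the order they first appear in the input, so no sorting is needed and equal keys do not have to be adjacent.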

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 2 · 2012-04-09T18:46:25+00:00

A somewhat old-fashioned approach that better shows what you have to do ...

# space separated data
raw_data = '''DB01967 ZIPA
DB01967 PFAZ
DB01992 YVBK
DB01992 ZAP70
DB02191 ZIPA
DB02319 YQHD
DB02552 ZFPP'''

# save the raw data as text
fname = 'data1.txt'
with open(fname, 'w') as fout:
    fout.write(raw_data)

# read the data back line by line
data_dict = {}
# the with block closes the file once reading is done
with open(fname, 'r') as fin:
    for line in fin:
        # split the line into key and value at the space
        key, val = line.split()
        # form the dictionary and handle key collisions
        data_dict.setdefault(key, []).append(val)

# pretty print the dictionary (shows keys in order too)
import pprint
pprint.pprint(data_dict)

'''
{'DB01967': ['ZIPA', 'PFAZ'],
 'DB01992': ['YVBK', 'ZAP70'],
 'DB02191': ['ZIPA'],
 'DB02319': ['YQHD'],
 'DB02552': ['ZFPP']}
'''

print('-'*30)  # print 30 dashes

# convert dictionary to space separated data text
new_data = ""
space = " "
newline = "\n"
# sort the keys
for key in sorted(data_dict.keys()):
    # join the key and its values with single spaces (no trailing space)
    new_data += space.join([key] + data_dict[key]) + newline

print(new_data)

'''
DB01967 ZIPA PFAZ
DB01992 YVBK ZAP70
DB02191 ZIPA
DB02319 YQHD
DB02552 ZFPP
'''
