Several values to one key

Question

pyprog 0 Light Poster

14 Years Ago

I have a file of the following format:
a 1
a 2
a 3
b 4
b 5
b 6
c 7
c 8
c 9

Here is my code:

def file_to_dict(fname):
    f = open("file.txt")
    d = {}
    for line in f:
        columns = line.split(" ")
        letters = columns[0]
        numbers = columns[1].strip()
        d[letters] = list(numbers)
    print d
    
if __name__ == "__main__":
    fname = "file.txt"

The output must be {"a": ["1", "2", "3"], "b": ["4", "5", "6"], "c": ["7", "8", "9"]}. But my output shows only the last repeated key and its value, i.e. {"a": ["3"], "b": ["6"], "c": ["9"]}. Can you help?

jlm699 320 Veteran Poster

14 Years Ago

Can you help?

Why, yes! This case would be a good one to use the dictionary's get method, which will allow you to determine if the key is already in the dictionary or not, and act accordingly.

def file_to_dict(fname):
    f = open("file.txt")
    d = {}
    for line in f:
        columns = line.split(" ")
        letters = columns[0]
        numbers = columns[1].strip()
        if d.get(letters):
            d[letters].append(numbers)
        else:
            d[letters] = list(numbers)
    print d
    
if __name__ == "__main__":
    fname = "file.txt"

Try that on for size.

Basically, get returns None by default if the key is not in the dictionary (you can pass a second argument to use as the default instead, but I prefer None). So first we check whether the key is already in the dictionary. If it is, we use the list method append to add the new number to the end of the existing list. If get returned None, we instead do what you did before: create a list as the value for the key.
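To make the behaviour of get described above concrete, here is a small sketch. The dictionary contents and variable names are made up for illustration only.

```python
# Illustration only: how dict.get behaves for present and missing keys.
d = {"a": ["1"]}

present = d.get("a")        # key exists  -> the stored list
missing = d.get("b")        # key missing -> None (the implicit default)
fallback = d.get("b", [])   # key missing -> the explicit default, here []

print(present, missing, fallback)

# get never inserts the key, so the dictionary is unchanged.
print("b" in d)
```

One thing to keep in mind: `if d.get(key):` tests truthiness, so a key that maps to an empty list would take the else branch just like a missing key would.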

HTH

jlm699 320 Veteran Poster

14 Years Ago

Wow, thanks a lot! The idea of using a method didn't cross my mind. But for one of my test cases I changed the numbers to two-digit numbers and now the first two-digit number in a list becomes "ripped" as in {a:}. I am just wondering how does this happen? How come it messes up only the first number?

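Because a string is an iterable object. The built-in list() converts any iterable to a list by iterating over it, so a string gets split into its individual characters. Only the first number for each key is affected because it is the only one passed through list(); the later ones are added whole with append. Example: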

>>> list( (1,2,3,4,5) )
[1, 2, 3, 4, 5]
>>> list( 'Hi my name is bob' )
['H', 'i', ' ', 'm', 'y', ' ', 'n', 'a', 'm', 'e', ' ', 'i', 's', ' ', 'b', 'o', 'b']
>>> [ 'Hi my name is bob' ]
['Hi my name is bob']
>>>

Just use the square brackets instead of list() and you should be good to go.
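Putting that advice together with the get-based loop, a minimal sketch of the corrected function could look like the following. The name `lines_to_dict`, the choice to accept any iterable of lines, and the sample data are assumptions for illustration, not something posted in the thread.

```python
def lines_to_dict(lines):
    """Group the second column under the first (hypothetical helper)."""
    d = {}
    for line in lines:
        columns = line.split(" ")
        letters = columns[0]
        numbers = columns[1].strip()
        if d.get(letters):
            # Key already seen: add the whole string to the existing list.
            d[letters].append(numbers)
        else:
            # Square brackets keep "10" as one item; list("10") would
            # split it into ['1', '0'].
            d[letters] = [numbers]
    return d

# Made-up sample with two-digit numbers in first position.
sample = ["a 10\n", "a 2\n", "b 45\n", "b 6\n"]
result = lines_to_dict(sample)
print(result)
```

An open file object is also an iterable of lines, so something like `with open("file.txt") as f: d = lines_to_dict(f)` should work the same way.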

pyprog 0 Light Poster · Answer 1 · 2009-11-11T02:44:33+00:00

Wow, thanks a lot! The idea of using a method didn't cross my mind. But for one of my test cases I changed the numbers to two-digit numbers and now the first two-digit number in a list becomes "ripped" as in {a:}. I am just wondering how does this happen? How come it messes up only the first number?

pythopian 10 Junior Poster in Training · Answer 2 · 2009-11-11T05:52:28+00:00

dict.setdefault is the way to go:

import re

def parse(text):
    # raw string so that \w, \s and \d reach the regex engine unchanged
    items = re.findall(r'^(\w+)\s+(\d+)\s*$', text, re.M)
    data = {}
    for key, val in items:
        # data.setdefault(key, []).append(val) #<= add string values
        data.setdefault(key, []).append(int(val)) #<= add int values
    return data

Test:

#text = open('data.txt', 'rt').read()
text = '''\
a 1
a 2
a 3
b 4
b 5
b 6
c 7
c 8
c 9
'''
>>> print parse(text)
{'a': [1, 2, 3], 'c': [7, 8, 9], 'b': [4, 5, 6]}

masterofpuppets · Answer 3 · 2009-11-11T22:12:40+00:00

Here's my version using the .get() method:

def file_to_dict( filename ):
    f = open( filename, "r" )
    lines = f.readlines()
    f.close()
    d = {}

    for line in lines:
        key = line.split( " " )[ 0 ]
        value = line.split( " " )[ 1 ].strip()
        d[ key ] = d.get( key, [] ) + [ value ]
    return d

#test...
>>> file_to_dict( "t.txt" )
{'a': ['1', '2', '3'], 'c': ['7', '8', '9'], 'b': ['4', '5', '6']}
>>>

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 4 · 2009-11-11T22:50:54+00:00

This might be a little easier to understand for a beginner (thanks to pythopian) ...

# parse text data into a dictionary using 
# split at newline and split at space

def parse2dict(text):
    data_dict = {}
    for line in text.split('\n'):
        if line:
            key, val = line.split()
            data_dict.setdefault(key, []).append(val)
    return data_dict

#text = open('data.txt', 'r').read()
text = """\
a 123
a 456
a 789
b 4
b 5
b 6
c 7
c 8
c 9
"""

print( parse2dict(text) )

"""
{'a': ['123', '456', '789'], 'c': ['7', '8', '9'], 'b': ['4', '5', '6']}
"""

pythopian 10 Junior Poster in Training · Answer 5 · 2009-11-12T00:19:10+00:00

here's my version using the .get() method ...

d.setdefault(key, []).append(value) is the preferred (and more efficient) Python way to express
d[ key ] = d.get( key, [] ) + [ value ], because it appends to the stored list in place instead of building a new list for every value. (Actually it's one of the recipes in the Python Cookbook.)
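A short sketch of the difference between the two idioms, using made-up keys and values. It only illustrates the in-place versus copy behaviour and is not a benchmark.

```python
# setdefault: one list per key, mutated in place.
d = {}
first = d.setdefault("a", [])    # key missing: inserts [] and returns it
first.append("1")
second = d.setdefault("a", [])   # key present: returns the existing list
second.append("2")
same_list = first is second      # the very same list object both times

# get + concatenation: a new list is built on every assignment.
e = {}
e["a"] = e.get("a", []) + ["1"]
before = e["a"]
e["a"] = e.get("a", []) + ["2"]
copied = e["a"] is not before    # the old list was copied, not extended

print(d, e, same_list, copied)
```

Both dictionaries end up with the same contents; the setdefault version just avoids copying the list each time a value is added.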

pythopian 10 Junior Poster in Training · Answer 6 · 2009-11-12T00:35:28+00:00

This might be a little easier to understand for a beginner (thanks to pythopian) ...

Vegaseat, you are right that your code is probably easier for a beginner to understand than mine. I'd like to add, though, that there is also a substantial semantic difference between the two functions:

Yours would fail for lines containing unexpected patterns (e.g. "ValueError: too many values to unpack" if there are more than 2 terms in the line). Mine would skip such lines. This is not to say that one behavior is better than the other, but the reader should be aware of the difference.
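The difference can be seen with a small sketch that runs both approaches on the same input. The sample text, with its deliberately malformed second line, is made up for illustration.

```python
import re

# Made-up input: the second line has three terms instead of two.
text = "a 1\na 2 extra\nb 3\n"

# Regex approach: only lines matching the whole pattern are collected,
# so the malformed line is silently skipped.
regex_data = {}
for key, val in re.findall(r'^(\w+)\s+(\d+)\s*$', text, re.M):
    regex_data.setdefault(key, []).append(val)

# split() approach: unpacking three terms into two names raises ValueError.
split_data = {}
error = None
try:
    for line in text.split('\n'):
        if line:
            key, val = line.split()
            split_data.setdefault(key, []).append(val)
except ValueError as exc:
    error = exc

print(regex_data)
print(split_data, error)
```

Which behaviour is preferable depends on whether bad input should be ignored or reported.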
