I have a file like this:
a,z,1
b,y
c,x,1
d,w,1
e,v
f,u
What I need to do is to create a dictionary that has the characters in the first column as keys and characters in the third column as values. The rest should be ignored, i.e. {a:1, c:1, d:1}. This is what I have so far:

def create_dict(f):
    f = open("something.txt")
    d = {}
    for line in f:
        columns = line.split(",")
        letters = columns[0]
        numbers = columns[2]

I am confused about next steps. Please, help.

All 3 Replies

The only problem is that every line of the txt file must have 3 columns, or the index goes out of range.
Edit: You can use a try/except block to update the dict only if there is a value in the third column.

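def create_dict():
    d = {}
    # the with block closes the file once the loop is done
    with open("something.txt") as f:
        for line in f:
            # strip() removes the trailing newline before splitting
            columns = line.strip().split(",")
            try:
                # columns[2] raises IndexError on two-column lines
                d[columns[0]] = columns[2]
            except IndexError:
                pass
    return d

print(create_dict())  # {'a': '1', 'c': '1', 'd': '1'}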

I usually whip up a little test program. Here you use the length of your columns list to avoid problems ...

# possible test data
data = """\
a,z,1
b,y
c,x,1
d,w,1
e,v
f,u"""

fname = "something.txt"

# write test data file ...
fout = open(fname, "w")
fout.write(data)
fout.close()


def create_dict(fname):
    fin = open(fname, "r")
    d = {}
    for line in fin:
        columns = line.split(",")
        if len(columns) > 2 :
            letter = columns[0]
            number = columns[2].rstrip()
            d[letter] = number
    return d

print( create_dict(fname) )  # {'a': '1', 'c': '1', 'd': '1'}
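
If you want the values as integers rather than strings, the same length check still works. The following is only a sketch: the name create_int_dict and the int() conversion are my assumptions, not something from the original question, and it assumes the third column always holds a valid integer. It also uses a with block so the file is closed, and writes the sample data to a temporary file to stay self-contained.

```python
import os
import tempfile


def create_int_dict(fname):
    """Map column 1 to int(column 3) for lines with at least 3 columns."""
    d = {}
    with open(fname, "r") as fin:
        for line in fin:
            columns = line.strip().split(",")
            if len(columns) > 2:
                # assumption: the third column is always a valid integer
                d[columns[0]] = int(columns[2])
    return d


# write the sample data to a temporary file (hypothetical stand-in
# for something.txt)
sample = "a,z,1\nb,y\nc,x,1\nd,w,1\ne,v\nf,u"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

result = create_int_dict(tmp.name)
os.remove(tmp.name)
print(result)  # {'a': 1, 'c': 1, 'd': 1}
```

If a third column could hold something other than digits, int() would raise ValueError, so you would need to decide whether to skip or report such lines.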

Resort to regular expressions. Begin with:

import re
s = open('something.txt', 'rt').read()  # file() exists only in Python 2

Then, if it's OK for the dictionary values to be strings, use:

d = dict( re.findall(r'^(\S),\S,(\d)$', s, re.M) )

Otherwise use:

d = dict( (key, int(val)) for (key, val) in re.findall(r'^(\S),\S,(\d)$', s, re.M) )

In both cases, the key is

re.findall(r'^(\S),\S,(\d)$', s, re.M)

It will extract only the lines that match the expected pattern (non-blank,non-blank,digit), and return a list of tuples containing the 1st and 3rd item. All that's left to do then is to convert these tuples into a dictionary of the desired format (string => string or string => int).

Depending on what your expected pattern is, you may substitute '\w' or '[A-Za-z]' for '\S'. Consult the re module docs for more information.
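
To see what the findall call returns on the sample data from the question, here is a small sketch. It keeps the data in a string instead of reading something.txt, and the names pairs, str_dict and int_dict are just for illustration.

```python
import re

# sample data from the question, held in a string instead of a file
s = "a,z,1\nb,y\nc,x,1\nd,w,1\ne,v\nf,u"

# re.M makes ^ and $ match at the start and end of every line
pairs = re.findall(r'^(\S),\S,(\d)$', s, re.M)
print(pairs)  # [('a', '1'), ('c', '1'), ('d', '1')]

str_dict = dict(pairs)                             # string => string
int_dict = {key: int(val) for key, val in pairs}   # string => int
print(str_dict)  # {'a': '1', 'c': '1', 'd': '1'}
print(int_dict)  # {'a': 1, 'c': 1, 'd': 1}
```

The two-column lines (b,y and so on) never match because the pattern requires two commas on the same line, so they are dropped without any explicit length check.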
