I've got a problem replacing letters with specific vectors stored in a file. The first file contains a list of entries like "x = 0000101". The second file contains the target words in the 15th column. I tried to build a dictionary from the data in the first file and loop through the words in the second one, but did not succeed. Something like:

import string
t = 0
while t < 5000:
    line = string.split(words[t])
    new = []
    for x in line[14]:
        if x in da:
            new.append(da[x])
    t += 1

Could anyone give me a suggestion about how to proceed?



Sure, the final product will be a list of words coded in vector style. The word 'yo' from the 15th column in the file would be coded as '0001100 0011100'. The vectors are stored in a separate file and need to be merged with the words...


You could use a list of tuples.

# working with a list of tuples
tuple1 = ('yo', '0001100 0011100')
tuple2 = ('do', '0001101 0011100')
tuple3 = ('mm', '0001111 0011100')

list_of_tuples = [tuple1, tuple2, tuple3]
print list_of_tuples

# optional sort (in place, alphabetically by word)
list_of_tuples.sort()
print list_of_tuples

# search for a word (or a vector)
for tup in list_of_tuples:
    if 'mm' in tup:
        word, vector = tup
        print word, vector
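
If the pairs are already in a list of tuples, one option is to turn the list into a dict so that each word can be looked up directly instead of scanning the whole list. This is only a rough sketch reusing the sample tuples above. The names vector_by_word and lookup are made up for illustration, and the single-argument print(...) form is used so the snippet runs under both Python 2 and Python 3.

```python
# sample (word, vector) pairs, same values as in the example above
list_of_tuples = [
    ('yo', '0001100 0011100'),
    ('do', '0001101 0011100'),
    ('mm', '0001111 0011100'),
]

# dict() accepts a sequence of (key, value) pairs
vector_by_word = dict(list_of_tuples)

def lookup(word):
    """Return the vector for word, or None if the word is unknown."""
    return vector_by_word.get(word)

print(lookup('mm'))
```

This only helps when searching by word. Searching by vector would still need a loop over the list or a second dict keyed by vector.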



Thanks for your quick reply, but I'm afraid it won't be very helpful: I would have the same problem creating the tuples. I'll try to explain my problem again:
One file contains 40 entries like 'c = 0100', 'a = 0001' and 't = 1000'. The other contains over 5000 words coded like '##c#at'. The words need to be transformed into a binary vector like '000000000100000000011000', where # stands for '0000'. As I'm new to Python, I haven't managed to simply loop through the files and create the new vectors. Could anyone please give me a hint about how to tackle this problem?



OK, here is a solution ;) It assumes the second file has one word per line (for example ##c#at) and the first file has one entry per line, like this:

a = 0001
c = 0100
t = 1000



Here we go:

# open() is the preferred way to open files; file() is a Python 2 only alias
one = open("one.dat", "r")
two = open("two.dat", "r")

# read in the first file and create a dict:
d = {}
for line in one:
    # we remove the newline and split the line by =
    k,v = line.strip().split(" = ")
    d[k] = v
d["#"] = "0000"

for line in two:
    result = ""
    # remove the newline
    line = line.strip()
    # for every character in the line ...
    for char in line:
        # check if there is an entry for it in the dict
        if char in d:
            # if yes, add the value to the result-string
            result += d[char]
        else: pass
    # print the result
    print result
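
The same idea can also be wrapped in two small functions, which makes it easier to try out without the files. What follows is only a sketch: parse_codes and encode are names invented here, and the two lists stand in for the contents of the files (assumed sample data taken from the examples in this thread). It uses the single-argument print(...) form so it runs under both Python 2 and Python 3.

```python
def parse_codes(lines):
    """Build a dict from lines such as 'c = 0100'."""
    codes = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        key, value = line.split("=")
        codes[key.strip()] = value.strip()
    return codes

def encode(word, codes):
    """Concatenate the code of every character that has an entry in codes.

    Characters without an entry are skipped, as in the loop above.
    """
    return "".join(codes[char] for char in word if char in codes)

# stand-ins for the contents of the two files (assumed sample data)
code_lines = ["a = 0001", "c = 0100", "t = 1000"]
word_lines = ["##c#at"]

codes = parse_codes(code_lines)
codes["#"] = "0000"  # the filler character is not listed in the first file

for line in word_lines:
    print(encode(line.strip(), codes))  # prints 000000000100000000011000
```

If the word sits in the 15th whitespace-separated column, as in the original question, something like line.split()[14] could pick it out before calling encode. That column layout is an assumption about the file, so adjust the index or separator as needed.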

Hope this helps.

Regards, mawe
