Say I have a list of lists as follows (which I do):

[['Value1', 'Value2', 'Value3'], ['Value1', 'Value2', 'Value3'], ['Value1', 'Value2', 'Value3'], ['Value1', 'Value2', 'Value3']]

What I want to accomplish is if in any list Value1 & Value2 equal the Value1 & Value2 of any other list, I want to merge those two lists so only one is preserved and the two values for Value3 are summed (then append the list back to the list of lists).

I'm thinking along the lines of looping through the lists within the list, comparing them for matches where Value1 & Value2 are equal, and somehow merging them so that Value3 is summed. Does anyone have an idea?

Your description is scrambled. Do you want to remove duplicates from the list?
Like:

[['Value1', 'Value2', 'Value3'], ['Value1', 'Value2', 'Value3'], ['Value1', 'Value2', 'Value3'], ['Value1', 'Value2', 'Value3']]
becomes [['Value1', 'Value2', 'Value3']]

Sorry, I want

[['AAA', 'BBB', '111'], ['CCC', 'DDD', '222'], ['CCC', 'DDD', '333'], ['EEE', 'FFF', '444']]

to become

[['AAA', 'BBB', '111'], ['CCC', 'DDD', '555'], ['EEE', 'FFF', '444']]

Use sorted() and itertools.groupby()

from itertools import groupby

data = """
BBB AAA 111
CCC DDD 222
BBB AAA 333
EEE FFF  77
CCC DDD 99
"""

def pair(item):
    return tuple(item[:2])

L = [x.strip().split() for x in data.strip().splitlines()]
print(L)
# Sort so equal (Value1, Value2) pairs are adjacent, then sum Value3 per group.
R = [[x, y, str(sum(int(z[2]) for z in g))]
     for (x, y), g in groupby(sorted(L, key=pair), key=pair)]
print(R)

""" my output -->
[['BBB', 'AAA', '111'], ['CCC', 'DDD', '222'], ['BBB', 'AAA', '333'], ['EEE', 'FFF', '77'], ['CCC', 'DDD', '99']]
[['BBB', 'AAA', '444'], ['CCC', 'DDD', '321'], ['EEE', 'FFF', '77']]
"""

A slightly different variant of Gribouillis' code. Keeping the numbers as strings does not make much sense, so this version uses a dictionary and a set of keys, and gives the result as a sorted list of dictionary items (not the format requested):

data = """
BBB AAA 111
CCC DDD 222
BBB AAA 333
EEE FFF  77
CCC DDD 99
"""

def pair(item):
    return tuple(item[:2]), item[-1]

data_list = [pair(x.strip().split()) for x in data.strip().splitlines()]

# Storing the sums as strings is odd for this application,
# so the totals are kept as integers.
summed = dict((item, sum(int(n) for key, n in data_list if key == item))
              for item in set(key for key, n in data_list))
# A slightly different result format that makes more sense
print(sorted(summed.items()))
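
If the original list-of-lists format is still wanted, the dictionary can be turned back into rows. A small sketch, assuming a dictionary shaped like the one the code above builds (the literal values below are filled in by hand for illustration, and the name rows is made up):

```python
# Hand-written stand-in for the dictionary built above:
# keys are (Value1, Value2) tuples, values are integer totals.
summed = {('BBB', 'AAA'): 444, ('CCC', 'DDD'): 321, ('EEE', 'FFF'): 77}

# Rebuild [Value1, Value2, Value3] rows, turning each total back into a string.
rows = [[a, b, str(total)] for (a, b), total in sorted(summed.items())]
print(rows)
```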

Still another version which doesn't need to sort the items (L could be any iterable)

from functools import reduce

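def my_add(D, item):
    # Tuple parameters were removed in Python 3, so unpack inside the function.
    x, y, z = item
    D[(x, y)] = int(z) + D.get((x, y), 0)
    return D

# L is the list of rows built in the first snippet above.
print(reduce(my_add, L, dict()))

""" output (Python 3.7+, where dicts keep insertion order) -->
{('BBB', 'AAA'): 444, ('CCC', 'DDD'): 321, ('EEE', 'FFF'): 77}
"""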

Many thanks to both of you. I tried the first one and it works, so I'm marking the thread as solved. I'm looking at the sorted function to see whether I can preserve the original order, and I'll explore the other two solutions as well. Thanks again.


To keep first-occurrence order, look at collections.OrderedDict.
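
One way this could look, as a rough sketch rather than a definitive solution: accumulate the totals in an OrderedDict keyed on (Value1, Value2), then rebuild the rows. The function name merge_rows and the variable names are made up for this example; the input is the sample data from the question.

```python
from collections import OrderedDict

def merge_rows(rows):
    """Sum Value3 for rows sharing (Value1, Value2), keeping first-occurrence order."""
    totals = OrderedDict()
    for a, b, n in rows:
        # A key is inserted the first time it is seen, which fixes its position.
        totals[(a, b)] = totals.get((a, b), 0) + int(n)
    # Convert back to the [Value1, Value2, Value3] format with string totals.
    return [[a, b, str(total)] for (a, b), total in totals.items()]

rows = [['AAA', 'BBB', '111'], ['CCC', 'DDD', '222'],
        ['CCC', 'DDD', '333'], ['EEE', 'FFF', '444']]
merged = merge_rows(rows)
print(merged)
```

No sorting is needed, so the rows come out in the order their keys first appeared. On Python 3.7 and later a plain dict preserves insertion order as well, so OrderedDict mainly matters on older versions.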
