Hey guys,

I've been trying to solve this problem many ways now, and am really just not familiar enough with the language to see an elegant solution. I have a dictionary, where each value is itself a list of lists. For example:

keyA:([x1,y1,z1], [x2,y2,z2] etc...)

Within a key, I would like to sort these lists based on one value. For example, in keyA, I want to sort the entries by value 'y'. Therefore, keyA would still contain the same lists, but they would be arranged numerically from smallest to greatest based on the value of y1, y2 etc... The Y entires are always going to be integers and the sorting should be numerical.

The reason that I am doing this is I have a data file with entries:

NameA: x, y, z
NameA: x2, y2, z2
NameA: x3, y3, z3
NameB: x, y, z etc...

I am trying to sort the data within a set of 'Names'. The dictionary seemed an obvious choice, but I'm amenable to all other options.

That is just normal sort. What is your problem?
What you mean by big dictionary? Gigabytes?

Edited 5 Years Ago by pyTony: n/a

That is just normal sort. What is your problem?
What you mean by big dictionary? Gigabytes?

I guess big/large was a bad choice of words, I mean complicated. A simple dictionary would be key:value where value is a single entry; however, my value is a list of lists.

Key: [[list1][list2]...[list]]

For each key, I want to sort the values by a certain entry in each list. For example, say each list has values [x, y, z]. I want to sort the values so that the lists are ordered by value y, for example.

Here one loop for sorting one by one the values by each element, taking model of first list in value of one key.

korvatunturi = { 'Father Christmas': [[130, 123,12], [12,223,122]],
                 'Rudolf' : [[12,3,23], [235,24,3]] }

for index, element in enumerate(korvatunturi.values()[0][0]):
    print('Values sorted by %i' % index)
    for value in korvatunturi:
        korvatunturi[value].sort(key = lambda x: x[index])
    print(korvatunturi)

Edited 5 Years Ago by pyTony: n/a

Keys aren't ordered in dict, nothing to do with sorting the values, however, which are lists.

Keys aren't ordered in dict, nothing to do with sorting the values, however, which are lists.

Not true.
If the values were lists the dictionary would be sortable. However, the keys converted to a list can be read in any order when organized. But then you'll have a list or two. A dictionary is mostly like a keyed set, not a list.

You can sort a dictionary by key or value with the sorted() function

My book is wrong.

Dic data can be sorted even if it conytain lists in list. But you have to sort them out yourself. They are not sorted automaticaly.

The solution of tonyjv can be written in a single line. Only sort on one index here.

korvatunturi = { 'Father Christmas': [[130, 123,12], [12,223,122]],
                 'Rudolf' : [[12,3,23], [235,24,3]] }

dummy = [v.sort(key = lambda x: x[index]) for v in korvatunturi.values()]

Because the lists are sorted in place you don't need the result of the list comprehension, and no need to know the dictionary keys. List comprehension is faster then

for v in korvatunturi.values():
    v.sort(key = lambda x: x[index])

The solution of tonyjv can be written in a single line. Only sort on one index here.

korvatunturi = { 'Father Christmas': [[130, 123,12], [12,223,122]],
                 'Rudolf' : [[12,3,23], [235,24,3]] }

dummy = [v.sort(key = lambda x: x[index]) for v in korvatunturi.values()]

Because the lists are sorted in place you don't need the result of the list comprehension, and no need to know the dictionary keys. List comprehension is faster then

for v in korvatunturi.values():
    v.sort(key = lambda x: x[index])

It is maybe faster, but it is considered bad style to use list comprehension only for side effects. If it is faster, must be checked, as we prepare one list we never use. So if you want to use this form, this could be slightly better:

korvatunturi = { 'Father Christmas': [[130, 123,12], [12,223,122]],
                 'Rudolf' : [[12,3,23], [235,24,3]] }

for nothing in (v.sort(key = lambda x: x[index]) for v in korvatunturi.values()):
    pass

As readability suffers, usually it is better to avoid to use this form, if the slight posssible difference in speed is not critical.

I dont even see the need for list comprehension here.... The speed dif. here is minimal or nothing at all.

Readability is null. There are good occassions list comp. is good but not certainly this.

Edited 5 Years Ago by richieking: n/a

I just wanted to show the code can be written as a oneliner.

A list comprehension is faster then a generator if you need to construct a list based on another list. And certainly faster then a for-loop and list.append(). A generator is useful if you have a stop criteria and don't need all the elements of the source list. In this case it turns out that another construct is faster and more readable. The case is: we can modify the lists in place.

# benchmark of Dictionary sort
# http://www.daniweb.com/forums/thread333826.html

import random

def make_dict(numKeys=500, numElem=100):
    random.seed(123) # same dict everytime
    return dict([(k, [ [1,random.randint(0,5000),2] for i in range(numElem) ] ) for k in range(numKeys)])

test_dict = make_dict()

def sort_forloop():
    korvatunturi = make_dict()
    index = 1
    for v in korvatunturi.values():
        v.sort(key = lambda x: x[index])

def sort_generator():
    korvatunturi = make_dict()
    index = 1
    for nothing in (v.sort(key = lambda x: x[index]) for v in korvatunturi.values()):
        pass

def sort_listcomp():
    korvatunturi = make_dict()
    index = 1
    dummy = [v.sort(key = lambda x: x[index]) for v in korvatunturi.values()]

def listcomp():
    dummy = [v[50] for v in test_dict.values()]

def generator():
    for n in (v[50] for v in test_dict.values()):
        pass

if __name__=='__main__':
    from timeit import Timer
    count = 50
    t = Timer("make_dict()", "from __main__ import make_dict")
    dur_make = t.timeit(count)
    print "make_dict()", dur_make
    t = Timer("sort_forloop()", "from __main__ import sort_forloop")
    print "sort_forloop()", t.timeit(count)-dur_make
    t = Timer("sort_generator()", "from __main__ import sort_generator")
    print "sort_generator()", t.timeit(count)-dur_make
    t = Timer("sort_listcomp()", "from __main__ import sort_listcomp")
    print "sort_listcomp()", t.timeit(count)-dur_make
    count = 10000
    t = Timer("generator()", "from __main__ import generator")
    print "generator()", t.timeit(count)
    t = Timer("listcomp()", "from __main__ import listcomp")
    print "listcomp()", t.timeit(count)

the result of this timing program is on a 3GHz Pentium-4

make_dict() 17.8156466388
sort_forloop() 2.66801177733
sort_generator() 2.88505606159
sort_listcomp() 2.8512302139
generator() 1.80397921625
listcomp() 1.5282277765

A clear for-loop is the fastest if you only need the side-effect and not the elements.
The last two timings show that a generator is slower then a list comprehension.
And I think the generator construct looks less readable then the other two versions

This article has been dead for over six months. Start a new discussion instead.