Hi,
I want to sort two merged iterables with heapq.
Everything works, except that words starting with a Turkish character are placed at the end of the result.

Example: Ö, Ş, İ ... end up at the end of the result.

How can I solve this problem?

Your question did not make it into the English language well enough to understand. Also: Please post your code (which should be short, if possible). Be sure to use the CODE button.

I am using the code below to merge two iterables. Everything works, but words that start with Ö (Örnek), Ş (Şelale), İ (İstanbul), Ü (Ürgüp) ... are not sorted into the correct place: the letter "ö" must come after "o", "ş" must come after "s", and so on.

The result is ordered A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, Ç, İ, Ö, Ş, Ü ...

It should be A, B, C, Ç, ... I, İ, ... O, Ö, ... U, Ü, ...

Sorry for my English.

## {{{ http://code.activestate.com/recipes/491285/ (r3)
import heapq

def imerge(*iterables):
    '''Merge multiple sorted inputs into a single sorted output.

    Equivalent to:  sorted(itertools.chain(*iterables))

    >>> list(imerge([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
    [0, 1, 2, 3, 4, 5, 5, 7, 8, 10, 15, 20, 25]

    '''
    heappop, siftup, _StopIteration = heapq.heappop, heapq._siftup, StopIteration

    h = []
    h_append = h.append
    for it in map(iter, iterables):
        try:
            next = it.next
            h_append([next(), next])
        except _StopIteration:
            pass
    heapq.heapify(h)

    while 1:
        try:
            while 1:
                v, next = s = h[0]      # raises IndexError when h is empty
                yield v
                s[0] = next()           # raises StopIteration when exhausted
                siftup(h, 0)            # restore heap condition
        except _StopIteration:
            heappop(h)                  # remove empty iterator
        except IndexError:
            return


if __name__ == '__main__':
    import doctest
    doctest.testmod()
## end of http://code.activestate.com/recipes/491285/ }}}

You must use locale and the cmp parameter for sorting:

# -*- coding: cp1254 -*-
import locale
print locale.getlocale()
locale.setlocale(locale.LC_ALL, locale='Turkish_Turkey')
print locale.getlocale()

names = 'Mary, Adam, Jane, Sall, Istar, Omar, Paul, Ulla, Zackarias, Örnek, Şelale, İstanbul, Ürgüp'.split(', ')
print ','.join(sorted(names))
print ','.join(sorted(names, cmp=locale.strcoll))

It is generally recommended to use the key parameter instead of cmp. Here is a version that uses the cmp_to_key snippet from http://wiki.python.org/moin/HowTo/Sorting for that purpose (Python 2.7 and 3.2 also provide it as functools.cmp_to_key).

# -*- coding: cp1254 -*-
import locale

def cmp_to_key(mycmp):
    'Convert a cmp= function into a key= function'
    class K(object):
        def __init__(self, obj, *args):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        def __ne__(self, other):
            return mycmp(self.obj, other.obj) != 0
    return K

print locale.getlocale()
locale.setlocale(locale.LC_ALL, locale='Turkish_Turkey')
print locale.getlocale()

names = 'Mary, Adam, Jane, Sall, Istar, Omar, Paul, Ulla, Zackarias, Örnek, Şelale, İstanbul, Ürgüp'.split(', ')
print ','.join(sorted(names))
print ','.join(sorted(names, key=cmp_to_key(locale.strcoll)))

For merging see: http://www.daniweb.com/software-development/python/code/325235/1389338#post1389338

heapq.merge does not have a key parameter in Python 2.7 or 3.2, though.
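
From Python 3.5 onward, heapq.merge does accept a key argument, so the merge can be done without wrapping the strings. A minimal sketch, assuming Python 3.5+ and a hand-written turkish_key helper (a hypothetical name, not from this thread) used instead of a locale, so that it does not depend on which locales are installed. The alphabet string interleaves upper and lower case and adds Q, W and X, which is my own assumption:

```python
from heapq import merge

# Turkish alphabet order; Q, W and X are added after their usual
# neighbours as an assumption, since they occur in foreign names.
ALPHABET = "AaBbCcÇçDdEeFfGgĞğHhIıİiJjKkLlMmNnOoÖöPpQqRrSsŞşTtUuÜüVvWwXxYyZz"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

def turkish_key(word):
    """Map a word to a list of alphabet positions; unknown characters sort last."""
    return [RANK.get(ch, len(ALPHABET)) for ch in word]

a = sorted(["Mary", "Adam", "Şelale", "İstanbul", "Omar"], key=turkish_key)
b = sorted(["Istar", "Örnek", "Sall", "Ürgüp", "Zackarias"], key=turkish_key)

# Both inputs must already be sorted with the same key.
merged = list(merge(a, b, key=turkish_key))
print(", ".join(merged))
# Adam, Istar, İstanbul, Mary, Omar, Örnek, Sall, Şelale, Ürgüp, Zackarias
```

merge() stays lazy, so the list() call is only there for printing; with large inputs you can iterate over the result directly.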

Thanks for your code.
I haven't tried it yet, but in my code I have two merged iterable objects...
I suppose it will work.

It may slow things down, but you could also define your own string class for Turkish words, with a specific comparison function

# -*-coding: utf8-*-
# tested with python 2.6
from random import shuffle
from heapq import merge

class turkish(unicode):
    def __new__(cls, *args):
        return unicode.__new__(cls, *args)
    
    def key(self, n = None):
        m = self.mapping
        return tuple(m[c] for c in self[:len(self) if n is None else n])
    
    def __lt__(self, other):
        m = min(len(self), len(other))
        u, v = self.key(m), other.key(m)
        return u < v or (u == v and len(self) < len(other))

    alphabet = u"ABCÇDEFGĞHIİJKLMNOÖPRSŞTUÜVYZabcçdefgğhıijklmnoöprsştuüvyz"
    mapping = dict((c, i) for i, c in enumerate(alphabet))

def printlist(L):
    print ", ".join(L)
    
if __name__ == "__main__":
    names = [ turkish(n) for n in u'Mary, Adam, Jane, Sall, Istar, Omar, Paul, Ulla, Zackarias, Örnek, Şelale, İstanbul, Ürgüp'.split(', ') ]
    shuffle(names)
    A = sorted(names[:7])
    B = sorted(names[7:])

    printlist(A)
    printlist(B)
    printlist(merge(A, B))
    
""" my output -->
Adam, İstanbul, Mary, Omar, Paul, Şelale, Zackarias
Istar, Jane, Örnek, Sall, Ulla, Ürgüp
Adam, Istar, İstanbul, Jane, Mary, Omar, Örnek, Paul, Sall, Şelale, Ulla, Ürgüp, Zackarias
"""

You may also want to add more characters to the alphabet, like space and digits...
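
One possible way to do that, sketched as a plain key function for Python 3 rather than a string subclass. The names PREFIX, ORDER and sort_key are hypothetical, and putting space and digits before all letters is an assumption on my part. Characters that are not in the table at all (Q, W, X, punctuation) fall back to code point order after every known character, so the lookup never raises KeyError:

```python
# Characters that should sort before any letter (assumption: space, then digits).
PREFIX = " 0123456789"
LETTERS = "ABCÇDEFGĞHIİJKLMNOÖPRSŞTUÜVYZabcçdefgğhıijklmnoöprsştuüvyz"
ORDER = {ch: i for i, ch in enumerate(PREFIX + LETTERS)}

def sort_key(word):
    """Known characters use their table position; others sort after them by code point."""
    return [(0, ORDER[ch]) if ch in ORDER else (1, ord(ch)) for ch in word]

words = ["Ürgüp 2", "Ürgüp", "Ürgüp 10", "3 Ocak", "Quebec", "Örnek"]
print(sorted(words, key=sort_key))
# ['3 Ocak', 'Örnek', 'Ürgüp', 'Ürgüp 10', 'Ürgüp 2', 'Quebec']
```

Note that digits are compared character by character, so "10" sorts before "2"; numeric ordering would need extra handling.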

Hi again,
it gave a "sorted() got an unexpected keyword argument 'key'" error. Why do I get this error?

Your Python may not be up to date: the current version for Python 2 is 2.7.2, and for Python 3 it is 3.2.1.

I found that locale actually has a key function as well:

# -*- coding: cp1254 -*-
import locale

print(locale.getlocale())
locale.setlocale(locale.LC_ALL, locale='Turkish_Turkey')
print(locale.getlocale())

names = 'Mary, Adam, Jane, Sall, Istar, Omar, Paul, Ulla, Zackarias, Örnek, Şelale, İstanbul, Ürgüp'.split(', ')
print(','.join(sorted(names)))
print(','.join(sorted(names, key=locale.strxfrm)))

My python refuses to set locale to 'Turkish_Turkey' , it looks like a portability issue.
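
Locale names differ between platforms, which is probably what is happening here. A rough sketch of a workaround, assuming Python 3: try several spellings and fall back gracefully. The candidate names and the helper set_turkish_locale are my own assumptions ('Turkish_Turkey' is the Windows spelling used above, the 'tr_TR' forms are common on Linux and macOS), and none of them is guaranteed to be installed:

```python
import locale

# Candidate names are assumptions; which one works depends on the system.
CANDIDATES = ("tr_TR.UTF-8", "tr_TR", "Turkish_Turkey")

def set_turkish_locale(candidates=CANDIDATES):
    """Try each locale name in turn; return the one that worked, or None."""
    for name in candidates:
        try:
            locale.setlocale(locale.LC_ALL, name)
            return name
        except locale.Error:
            continue
    return None

chosen = set_turkish_locale()
names = ["Mary", "Örnek", "Omar", "Şelale", "Sall"]
if chosen is None:
    # No Turkish locale available: plain code point order is all we can do here.
    print("No Turkish locale installed")
    result = sorted(names)
else:
    result = sorted(names, key=locale.strxfrm)
print(result)
```

If no candidate works, a hand-written alphabet table like the one in the string class above is the portable alternative.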

Can we use it on iterable objects? I have tried, but it does not work...
I want to use it on a Django queryset result, which is iterable.

Post your code and a working data file (attach it to the post in the Advanced Editor), and post the exact error messages.

Can't you just make a list from the iterable? Something like this (I don't know Django, but in principle):

results = sorted(django.queryset(), key=locale.strxfrm)
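
To illustrate the principle without Django, here is a small Python 3 sketch in which a generator of namedtuples stands in for the query result. City, fake_queryset and turkish_key are all hypothetical names, and the name attribute is an assumption about what the rows look like. An alphabet table is used instead of locale.strxfrm so the result does not depend on installed locales:

```python
from collections import namedtuple

# Hypothetical stand-in for a model instance with a `name` field.
City = namedtuple("City", ["name"])

def fake_queryset():
    """Yield rows lazily, the way an iterable query result would."""
    for name in ["Sall", "Şelale", "Omar", "Örnek", "Istar", "İstanbul"]:
        yield City(name)

ALPHABET = "ABCÇDEFGĞHIİJKLMNOÖPRSŞTUÜVYZabcçdefgğhıijklmnoöprsştuüvyz"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

def turkish_key(text):
    # Characters outside the table sort after all known ones.
    return [RANK.get(ch, len(ALPHABET)) for ch in text]

# sorted() consumes any iterable and returns a new list.
results = sorted(fake_queryset(), key=lambda row: turkish_key(row.name))
print([row.name for row in results])
# ['Istar', 'İstanbul', 'Omar', 'Örnek', 'Sall', 'Şelale']
```

The key function receives whole rows, so it has to pull out the string to compare; passing locale.strxfrm directly only works when the items themselves are strings.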