Hi
I am developing a spell checker for my language, and I came across
solving an interesing problem. It would be great if you can suggest
me an optimized solution for the problem described below:

I have certain group of letters like these:

Group #1: c,r,b
Group #2: a,z,k
Group #3: h,t
.
.

Letters within the same group can be substituted/replaced with the
rest of the letters in that group.

(ie. Group #1 means letter 'c' can be replaced with either 'r' or 'b',
and letter r can be replaced with either c or b, and letter b can be
replaced with either r or c)

A group can have upto 3 letters, and letters do not overlap between
groups. There can be upto 25 such groups.
Given a word containing above letters, (e.g. 'cat'.) I want to
generate all possible words by subsituting letters in the groups.

eg. expected output:

cat rat bat
czt rzt bzt
ckt rkt bkt
cah rah bah
czh rzh bzh
ckh rkh bkh

My code is given below. But for complex words like 'cccccccccczzk' it
thows the 'maximum recursion depth exceeded' error. Therefore, can you
suggest me an optimized solution to achive the above task? Thanks in

``````#-*- coding: UTF-8 -*-
import re
def getPermutations(word):
def getSimilarLetters(letter):
pattern=u"|c,r,b|a,z,k|h,t|"
list=[]
ptrn=re.compile("(?P<grp>\|(.?,)*?"+letter+"(,.)*?\|)")
k=ptrn.search(pattern)
if k:
letterlist=k.group("grp")[1:-1]
list=letterlist.split(",")
list.remove(letter)
return list
else:
return letter

list=[]

def perm(word):
posi=0
wrd=word
for w in word:
posi=posi+1
lst=getSimilarLetters(w)
if len(lst)>0:
for l in lst:
newwrd=wrd[:posi-1]+l+wrd[posi:]
if not (newwrd in list):
list.append(newwrd)
perm(newwrd)

try:
perm(word)
except RuntimeError:
list=[]
list.append("error")
return list

list=getPermutations(u"cat")

for i in list:
print i.encode('utf-8')``````

Here is a possible solution to this problem. I don't know if it's "optimal" (in which sense ?). However, you shouldn't reach a recursion limit, because the recursion depth is proportional to the logarithm of the length of the word.

``````#-*- coding: UTF-8 -*-

class PermutatorException(Exception):
pass

class Permutator(object):

def __init__(self, words):
self.groups = {}
words = [''.join(sorted(set(w))) for w in words]
for w in words:
for c in w:
if c in self.groups:
raise PermutatorException("same letter in 2 different words")
else:
self.groups[c] = w

def permutations(self, word):
n = len(word)
if n <= 1:
if n:
for c in (self.groups[word] if word in self.groups else word):
yield c
else:
yield ''
return
x = len(word)/2
w0, w1 = word[:x], word[x:]
for p0 in self.permutations(w0):
for p1 in self.permutations(w1):
yield p0 + p1

if __name__ == "__main__":
perm = Permutator(["crb", "azk", "ht"])
for word in perm.permutations("cat"):
print word``````

Thank you very much for the prompt reply. Your code works fine with English text. But when I tried to modify it to include my language, it gives following error:

``````Traceback (most recent call last):
perm = Permutator(["සෂශ", "නණ", "ළල"])
raise PermutatorException("same letter in 2 different words")
__main__.PermutatorException: same letter in 2 different words``````

But actually, the same letter is not repeated in the different groups.

The modified code is given below: (The language is Sinhala, and its in Unicode).

``````#-*- coding: UTF-8 -*-

class PermutatorException(Exception):
pass

class Permutator(object):

def __init__(self, words):
self.groups = {}
words = [''.join(sorted(set(w))) for w in words]
for w in words:
for c in w:
if c in self.groups:
raise PermutatorException("same letter in 2 different words")
else:
self.groups[c] = w

def permutations(self, word):
n = len(word)
if n <= 1:
if n:
for c in (self.groups[word] if word in self.groups else word):
yield c
else:
yield ''
return
x = len(word)/2
w0, w1 = word[:x], word[x:]
for p0 in self.permutations(w0):
for p1 in self.permutations(w1):
yield p0 + p1

if __name__ == "__main__":
perm = Permutator(["සෂශ", "නණ", "ළල"])
for word in perm.permutations("ලන"):
print word``````

Here is a possible solution to this problem. I don't know if it's "optimal" (in which sense ?). However, you shouldn't reach a recursion limit, because the recursion depth is proportional to the logarithm of the length of the word.

``````#-*- coding: UTF-8 -*-

class PermutatorException(Exception):
pass

class Permutator(object):

def __init__(self, words):
self.groups = {}
words = [''.join(sorted(set(w))) for w in words]
for w in words:
for c in w:
if c in self.groups:
raise PermutatorException("same letter in 2 different words")
else:
self.groups[c] = w

def permutations(self, word):
n = len(word)
if n <= 1:
if n:
for c in (self.groups[word] if word in self.groups else word):
yield c
else:
yield ''
return
x = len(word)/2
w0, w1 = word[:x], word[x:]
for p0 in self.permutations(w0):
for p1 in self.permutations(w1):
yield p0 + p1

if __name__ == "__main__":
perm = Permutator(["crb", "azk", "ht"])
for word in perm.permutations("cat"):
print word``````

I had the same problem in french, with accented letters, like `"éab"` . It worked when I put the `u` at the beginning of the strings. For example, I wrote `perm = Permutator([u"éab", u"àzk", u"ht"])` . There should be more robust solutions using the methods `encode` and `decode` of the classes `str` and `unicode` .

commented: great help +13

Thanks a million! It works perfectly. here goes the modified code for reference:

``````#-*- coding: UTF-8 -*-

class PermutatorException(Exception):
pass

class Permutator(object):

def __init__(self, words):
self.groups = {}
words = [''.join(sorted(set(w))) for w in words]
for w in words:
for c in w:
if c in self.groups:
raise PermutatorException("same letter in 2 different words")
else:
self.groups[c] = w

def permutations(self, word):
n = len(word)
if n <= 1:
if n:
for c in (self.groups[word] if word in self.groups else word):
yield c
else:
yield ''
return
x = len(word)/2
w0, w1 = word[:x], word[x:]
for p0 in self.permutations(w0):
for p1 in self.permutations(w1):
yield p0 + p1

if __name__ == "__main__":
perm = Permutator([u"සෂශ", u"නණ", u"ළල"])
for word in perm.permutations(u"lලpppනනlලනනlලනලනනලනලනනlllll"):
print word.encode("utf-8")``````

I had the same problem in french, with accented letters, like `"éab"` . It worked when I put the `u` at the beginning of the strings. For example, I wrote `perm = Permutator([u"éab", u"àzk", u"ht"])` . There should be more robust solutions using the methods `encode` and `decode` of the classes `str` and `unicode` .

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.