Super simple one word anagrams

14 Years Ago TrustyTony 0 1K Views

Here is solution number one for one word anagrams, I gave in discussion thread, response time in my humble Athlon PC under 70 ms.

List of words is from original posting of the discussion thread.
http://www.daniweb.com/forums/post1206616.html#post1206616

This version does not give we'd for input dew.

Next time I give final speed demon solution: preparing lookup table file.

You can prove the program without the length check part it is optional, program becomes around 3 times slower without it in my computer.

python

list.txt (420.13 KB)

The attachment preview is chopped off after the first 10 KB. Please download the entire file.

a
aardvark
abaci
aback
abacus
abaft
abalone
abandon
abandoned
abandonment
abase
abasement
abash
abashed
abashedly
abashment
abate
abatement
abattoir
abbe
abbess
abbey
abbot
abbreviate
abbreviated
abbreviation
abdicate
abdication
abdomen
abdominal
abduct
abduction
abductor
abeam
abed
aberrant
aberration
aberrational
abet
abetter
abettor
abeyance
abhor
abhorrence
abhorrent
abhorrently
abidance
abide
abiding
abidingly
ability
abject
abjection
abjectly
abjectness
abjuration
abjuratory
abjure
abjurer
ablate
ablation
ablative
ablaze
able
able-bodied
abloom
ablution
ablutions
ably
abnegate
abnegation
abnormal
abnormality
abnormally
aboard
abode
abolish
abolition
abolitionism
abolitionist
abominable
abominably
abominate
abomination
aboriginal
aborigine
aborning
abort
abortion
abortionist
abortive
abortively
abound
about
about-face
above
aboveboard
above-mentioned
abracadabra
abrade
abrasion
abrasive
abrasively
abrasiveness
abreast
abridge
abridgement
abridgment
abroad
abrogate
abrogation
abrogator
abrupt
abruptly
abruptness
abs
abscess
abscessed
abscissa
abscissae
abscission
abscond
absconder
absence
absent
absentee
absenteeism
absently
absentminded
absentmindedly
absentmindedness
absinth
absinthe
absolute
absolutely
absoluteness
absolution
absolutism
absolutist
absolve
absorb
absorbed
absorbency
absorbent
absorbing
absorbingly
absorption
absorptive
abstain
abstainer
abstemious
abstemiously
abstemiousness
abstention
abstinence
abstinent
abstract
abstracted
abstractedly
abstractedness
abstraction
abstractly
abstractness
abstruse
abstrusely
abstruseness
absurd
absurdity
absurdly
absurdness
abundance
abundant
abundantly
abuse
abuser
abusive
abusively
abusiveness
abut
abutment
abuzz
abysmal
abysmally
abyss
abyssal
acacia
academe
academia
academic
academically
academician
academics
academy
acanthus
accede
accelerate
acceleration
accelerator
accent
accented
accentual
accentuate
accentuation
accept
acceptability
acceptable
acceptableness
acceptably
acceptance
acceptation
accepted
access
accessibility
accessible
accessibly
accession
accessory
accident
accidental
accidentally
accident-prone
acclaim
acclaimed
acclamation
acclimate
acclimation
acclimatization
acclimatize
acclivity
accolade
accommodate
accommodating
accommodatingly
accommodation
accommodations
accompaniment
accompanist
accompany
accomplice
accomplish
accomplished
accomplishment
accord
accordance
accordant
according
accordingly
accordion
accordionist
accost
account
accountability
accountable
accountancy
accountant
accounting
accounts
accouter
accouterments
accoutre
accoutrements
accredit
accreditation
accredited
accretion
accrual
accrue
acculturate
acculturation
accumulate
accumulation
accumulative
accumulator
accuracy
accurate
accurately
accurateness
accursed
accursedness
accusation
accusative
accusatory
accuse
accused
accuser
accusingly
accustom
accustomed
ace
acerbate
acerbic
acerbically
acerbity
acetaminophen
acetate
acetic
acetone
acetonic
acetylene
ache
achene
achievable
achieve
achievement
achiever
achoo
achromatic
achy
acid
acidic
acidify
acidity
acidly
acidosis
acidulous
acknowledge
acknowledged
acknowledgement
acknowledgment
acme
acne
acolyte
aconite
acorn
acoustic
acoustical
acoustically
acoustics
acquaint
acquaintance
acquaintanceship
acquainted
acquiesce
acquiescence
acquiescent
acquiescently
acquirable
acquire
acquirement
acquisition
acquisitive
acquisitively
acquisitiveness
acquit
acquittal
acre
acreage
acrid
acridity
acridly
acridness
acrimonious
acrimoniously
acrimoniousness
acrimony
acrobat
acrobatic
acrobatically
acrobatics
acronym
acrophobia
acropolis
across
across-the-board
acrostic
acrylic
act
acting
actinium
action
actionable
activate
activation
activator
active
actively
activeness
activism
activist
activity
actor
actress
actual
actuality
actualization
actualize
actually
actuarial
actuary
actuate
actuation
actuator
acuity
acumen
acupressure
acupuncture
acupuncturist
acute
acutely
acuteness
acyclovir
ad
adage
adagio
adamant
adamantly
adapt
adaptability
adaptable
adaptation
adapter
adaptive
adaptor
add
addable
addend
addenda
addendum
adder
addict
addicted
addiction
addictive
addition
additional
additionally
additive
addle
add-on
address
addressee
adduce
adenine
adenoid
adenoidal
adenoids
adept
adeptly
adeptness
adequacy
adequate
adequately
adequateness
adhere
adherence
adherent
adhesion
adhesive
adhesiveness
adieu
adieux
adios
adipose
adjacency
adjacent
adjacently
adjectival
adjectivally
adjective
adjoin
adjoining
adjourn
adjournment
adjudge
adjudicate
adjudication
adjudicative
adjudicator
adjudicatory
adjunct
adjuration
adjure
adjust
adjustable
adjuster
adjustment
adjustor
adjutant
ad-lib
adman
administer
administrate
administration
administrative
administratively
administrator
admirable
admirably
admiral
admiralty
admiration
admire
admirer
admiring
admiringly
admissibility
admissible
admissibly
admission
admissions
admit
admittance
admittedly
admix
admixture
admonish
admonishment
admonition
admonitory
ado
adobe
adolescence
adolescent
adopt
adoptable
adopted
adopter
adoption
adoptive
adorable
adorableness
adorably
adoration
adore
adorer
adoring
adoringly
adorn
adornment
adrenal
adrenalin
adrenaline
adrift
adroit
adroitly
adroitness
adsorb
adsorbent
adsorption
adulate
adulation
adulator
adulatory
adult
adulterant
adulterate
adulteration
adulterer
adulteress
adulterous
adultery
adulthood
adumbrate
adumbration
advance
advanced
advancement
advances
advantage
advantageous
advantageously
advent
adventitious
adventitiously
adventure
adventurer
adventuresome
adventuress
adventurous
adventurously
adventurousness
adverb
adverbial
adverbially
adversarial
adversary
adverse
adversely
adverseness
adversity
advert
advertise
advertisement
advertiser
advertising
advertorial
advice
advisability
advisable
advisably
advise
advised
advisedly
advisement
adviser
advisor
advisory
advocacy
advocate
adz
adze
aegis
aeon
aerate
aeration
aerator
aerial
aerialist
aerially
aerie
aerobatic
aerobatics
aerobic
aerobically
aerobics
aerodynamic
aerodynamically
aerodynamics
aeronautic
aeronautical
aeronautics
aerosol
aerospace
aesthete
aesthetic
aesthetically
aestheticism
aesthetics
afar
affability
affable
affably
affair
affairs
affect
affectation
affected
affectedly
affecting
affectingly
affection
affectionate
affectionately
afferent
affiance
affidavit
affiliate
affiliation
affinity
affirm
affirmation
affirmative
affirmatively
affix
afflatus
afflict
affliction
affluence
affluent
affluently
afford
affordable
afforest
afforestation
affray
affront
afghan
aficionado
afield
afire
aflame
afloat
aflutter
afoot
aforementioned
aforesaid
aforethought
afoul
afraid
afresh
aft
after
afterbirth
afterburner
aftercare
aftereffect
afterglow
afterimage
afterlife
afterlives
aftermarket
aftermath
afternoon
aftershave
aftershock
aftertaste
afterthought
afterward
afterwards
afterword
again
against
agape
agar
agar-agar
agate
agave
age
aged
ageism
ageist
ageless
agelessly
agelessness
agency
agenda
agent
age-old
ageratum
ages
agglomerate
agglomeration
agglutinate
agglutination
aggrandize
aggrandizement
aggravate
aggravated
aggravating
aggravatingly
aggravation
aggregate
aggregation
aggression
aggressive
aggressively
aggressiveness
aggressor
aggrieve
aggrieved
aghast
agile
agilely
agility
aging
agitate
agitated
agitation
agitator
agitprop
agleam
aglitter
aglow
agnostic
agnosticism
ago
agog
agonize
agonized
agonizing
agonizingly
agony
agoraphobia
agoraphobic
agrarian
agrarianism
agree
agreeable
agreeableness
agreeably
agreed
agreement
agribusiness
agricultural
agriculturalist
agriculturally
agriculture
agriculturist
agronomic
agronomist
agronomy
aground
ague
ah
aha
ahead
ahem
ahoy
aid
aide
aide-de-camp
aides-de-camp
aigrette
ail
aileron
ailing
ailment
aim
aimless
aimlessly
aimlessness
ain't
air
airbag
airbase
airborne
airbrush
air-condition
air-conditioned
air-conditioner
air-conditioning
air-cooled
aircraft
airdrop
airfare
airfield
airflow
airfoil
airfreight
airhead
airily
airiness
airing
airless
airlessness
airlift
airline
airliner
airlock
airmail
airman
airplane
airplay
airport
airs
airship
airsick
airsickness
airspace
airstrike
airstrip
airtight
airtime
air-to-air
airwaves
airway
airworthiness
airworthy
airy
aisle
ajar
akimbo
akin
alabaster
alack
alacrity
alarm
alarmed
alarming
alarmingly
alarmist
alas
alb
albacore
albatross
albeit
albinism
albino
album
albumen
albumin
albuminous
alchemist
alchemy
alcohol
alcoholic
alcoholically
alcoholism
alcove
alder
alderman
alderwoman
ale
aleatory
alehouse
alembic
alert
alertly
alertness
alewife
alewives
alfalfa
alfresco
alga
algae
algal
algebra
algebraic
algebraically
algorithm
algorithmic
alias
alibi
alien
alienable
alienate
alienated
alienation
alienist
alight
align
aligner
alignment
alike
aliment
alimentary
alimony
alit
alive
aliveness
aliyah
alkali
alkaline
alkalinity
alkalize
alkaloid
alkyd
all
all-around
allay
all-clear
allegation
allege
alleged
allegedly
allegiance
allegoric
allegorical
allegorically
allegorist
allegory
allegretto
allegro
allele
alleluia
allergen
allergenic
allergic
allergically
allergist
allergy
alleviate
alleviation
alley
alley-oop
alleyway
all-fired
alliance
allied
allies
alligator
all-important
all-inclusive
alliterate
alliteration
alliterative
alliteratively
all-night
all-nighter
allocate
allocation
allot
allotment

## solution 1: fast enough response, simple function, fail fast
from time import clock
def isanaword(k,s):
    """ goes through the letters of second word (s) and returns it
        if first word (k) contains exactly same letters in same number
        """
    ## different length, not anagram, makes function faster (fail fast principle)
    if (len(k) != len(s)) : return ""  ## optional
    for c in s:
        pos = k.find(c)
        if pos==-1: return "" ## letter not contained in first one found
        k = k[:pos]+k[pos+1:] ## drop the letter in found position pos by slicing
    if not k: return s ## condition enables to remove length check
    
if __name__=="__main__":
    print('To quit enter empty line')
    inputword=' '
    while inputword:
        inputword=raw_input('Give word: ')
        if inputword:
            t=clock()
            for wd in [w.rstrip() for w in open('list.txt') if isanaword(w.rstrip(),inputword)]:
                print wd,
            print
            print 'Took %.2f s'%(clock()-t)

TrustyTony 888 ex-Moderator

14 Years Ago

Here little shorter alternative formulation, but still with timing, which could be removed to simplify:

## solution 1: fast enough response, simple function, fail fast
from time import clock
def isanaword(k,s):
    """ goes through the letters of second word (s) and returns it
        if first word (k) contains exactly same letters in same number
        """
    for c in s:
        pos = k.find(c)
        if pos==-1: return "" ## letter not contained in first one found
        k = k[:pos]+k[pos+1:] ## drop the letter in found position pos by slicing
    if not k: return s ## if all letters used, full anagram
    
if __name__=="__main__":
    print('To quit enter empty line')
    inputword=' '
    while inputword:
        inputword=raw_input('Give word: ')
        if inputword:
            t=clock()
            for wd in [w.rstrip()
                       for w in open('list.txt')
                       if (len(w) == len(inputword)+1 ## newline longer
                           and isanaword(w.rstrip(),inputword))]:
                print wd,
            print
            print 'Took %i ms'%((clock()-t)*1000)

Speed also impoves:

To quit enter empty line
Give word: emit
emit item mite time
Took 40 ms
Give word: omit
omit
Took 41 ms
Give word: reset
ester reset steer terse
Took 43 ms
Give word:

Edited 14 Years Ago by TrustyTony because: n/a

TrustyTony 888 ex-Moderator

14 Years Ago

Still faster:

To quit enter empty line
Give word: emit
emit
item
mite
time
Took 30 ms
Give word: omit
omit
Took 30 ms
Give word: reset
ester
reset
steer
terse
Took 33 ms
Give word: puretmoc
computer
Took 37 ms
Give word: purentoc
Took 37 ms
Give word:

## solution 1: fast enough response, simple function, fail fast
from __future__ import print_function
from time import clock

def isanaword(k,s):
    """ goes through the letters of second word (s) and returns it
        if first word (k) contains exactly same letters in same number
        """
    for c in s:
        if c not in k:
            return "" ## letter not contained in first one found
        k=k.replace(c,'',1)
    if not k.rstrip():
        return s ## if all letters used except whitespace in end
    
if __name__=="__main__":
    print('To quit enter empty line')
    inputword=' '
    while inputword:
        inputword=raw_input('Give word: ')
        li=len(inputword)+1
        if inputword:
            t=clock()
            for wd in [w.rstrip()
                       for w in open('list.txt')
                       if (len(w)==li ## newline longer
                           and isanaword(w,inputword))]:
                print(wd)
            print('Took %i ms'%((clock()-t)*1000))

Edited 14 Years Ago by TrustyTony because: n/a

jcao219 18 Posting Pro in Training

14 Years Ago

Wouldn't using itertools.permutations be easier?

TrustyTony 888 ex-Moderator

14 Years Ago

Very inefficient. Most of permutations are not valid words.

Key in efficient generation of anagrams is limiting the number of candidates for anagrams to those words that are possible.

For one word case see my lookup table generation version.
http://www.daniweb.com/code/snippet288327.html

I am planning to use my multiword anagram routine of mine commercially, so unfortunately I can not publish that.

This is nice solution for simplicity and memory use. With some tweaking, removing niceties of printing Python 3 support, It could be even made to an one-three liner.

Edited 14 Years Ago by TrustyTony because: n/a

jcao219 18 Posting Pro in Training

14 Years Ago

Oh, I see.

I notice that each time the user inputs some text,
the program will open the words file
and go through each word in that file, checking its length
against the input text's length.
Wouldn't it be more efficient if you loaded part or all of the words from the file into memory upon first user input?
(So the program doesn't have to read from the file as much)
Then, subsequent lookups may be faster.

Edited 14 Years Ago by jcao219 because: n/a

jcao219 18 Posting Pro in Training

14 Years Ago

Indeed, I just converted part of your code to IronPython, tweaked it with the modification I mentioned above, and ran it.

Look at the attachment.

Big drawback is that the first time a word's length is inputted, there is a ten ms delay.

Ipy Code:

from System.Diagnostics import Stopwatch
from System.IO import File

def isanaword(k,s):
    """ goes through the letters of second word (s) and returns it
        if first word (k) contains exactly same letters in same number
        """
    for c in s:
        if c not in k:
            return "" ## letter not contained in first one found
        k=k.replace(c,'',1)
    if not k.rstrip():
        return s ## if all letters used except whitespace in end

def populate_words_dict(length):
    wordsdata[length] = [w for w in File.ReadAllLines("list.txt")
                    if len(w) == length]
    
if __name__=="__main__":    
    wordsdata = {}
    
    print('To quit enter empty line')
    inputword=' '
    while inputword:
        inputword=raw_input('Give word: ')
        li=len(inputword)
        if inputword:
            sw = Stopwatch()
            sw.Start()
            if li not in wordsdata:
                populate_words_dict(li)            
            for wd in [w.rstrip()
                       for w in wordsdata[li]
                           if isanaword(w,inputword)]:
                print(wd)
            sw.Stop()
            print('Took {0} ms.'.format(sw.ElapsedMilliseconds))

Edited 14 Years Ago by jcao219 because: n/a

TrustyTony 888 ex-Moderator

14 Years Ago

I see, you do length by length saving version of the dictionary as in memory lookup table. Only your program should not run as wordsdata is not declared global in populate_words_dict, so it is local to function. Still it works, this I do not understand, maybe it generates the dict for every time? Looks faster after adding global for me.

Here is run after removing IronPython modules. Looks me not worth the loss in memory use and complexity. (My lookup version though is possible to simplify, even it is possible to order words by the letters function and use binary search instead of loading anything to memory)

The idea of this program is to give simple solution, as the original posted one word anagram did on the fly full dict of the whole word list in memory without saving it for later use. The lookup table version sacrifices space for time (only needing one time per dictionary anagram generation) and is surely faster than doing lookup table without saving. Even there I keep the plain human readable text format, not using pickle, which in my experience produces very big files. That is unacceptable as the programs should run even in mobile phone.

Main thing is that when you take the timing out, user of this program feels that response is immediate. For that response < 100 ms is quick enough.

I did use the lookup table solution to my full anagram program to deal with shortest anagrams (less than 2*minimum word lenghth). Also the method of one letter removing which is from user ihatehippies, helped to speed up my main program (in different function of course).

Unfortunately I found out that getting keys from anawords is for some reason slower than reading from file original word list and saving list of possible words (even it is dropping extra chars like ' and - on fly every time to allow free format of word list).

In full anagram program I only load list of possible words once, including smaller words producible from that list. After that it is not worth to reduce that set of words even all of them are not possible after one or more words have been selected.

I am using the psyco module to speed thigs up. I have not seen need to use IronPython as it is not possible to use psyco there and so it is much slower.

Here is program as I prepared to run with standard python functions:

## solution 1: fast enough response, simple function, fail fast
from __future__ import print_function ## for python 2.6
from time import clock

def isanaword(k,s):
    """ goes through the letters of second word (s) and returns it
        if first word (k) contains exactly same letters in same number
        """
    k=list(k) ## mutable list instead of string
    for c in s:
        if (c not in k):
            return "" ## letter not contained in first one found
        else:
            k.remove(c) ## remove first occurance
    if not k : return s
            
def populate_words_dict(length):
    global wordsdata
    wordsdata[length] = [w.rstrip() for w in open("list.txt")
                    if len(w) == length+1]
    
if __name__=="__main__":    
    wordsdata = {}
    
    print('To quit enter empty line')
    inputword=' '
    while inputword:
        inputword=raw_input('Give word: ')
        li=len(inputword)
        if inputword:
            t=clock()
            if li not in wordsdata:
                populate_words_dict(li)            
            for wd in [w.rstrip()
                       for w in wordsdata[li]
                           if isanaword(w,inputword)]:
                print(wd)
            print()
            print('Took %i ms'%((clock()-t)*1000))

To quit enter empty line
Give word: pucomret
computer
Took 51 ms

Give word: computer
computer
Took 27 ms

Give word: similar
similar
Took 51 ms

Give word: same
mesa
same
seam
Took 32 ms

Give word: item
emit
item
mite
time
Took 9 ms

Give word: smae
mesa
same
seam
Took 9 ms

Give word: quit
quit
Took 8 ms

Give word: eixt
exit
Took 9 ms

Give word:

Edited 14 Years Ago by TrustyTony because: Adding the global

jcao219 18 Posting Pro in Training

14 Years Ago

I see, you do length by length saving version of the dictionary as in memory lookup table. Only your program should not run as wordsdata is not declared global in populate_words_dict, so it is local to function. Still it works, this I do not understand, maybe it generates the dict for every time? Looks faster after adding global for me.

I never use global.
It does not generate the dict every time, whenever python's interpreter is inside a function, if a name is referenced inside, it first checks the local scope, and seeing that wordsdata is not there, it will then check the global scope.
Since wordsdata is mutable, it would just change that one instance.

Also, in some cases IronPython is actually faster than CPython because its compiled to CIL.

I have not seen need to use IronPython as it is not possible to use psyco there and so it is much slower.

I would much rather create a program like this in C# if speed was important because then it generates a single assembly that runs rather quickly.

Edited 14 Years Ago by jcao219 because: n/a

jcao219 18 Posting Pro in Training

14 Years Ago

Support for a wildcard * would be good.
Then people can type in happ* and get happy as a result.

Are you working on a GUI for this? For some reasons non-technical users might prefer a GUI.

TrustyTony 888 ex-Moderator

14 Years Ago

I have user interface ideas for Tkinter, Nokia'a own appuifw (only menus now) and also QT. Problem is that I would like to publish it in Symbian phones in OVI store, but neither QT (even it is supposed to be future of Nokia phones) nor Python programs can be put there, because programs can not have external dependencies and not to be too large. So I guess I must use the old interface and known to me part of C++ i.e. use it as C language which I know to use in basic level. Speed is not so much issue as I can so quickly 100 or more anagrams and continue to generate more.

For example, you could easily start to prepare dictionaries even in this simple program in other thread until user has inputted the words, the smallest word length, the dictionary used...

My program is multiword anagram program. You input Jimmy Cao, and English dictionary and length two as minimum word length, and the program spits answer in my lowly Sempron spits answer no anagrams found 0 ms processing (dictionary loading from simple text dictionary 15 ms, 15 fitting words). You have not very anagram friendly name.

My name is my test case, for English it is little slower (see screenshot of plain Tk interface version, number is test of progress indicator value, which updates only at end), in Finnish the program finds 1565 anagrams in total time of 640 ms (67 ms word selection, cleaning and loading from text file).

I have not yet progress bar working, I have one global variable updated now and then from function, but no threading implemented (I do have one after function updating the number once per second, but it seems not update during intensive calculation routine, maybe need 1 ms sleep now and then?)

Edited 14 Years Ago by TrustyTony because: n/a

jcao219 18 Posting Pro in Training

14 Years Ago

I have user interface ideas for Tkinter, Nokia'a own appuifw (only menus now) and also QT. Problem is that I would like to publish it in Symbian phones in OVI store, but neither QT (even it is supposed to be future of Nokia phones) nor Python programs can be put there, because programs can not have external dependencies and not to be too large. So I guess I must use the old interface and known to me part of C++ i.e. use it as C language which I know to use in basic level. Speed is not so much issue as I can so quickly 100 or more anagrams and continue to generate more.
For example, you could easily start to prepare dictionaries even in this simple program in other thread until user has inputted the words, the smallest word length, the dictionary used...
My program is multiword anagram program. You input Jimmy Cao, and English dictionary and length two as minimum word length, and the program spits answer in my lowly Sempron spits answer no anagrams found 0 ms processing (dictionary loading from simple text dictionary 15 ms, 15 fitting words). You have not very anagram friendly name.
My name is my test case, for English it is little slower (see screenshot of plain Tk interface version, number is test of progress indicator value, which updates only at end), in Finnish the program finds 1565 anagrams in total tiem of 640 ms (67 ms word selection, cleaning and loading from text file).
I have not yet progress bar working, I have one global variable updated now and then from function, but no threading implemented (I do have one after function updating the number once per second, but it seems not update during intensive calculation routine, maybe need 1 ms sleep now and then?)

Your program looks good.
Yes, you need threading whenever you have a GUI. That way, the program doesn't seem to freeze while you do calculations.

TrustyTony 888 ex-Moderator

14 Years Ago

I have though plan to change from return to yield and that way I have control back often to main GUI and maybe can persuade tk GUI to show nice progress bar.

Reply to this topic

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.