Hi, I've been trying to make a really simple word unscrambler that takes a scrambled word and compares it to a txt file filled with words. The problem is that I want to read only the first word of each line, and if that word matches, print out the entire line.

So, my txt file kinda looks like this:

cart - art car tar
bagel - age bag gel 
bike -

If the user inputs "ikeb", then the program would only look at "cart", "bagel", and then "bike" before printing out "bike - ". Can anybody help me?

Last Post by G-Do

Suppose you have the scrambled string, "urcoppnei".

You want to be able to quickly go from scrambled string to unscrambled. How?

Sort the letters of the string in increasing order. That gives us "ceinoppru".

Now take every word in the dictionary, and sort their letters in increasing order. Put this in a hash, where the sorted form is the key and a list of words is the value. (Use a list, not a single word, because anagrams have the same sorted form. For example, the key "dgino" would have as its value a list containing both "doing" and "dingo".)

Once you do that, look up "ceinoppru" in the hash. You'll probably find that the key "ceinoppru" maps to a list containing one element, "porcupine".
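
A minimal sketch of the lookup described above, assuming the dictionary is already in memory as a list of words. The names sort_letters, build_index and unscramble, and the tiny word list, are made up for illustration:

```python
from collections import defaultdict


def sort_letters(word):
    """Return the letters of word in increasing order, e.g. 'doing' -> 'dgino'."""
    return "".join(sorted(word))


def build_index(words):
    """Map each letter-sorted key to the list of words that share it."""
    index = defaultdict(list)
    for word in words:
        index[sort_letters(word)].append(word)
    return index


def unscramble(scrambled, index):
    """Return every known word that uses exactly the letters of scrambled."""
    return index.get(sort_letters(scrambled), [])


# Hypothetical word list standing in for a real dictionary file
index = build_index(["porcupine", "doing", "dingo", "aardvark"])
print(unscramble("urcoppnei", index))  # ['porcupine']
print(unscramble("dgino", index))      # ['doing', 'dingo']
```

Building the index costs one pass over the dictionary; after that, each lookup is a single sort plus a hash lookup, no matter how many words the dictionary holds.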

Of course, it would be better to have your dictionary file in the format

aaadkrrv aardvark
dgino doing dingo

It wouldn't hurt to sort it by the letter-sorted key.

So, you would unscramble a word by sorting its letters in ascending order and then looking up the result in the text file (or hash, or map, or whatever).
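
As a rough sketch, a file in that key-first format could be loaded like this. The function name load_index and the sample lines are assumptions for illustration, not part of anyone's actual program:

```python
def load_index(lines):
    """Build {sorted_key: [words]} from lines like 'dgino doing dingo'."""
    index = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        index[fields[0]] = fields[1:]
    return index


# Sample lines in the key-first format; with a real file you would pass
# the open file object instead, e.g. load_index(open("sorted_dict.txt")),
# where "sorted_dict.txt" is a hypothetical file name.
sample = ["ceinoppru porcupine\n", "dgino doing dingo\n"]
index = load_index(sample)

key = "".join(sorted("ignod"))  # 'dgino'
print(index.get(key, []))       # ['doing', 'dingo']
```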

I don't really understand your file format -- it looks like it's listing words with subwords. Why?

And here's a second edit. What you are asking and your sample input and output seem to be on completely different topics! You leave me baffled...


Sorry, I stated my problem poorly. I can search for the words fine. (I don't sort the letters from least to greatest. I just take the first character and see if it's in the word I'm checking, and if it is, I delete it from both. Then I go on to the next character and repeat. If all the characters are deleted, then it's a matching word.)
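
The deletion check described above could be sketched roughly as follows. The name is_scramble_of is made up, and this is one reading of the description rather than the actual code behind it:

```python
def is_scramble_of(scrambled, candidate):
    """Check whether scrambled uses exactly the letters of candidate."""
    remaining = list(candidate)
    for ch in scrambled:
        if ch not in remaining:
            return False      # scrambled has a letter the candidate lacks
        remaining.remove(ch)  # delete one matching letter
    return not remaining      # match only if every letter was used up


print(is_scramble_of("erebed", "beered"))  # True
print(is_scramble_of("erebed", "bared"))   # False
```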

That's probably not a very good way of doing it, but I was just wondering if there was any way of reading only the first word of each line in a file. So if I had a .txt file that had each word followed by its definition (see Fig 1)

[B]Fig 1.[/B]

[I]This is just a text document, not the output[/I]

bared - A naked object
beered - A drunk person
bargain - to barter

the only thing the program would check would be the first word on each line (Fig 2)

[B]Fig 2.[/B]

[I]computer looks at only the first words of each line[/I]

bared
beered
bargain

and then if it found the word, I'd print out both its definition and the word. So if the scrambled word was erebed (beered), the program would actually print out

beered - A drunk person

Sorry for the bad explanation earlier, I used a pretty bad example. But, yeah, that's the problem I'm kinda facing, and I was wondering if anyone could help me with that?


So all you need is a list where each element is the first word on each line in the text file? This is a fairly simple regular expression problem:

import re

# Open the file with read access; 'with' closes it automatically
with open("doc/dict.txt", "r") as f:
    # Read the file text as one big block
    text = f.read()

# Compile a regex pattern to catch words and definitions:
# group 1 is the whole line, group 2 is the first word on it
p = re.compile(r"^((\w+).*)$", re.MULTILINE)
# Find all matches, store in l
l = p.findall(text)
# The list 'l' is a list of tuples, such that for each
# tuple, the first coordinate is the whole line (word and
# definition), and the second coordinate is the first word
# Break the list down into two separate lists
words = []
definitions = []
for x in l:
    words.append(x[1])
    definitions.append(x[0])
# At this point, the word in words[i] corresponds to the
# line in definitions[i]
# Print them
print(words)
print(definitions)

Read the Regular Expression HOWTO in the Python documentation to figure out what ^((\w+).*)$ means. The re.MULTILINE flag makes ^ and $ match at the start and end of every line, so the last line is caught even if the file doesn't end with a newline.
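
If a regex feels heavy for this, str.split can pull out the first word instead. The sketch below swaps in that technique and combines it with the sorted-letters comparison from earlier in the thread; the function names and the sample lines (taken from Fig 1) are assumptions for illustration:

```python
def first_word(line):
    """Return the first whitespace-separated word on a line, or '' if blank."""
    fields = line.split()
    return fields[0] if fields else ""


def find_line(scrambled, lines):
    """Return the first line whose first word is an anagram of scrambled."""
    key = sorted(scrambled)
    for line in lines:
        if sorted(first_word(line)) == key:
            return line.rstrip("\n")
    return None


# Sample lines in the "word - definition" format from Fig 1; with a real
# file, pass the open file object instead of this list.
sample = [
    "bared - A naked object\n",
    "beered - A drunk person\n",
    "bargain - to barter\n",
]
print(find_line("erebed", sample))  # beered - A drunk person
```

Only the first word of each line is compared, but the whole line is returned, so the definition gets printed along with the word.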
