
Reading from a file question

Hi, I've been trying to make a really simple word unscrambler that takes a scrambled word and compares it to a txt file filled with words. The problem is that I only want to read the first word of each line, and if that word matches, print out the entire line.

So, my txt file kinda looks like this:

cart - art car tar
bagel - age bag gel 
bike -

If the user inputs "ikeb", then the program would only look at "cart", "bagel", and then "bike" before printing out "bike - ". Can anybody help me?

Rete
Newbie Poster
23 posts since Jun 2005
Reputation Points: 10
Solved Threads: 1
 

Suppose you have the scrambled string, "urcoppnei".

You want to be able to quickly go from scrambled string to unscrambled. How?

Sort the letters of the string in increasing order. That gives us "ceinoppru".

Now take every word in the dictionary, and sort its letters in increasing order. Put the results in a hash, where the sorted form is the key and a list of words is the value. (Use a list, not a single word, because anagrams have the same sorted form. For example, the key "dgino" would have as its value a list containing both "doing" and "dingo".)

Once you do that, look up "ceinoppru" in the hash. You'll probably find that the key "ceinoppru" maps to a list containing one element, "porcupine".
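As a rough sketch of the hash described above (the function names and the tiny word list are made up for illustration, not taken from any real dictionary file), it could look something like this in Python:

```python
from collections import defaultdict

def sorted_key(word):
    """Return the letters of word in ascending order, e.g. 'doing' -> 'dgino'."""
    return "".join(sorted(word))

def build_index(words):
    """Map each sorted-letter key to the list of words sharing that key."""
    index = defaultdict(list)
    for word in words:
        index[sorted_key(word)].append(word)
    return index

# Hypothetical stand-in for the words of a real dictionary file.
index = build_index(["porcupine", "doing", "dingo", "aardvark"])

print(index.get(sorted_key("urcoppnei"), []))  # ['porcupine']
print(index.get(sorted_key("gnido"), []))      # ['doing', 'dingo']
```

Using `index.get(key, [])` for lookups avoids adding empty entries to the hash when a scrambled word has no match.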

Of course, it would be better to have your dictionary file in the format

...
aaadkrrv aardvark
...
dgino doing dingo
...


It wouldn't hurt to sort it by the letter-sorted key.

So, you would unscramble by sorting the letters of the word in ascending order, and then looking for this result in the text file (or hash, or map, or whatever).
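If the file really is kept sorted by the letter-sorted key, a binary search is one way to do that lookup. The sketch below assumes each line has the form "key word word ..." and uses an in-memory list in place of the file; the `lookup` name is only for illustration.

```python
import bisect

# Assumed file contents: one "key word word ..." entry per line, sorted by key.
lines = [
    "aaadkrrv aardvark",
    "ceinoppru porcupine",
    "dgino doing dingo",
]
# Pull out the keys once so each lookup can binary-search them.
keys = [line.split()[0] for line in lines]

def lookup(scrambled):
    """Return the words whose sorted letters match scrambled, or []."""
    target = "".join(sorted(scrambled))
    i = bisect.bisect_left(keys, target)
    if i < len(keys) and keys[i] == target:
        return lines[i].split()[1:]
    return []

print(lookup("gnido"))      # ['doing', 'dingo']
print(lookup("urcoppnei"))  # ['porcupine']
```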

I don't really understand your file format -- it looks like it's listing words with subwords. Why?

And here's a second edit. What you are asking and your sample input and output seem to be on completely different topics! You leave me baffled...

Rashakil Fol
Super Senior Demiposter
Team Colleague
2,658 posts since Jun 2005
Reputation Points: 1,135
Solved Threads: 177
 

Sorry, I stated my problem poorly. I can search for the words fine. (I don't sort them into least to greatest, I just check the first character, and then see if it's in the word I'm checking, and if it is, I delete both. Then I go onto the next character, and repeat. If all the characters are deleted, then it's a matching word.)
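For reference, the delete-as-you-match check described above might look roughly like this. The function name is hypothetical, and the final leftover-letters check is an assumption about the intended behaviour: without it, "bar" would count as a match for "bared".

```python
def is_scramble_of(scrambled, candidate):
    """Return True if scrambled uses exactly the letters of candidate."""
    remaining = list(candidate)
    for ch in scrambled:
        if ch not in remaining:
            return False      # this letter has no partner left to delete
        remaining.remove(ch)  # delete the matched letter from the candidate
    return not remaining      # leftover letters mean the word was longer

print(is_scramble_of("erebed", "beered"))  # True
print(is_scramble_of("bar", "bared"))      # False
```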

That's probably not a very good way of doing it, but I was just wondering if there was any way of reading only the first word of each line in a file. So if I had a .txt that had the word, followed by its definition (see fig 1)

Fig 1.

This is just a text document, not the output

bared - A naked object
beered - A drunk person
bargain - to barter


the only thing the program would check would be the first word on each line (fig 2)

Fig 2.

Computer looks at only the first word of each line

bared
beered
bargain


and then if it found the word, I'd print out both the word and its definition. So if the scrambled word was erebed (beered), the program would actually print out "beered - A drunk person".

Sorry for the bad explanation earlier, I used a pretty bad example. But, yeah, that's the problem I'm kinda facing, and I was wondering if anyone could help me with that?

Rete
Newbie Poster
23 posts since Jun 2005
Reputation Points: 10
Solved Threads: 1
 

So all you need is a list where each element is the first word on each line in the text file? This is a fairly simple regular expression problem:

import re

# Open the file for reading; the with-block closes it automatically
with open("doc/dict.txt", "r") as f:
    # Read the file text as one big block
    text = f.read()

# Compile a regex pattern that captures each whole line (group 1)
# and the first word on that line (group 2)
p = re.compile(r"^((\w+)[^\n]*)$", re.MULTILINE)
# Find all matches, store in l
l = p.findall(text)
# The list 'l' is a list of tuples, such that for each
# tuple, the first coordinate is the full line (word plus
# definition), and the second coordinate is the word itself
# Break the list down into two separate lists
words = []
definitions = []
for line, word in l:
    words.append(word)
    definitions.append(line)
# At this point, the word in words[i] corresponds to the
# line in definitions[i]
# Print them
print(words)
print(definitions)


Read the regex part of the Python tutorial to figure out what ^((\w+)[^\n]*)$ means. The re.MULTILINE flag makes ^ and $ match at the start and end of every line, not just the start and end of the whole text.
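A regex isn't strictly required for this. As an alternative sketch, you could split each line on whitespace, compare only the first word, and return the whole line on a match. The `find_entry` name is made up, the sorted-letter comparison is borrowed from the earlier reply, and the list of lines stands in for iterating over an open file.

```python
def find_entry(scrambled, lines):
    """Return the first full line whose first word is an anagram of scrambled."""
    target = sorted(scrambled)
    for line in lines:
        parts = line.split()
        # Only the first word on the line is compared.
        if parts and sorted(parts[0]) == target:
            return line.rstrip("\n")
    return None

# Stand-in for the lines of the text file shown in Fig 1.
lines = [
    "bared - A naked object\n",
    "beered - A drunk person\n",
    "bargain - to barter\n",
]
print(find_entry("erebed", lines))  # beered - A drunk person
```

With a real file, passing the open file object as `lines` should behave the same way, since iterating over a file yields one line at a time.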

G-Do
Junior Poster
147 posts since Jun 2005
Reputation Points: 41
Solved Threads: 31
 
