I have 2 text files:

cities.txt
San Francisco
Los Angeles
Seattle
Dallas

master.txt
Atlanta is chill and laid-back.
I love Los Angeles.
Coming to Dallas was the right choice.
New York is so busy!
San Francisco is fun.
Moving to Boston soon!
Go to Seattle in the summer.

output.txt
<main><beg>I love</beg><key>Los Angeles</key><end></end></main>
<main><beg>Coming to</beg><key>Dallas</key><end>was the right choice</end></main>
<main><beg></beg><key>San Francisco</key><end>is fun</end></main>
<main><beg>Go to</beg><key>Seattle</key><end>in the summer</end></main>

Each entry in cities.txt becomes a <key>. The master.txt file is much longer, and any line that does not contain a city from cities.txt should just be ignored. The lines in master.txt are not in the same order as the cities. The output should print each city in <key>, with the text before and after it on the same line in <beg> and <end> (either may be empty).

This is what I have:

with open("master.txt") as f:
    master = f.read()

working = []
with open("cities.txt") as f:
    for i in (word.strip() for word in f):
        if i in data:
            print "<key>", i, "</key>"

I can find the cities in the master file and print them out to a new file, but I'm stuck at how I can also print out the beginning and end contexts for the lines with the cities!

For starters, print the result/"word" for

for word in f:

Also, "data" has not been declared

   if i in data:

Read cities.txt and place the cities in a list or dictionary. Print the result to make sure there is no stray white-space, etc. Then search for each of the cities in each of the records in master.txt (you will need to split each record into words). Post that code and ask for more help. See also: Lists and File readlines()
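
The steps above could be sketched roughly as follows. This is only an illustration: the helper names load_cities and find_matches are made up here, and it assumes one city or one record per line. It also swaps the word-by-word split for a plain substring test on each line, because names such as "San Francisco" span two words. Both helpers accept any iterable of lines, so an open file object works as well as the small lists used in the demo.

```python
def load_cities(lines):
    """Return the stripped, non-empty entries from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def find_matches(cities, lines):
    """Return (city, line) pairs for every line that contains a city."""
    matches = []
    for line in lines:
        line = line.strip()
        for city in cities:
            if city in line:
                matches.append((city, line))
                break  # assumption: at most one city per line
    return matches


# Demo with in-memory data taken from the question; with real files,
# pass open("cities.txt") and open("master.txt") instead of the lists.
cities = load_cities(["San Francisco\n", "Los Angeles\n", "Seattle\n", "Dallas\n"])
print(repr(cities))  # repr makes any leftover white-space visible

records = ["I love Los Angeles.\n", "New York is so busy!\n"]
for city, line in find_matches(cities, records):
    print(city + " -> " + line)
```

Keeping each record as a whole line, rather than reading the file into one string, means the city and its surrounding text stay together.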

Edited 3 Years Ago by woooee

Hi woooee, appreciate the response.

with open("master.txt") as f:
    master = f.read()
working = []
with open("cities.txt") as f:
    for i in (word.strip() for word in f):
        if i in master:
            print "<key>", i, "</key>"

split_master = master.split()
print split_master

You mentioned to split each record in master.txt into words, which I did with "master.split()" but this produces a list of one word per element, and loses the context information from the original line. How would I be able to trace back the city in the original context line?
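
One possible way to keep the context is to work line by line instead of splitting the whole file into words, and to cut each matching line around the city with str.partition. The sketch below is not from the thread: tag_line is a hypothetical helper, and it assumes at most one city per line and that trailing ".", "!" or "?" should be dropped, as in the sample output.

```python
def tag_line(line, cities):
    """Return the tagged form of line for the first city found, else None."""
    # Assumption: trailing sentence punctuation is not wanted in <end>.
    text = line.strip().rstrip(".!?")
    for city in cities:
        if city in text:
            # partition splits around the first occurrence of the city.
            beg, key, end = text.partition(city)
            return "<main><beg>%s</beg><key>%s</key><end>%s</end></main>" % (
                beg.strip(), key, end.strip())
    return None  # no city on this line, so the caller can skip it


cities = ["San Francisco", "Los Angeles", "Seattle", "Dallas"]
master_lines = [
    "Atlanta is chill and laid-back.",
    "I love Los Angeles.",
    "Coming to Dallas was the right choice.",
    "New York is so busy!",
    "San Francisco is fun.",
    "Moving to Boston soon!",
    "Go to Seattle in the summer.",
]

for line in master_lines:
    tagged = tag_line(line, cities)
    if tagged is not None:
        print(tagged)
```

With the sample data this prints the four lines shown under output.txt. For real files, loop over open("master.txt") instead of master_lines and write each result to output.txt.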
