Member Avatar for leegeorg07

This relates to _nestors problem, but I'm trying to expand on it on my own.
I have this code:

logfile = open("logfile.txt", "r").readlines()
KEYWORDS = ['test', 'text']
counterline = []
counter = 0
for line in logfile:
    for word in line.split():
        counter+=1
        if word in KEYWORDS:
            counterline.append(counter)
            print word
print KEYWORDS
print counterline

At the moment it outputs this:

test
text
test
['test', 'text']
[1, 2, 4]

which is what I want for now,
but I would like the output to look like this:

test
text
test
{'test':[1,4], 'text':[2]}

How would I go about doing it?
If possible, could it be the easiest method to understand, so that I can learn from it?

Something like this, using an interim list (untested). You could also put this in a function and return the word and counter the first time it is found.

logfile = open("logfile.txt", "r").readlines()
KEYWORDS = ['test', 'text']
counterline = []
counter = 0
for line in logfile:
    junk_list = []
    for word in line.split():
        counter+=1
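        if word in KEYWORDS:
            # record the keyword together with its word position
            junk_list.append(word + ":" + str(counter))
            print word
    # once the line is processed, keep whatever was found on it
    if junk_list:
        counterline.extend(junk_list)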
print KEYWORDS
for rec in counterline:
   print rec

To get exactly what you want, you would have to reverse your logic and use
for word in KEYWORDS:
that is, find all occurrences of each keyword in turn. As written, you get one entry when the first word is found and a separate entry when, say, the fifth word turns out to equal the first. Hope that makes sense.
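
The reversed logic described above might look like the following sketch. It assumes positions are counted per word across the whole file, as the original counter does, and it uses an in-memory string in place of logfile.txt so that it runs on its own. The names sample_text and positions are made up for the illustration.

```python
# Sketch only: sample_text stands in for the contents of logfile.txt.
sample_text = "test text other test"
KEYWORDS = ['test', 'text']

words = sample_text.split()

positions = {}
for keyword in KEYWORDS:
    # collect every 1-based position at which this keyword occurs
    found = []
    for counter, word in enumerate(words, 1):
        if word == keyword:
            found.append(counter)
    if found:
        positions[keyword] = found

print(positions)
```

This rescans the word list once per keyword, which is fine for a short keyword list but gets slow if there are many keywords.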

But since you mentioned a dictionary, use a dictionary of lists if the word is found, instead of the two lists. Try it yourself, and post back if you can't get it.

Yes, it's possible, though you may have to consult the Python manual to understand it all:

# index the lines that contain given search words

data = """\
this test finds how many times
certain words like text
appear in the data
remember this is just a test
"""

filename = "logfile.txt"

# create the data file
fout = open(filename, "w")
fout.write(data)
fout.close() 

# read the data file back in as a list of lines
logfile = open(filename, "r").readlines()

KEYWORDS = ['test', 'text']

index_dict = {}
for ix, line in enumerate(logfile):
    for word in line.split():
        if word in KEYWORDS:    
            index_dict.setdefault(word, []).append(ix+1)

print(index_dict)  # {'test': [1, 4], 'text': [2]}

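The function enumerate() gives you a zero-based index for each line, so ix+1 is the line number; it does the job of your counter, but counts lines instead of words. Also, calling print() with parentheses makes this code work with both Python 2.5 and Python 3.0.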
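
If word positions are wanted rather than line numbers, the same enumerate() idea can be applied to the words themselves. This is a small sketch, assuming the counter should run across the whole file as in the original post. The lines list is stand-in data, not the real log file, and all_words is a name invented for the example.

```python
# Sketch only: `lines` stands in for open("logfile.txt").readlines().
lines = ["test text\n", "other test\n"]
KEYWORDS = ['test', 'text']

# flatten the file into one list of words so enumerate() can number them
all_words = [word for line in lines for word in line.split()]

index_dict = {}
for counter, word in enumerate(all_words, 1):  # start counting at 1
    if word in KEYWORDS:
        index_dict.setdefault(word, []).append(counter)

print(index_dict)
```

Passing 1 as the second argument to enumerate() removes the need for the +1 adjustment.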

Member Avatar for leegeorg07

That didn't really do what I wanted, but thanks anyway. After remembering that I had Python for Dummies, which has a whole section on dictionaries, I came up with this:

logfile = open("logfile.txt", "r").readlines()
KEYWORDS = ['test', 'text']
counterline = []
kl = {}
counter = 0
for line in logfile:
    for word in line.split():
        counter+=1
        if word in KEYWORDS:
            for kword in KEYWORDS:
                kword = []
            kword.append(word)
            counterline.append(counter)
            print word
            kl[counter] = kword
print KEYWORDS
print counterline
print kl

I thought I would post it for anyone else who wants to use it.
If anyone can think of other ways of doing this that would result in the dictionary appearing like this:

{'test':[1,4], 'text':[2]}

then please tell me.

Member Avatar for leegeorg07

Sorry, I hadn't seen that until I posted my code. Thanks, that should do it :)

Use one dictionary of lists for everything. Simplified for easier understanding:

KEYWORDS_dic = {}
KEYWORDS_dic['test'] = []
KEYWORDS_dic['text'] = []
# orig_word is each word read from the file; number is its running position
word = orig_word.lower()
if word in KEYWORDS_dic:
   KEYWORDS_dic[word].append(number)
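
Filled out into something runnable, the idea above could look like this sketch. The function name index_keywords and the sample lines are invented for the example, and it assumes the position is a running word count as in the original post.

```python
def index_keywords(lines, keywords):
    """Return {keyword: [word positions]} for the given lines of text."""
    # start every keyword with an empty list, as in the snippet above
    keywords_dic = dict((key, []) for key in keywords)
    number = 0
    for line in lines:
        for orig_word in line.split():
            number += 1
            word = orig_word.lower()  # so 'Test' and 'test' both match
            if word in keywords_dic:
                keywords_dic[word].append(number)
    return keywords_dic


# sample data standing in for the lines of logfile.txt
sample_lines = ["Test text\n", "other test\n"]
result = index_keywords(sample_lines, ['test', 'text'])
print(result)
```

Because every keyword is seeded up front, a keyword that never appears in the file still shows up in the result with an empty list.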