
print the average value of every word

Hello all,

Quick question: I am trying to write a program which calculates the average of the numeric values of words that appear multiple times (if a word appears more than once, I want the average of its values). I do not get an error, but I only get the sum of each word's values instead of the average. This is the input of the program:

index 388.315813
index 311.214286
syndrome 289.708333
factor 184.246753
loss 168.578313
support 153.125000
activity 140.357143
circumference 136.500000
disease 122.529412
tissue 105.931507

So the output will have 9 lines, since it will merge the first two and find their average value. My problem is that my code doesn't work properly. I mean, I do not get any errors, but it doesn't seem to perform the division needed to find the average value.
This is what I have written so far:

wordsdict = {}  # word -> running sum of its values
f=open('example.txt','r') 
for line in f:
    words = line.split()
    if len(words) == 2:  
        count = 1
        word = words[0]
        cvalue = float(words[1]) 
        if word not in wordsdict:
            wordsdict[word] = cvalue
        else:
            count += 1
            wordsdict[word] = sum([cvalue,wordsdict[word]]) 
            average = wordsdict[word] / count
    elif len(words) == 3:
         word = words[0] + " " + words [1] 
         cvalue = float(words[2])
         if word not in wordsdict:
             wordsdict[word] = cvalue
         else:
             count  += 1
             wordsdict[word] = sum([cvalue,wordsdict[word]]) 
             average = wordsdict[word] / count


Any thoughts on why this is happening? I may be missing something because I am new to Python.

doomas10
Newbie Poster
21 posts since Jul 2010
Reputation Points: 10
Solved Threads: 0
 

Did you check my code which did the job? http://www.daniweb.com/forums/post1270109.html#post1270109

Could you express the logic of the program in words or as a flow chart, and see whether anything is missing?

HINT: How are you keeping the count of values and the sum of values separately for each key?
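
One way to act on the hint, offered as a minimal sketch and not necessarily the code linked above: keep one dictionary for the running sum and another for the count, both keyed by the word, and divide only after all lines have been read. The names `average_per_word`, `totals`, and `counts` are made up for illustration, and treating everything before the last token as the word is an assumption that covers both the one-word and two-word cases.

```python
def average_per_word(lines):
    """Return {word: average value} for lines of the form 'word [word] value'."""
    totals = {}   # running sum of values for each word
    counts = {}   # number of values seen for each word
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = " ".join(parts[:-1])   # everything before the number
        value = float(parts[-1])
        totals[word] = totals.get(word, 0.0) + value
        counts[word] = counts.get(word, 0) + 1
    # Divide once, after every value has been added up.
    return dict((word, totals[word] / counts[word]) for word in totals)

sample = ["index 388.315813", "index 311.214286", "syndrome 289.708333"]
averages = average_per_word(sample)
```

With the sample lines, `averages["index"]` is the mean of the two `index` values, while `syndrome` keeps its single value.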

pyTony
pyMod
Moderator
5,359 posts since Apr 2010
Reputation Points: 782
Solved Threads: 852
 

Since I posted my query I tried further and found a way of doing it. I used an extra dictionary to capture the frequency of the words. It is accurate as far as I can see from my results.

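wordsdict = {}  # word -> running sum of its values
times = {}      # word -> number of times it was seen
f=open('example.txt','r') #open a file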
for line in f:
    words = line.split()
    if len(words) == 2:   #we have only 2 or three words in the terminological heads file (along with cvalue)
        word = words[0]
        cvalue = float(words[1])  #float because the cvalue is a number
        if word not in wordsdict:
            wordsdict[word] = cvalue
            times[word] = 1
        else:
            wordsdict[word] = sum([cvalue,wordsdict[word]]) 
            times[word] = times[word] + 1
        aver=wordsdict[word]/times[word]
    elif len(words) == 3:
         word = words[0] + " " + words [1] #slightly different here since we have two words
         cvalue = float(words[2])
         if word not in wordsdict:
             wordsdict[word] = cvalue
             times[word] = 1
         else:
             times[word] = 1
             wordsdict[word] = sum([cvalue,wordsdict[word]]) 
             times[word] = times[word] + 1
clist= {}
for word in wordsdict:
    if word in times:
       average = wordsdict[word] / times[word]
       clist = {word:average}


However, I cannot sort the clist dictionary by average, which is a number. I tried every combination I could think of, like

alist=sorted(clist.iteritems(), key = lambda (k,v) : (v,k),reverse=True)

or

clist2=[(val,key) for key,val in clist.items()] #this is a list to sort the items with the biggest cvalue first
clist2.sort(reverse=True)


and still the results are completely unsorted. Any thoughts on why this may be happening?

doomas10
Newbie Poster
21 posts since Jul 2010
Reputation Points: 10
Solved Threads: 0
 

The following part of your code doesn't make sense. You should go back to wherever the code came from and double-check that it was copied correctly, and that you copied all of the code supplied. Also, print the dictionary to see what it contains.

clist= {}
for word in wordsdict:
    if word in times:
       average = wordsdict[word] / times[word]
       clist = {word:average}
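
One possible reading of the problem, given as a sketch and not as the poster's intended code: `clist = {word:average}` rebinds `clist` to a brand-new one-item dictionary on every pass through the loop, so only the last word survives and there is nothing left to sort. Assigning with `clist[word] = average` keeps every entry. The sample `wordsdict` and `times` values below are made up for illustration, and `ranked` is a hypothetical name.

```python
# Hypothetical sample data standing in for the dictionaries built from the file.
wordsdict = {"index": 699.530099, "syndrome": 289.708333, "factor": 184.246753}
times = {"index": 2, "syndrome": 1, "factor": 1}

clist = {}
for word in wordsdict:
    if word in times:
        # Add one entry per word instead of replacing the whole dictionary.
        clist[word] = wordsdict[word] / times[word]

# Sort (word, average) pairs by average, largest first.
ranked = sorted(clist.items(), key=lambda item: item[1], reverse=True)
```

Sorting on `item[1]` avoids the tuple-unpacking lambda, which only works in Python 2.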


Edit: This probably doesn't do what you think either. Start by breaking it down into simple steps and then code each step.

         if word not in wordsdict:
             wordsdict[word] = cvalue
             times[word] = 1
         else:
             times[word] = 1
             wordsdict[word] = sum([cvalue,wordsdict[word]]) 
             times[word] = times[word] + 1
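
A small sketch of what the `else` branch above does when the same word arrives more than twice: because `times[word]` is set back to 1 before it is incremented, the count can never go past 2. The helper names `count_with_reset` and `count_without_reset` and the sample list are hypothetical.

```python
def count_with_reset(words):
    """Mimic the branch above: the count is reset to 1 on every repeat."""
    times = {}
    for word in words:
        if word not in times:
            times[word] = 1
        else:
            times[word] = 1                 # the reset throws away earlier counts
            times[word] = times[word] + 1
    return times

def count_without_reset(words):
    """Same loop with the reset removed, so repeats accumulate."""
    times = {}
    for word in words:
        if word not in times:
            times[word] = 1
        else:
            times[word] = times[word] + 1
    return times

seen = ["index", "index", "index", "loss"]
with_reset = count_with_reset(seen)        # "index" is stuck at 2
without_reset = count_without_reset(seen)  # "index" reaches 3
```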
woooee
Nearly a Posting Maven
2,454 posts since Dec 2006
Reputation Points: 777
Solved Threads: 714
 

This question has already been solved
