Hello!

I need to:
1. Prompt the user for a text file.
2. Analyze the file and graph (with a bar plot) the 25 most frequent words with length greater than 4.
3. The x axis needs to show the word. The y axis is the frequency.

I've built the bulk of the program so far, but I'm having some trouble narrowing it down to the 25 most frequent words with length greater than 4.
Also, my plot is missing a couple of things.

Thanks for your help!

import matplotlib.pyplot as plot

def bar_plot(x_axis,y_axis):
    plot.plot(x_axis,y_axis, marker='o')
    plot.xlabel('Words')
    plot.ylabel('Frequency')
    plot.legend()
    plot.grid()
    plot.show()

    
def main():
    import string
    ofile=open(raw_input("Please enter the name of a text file :"))
    s=ofile.read()
    word_freq={}
    
    word_list=s.split()
    word_list=[s.translate(None, string.punctuation) for s in word_list] 


    for word in word_list:
        count=word_freq.get(word.lower(),0)
        word_freq[word.lower()]=count+1
        
    keys=word_freq.keys()
    keys.sort()
    print "Word ---> Frequency"
    for word in keys:
        bar_plot(word, keys)
        
main()

I would suggest that you print word_freq to see what it contains. You possibly want to use setdefault instead of .get, although there is no description of what you want this part of the code to do, so there is no way to tell for sure.

for word in word_list:
    word_freq.setdefault(word.lower(), 0)
    word_freq[word.lower()] += 1
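
For what it's worth, the .get version in the original post and the setdefault version above should end up with the same counts, so the choice is mostly a matter of taste. A quick sketch to check that; the sample word list is made up for illustration only:

```python
# Hypothetical sample data, only for illustration.
word_list = ["The", "cat", "the", "Cat", "sat"]

# Idiom from the original post: dict.get with a default of 0.
freq_get = {}
for word in word_list:
    key = word.lower()
    freq_get[key] = freq_get.get(key, 0) + 1

# Idiom suggested above: dict.setdefault, then increment.
freq_setdefault = {}
for word in word_list:
    key = word.lower()
    freq_setdefault.setdefault(key, 0)
    freq_setdefault[key] += 1

# Both dictionaries should hold identical counts.
print(freq_get == freq_setdefault)
```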

2. Analyze file and graph (with bar plot) the 25 most frequent words with length greater than 4

I would test for word length > 4, place each match in a list of lists of the form [frequency, word], sort that list in reverse order, and print/plot the first 25.
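
A minimal sketch of that idea, assuming word_freq is the dictionary built in the original post. The names top_words, min_length and limit are made up here, and the rewritten bar_plot (using matplotlib's bar and xticks rather than plot) is only one possible way to get the words onto the x axis. It is written so that it should also run under Python 3:

```python
def top_words(word_freq, min_length=5, limit=25):
    """Return up to `limit` [frequency, word] pairs, most frequent first."""
    # Keep only words with more than 4 characters, stored as [frequency, word]
    # so that sorting compares the frequency first.
    pairs = [[count, word] for word, count in word_freq.items()
             if len(word) >= min_length]
    # Reverse sort: highest frequency first (ties fall back to the word).
    pairs.sort(reverse=True)
    return pairs[:limit]


def bar_plot(pairs):
    """Draw a bar chart with words on the x axis and frequency on the y axis."""
    # Imported here so the counting code above works without matplotlib.
    import matplotlib.pyplot as plt

    words = [word for count, word in pairs]
    counts = [count for count, word in pairs]
    positions = list(range(len(words)))

    plt.bar(positions, counts)
    # Label each bar with its word; rotate so long words do not overlap.
    plt.xticks(positions, words, rotation=90)
    plt.xlabel('Words')
    plt.ylabel('Frequency')
    plt.tight_layout()
    plt.show()


# Hypothetical counts, only to show the filtering and ordering.
sample_freq = {"there": 3, "their": 5, "cat": 9, "about": 3}
result = top_words(sample_freq, limit=2)
print(result)
# bar_plot(result)  # uncomment to show the chart
```

Note that "cat" is dropped despite having the highest count, because it is not longer than 4 characters. In the original main(), the loop that calls bar_plot once per word would be replaced by a single call such as bar_plot(top_words(word_freq)).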
