DaniWeb IT Discussion Community

How to sort word frequency (from a file) in decreasing order? I need help

alivip

  #6  
Mar 16th, 2008
This is the modified code:
# a look at the Tkinter Text widget
# use ctrl+c to copy, ctrl+x to cut selected text,
# ctrl+v to paste, and ctrl+/ to select all
import Tkinter as tk


def most_frequant_word():
    # count the words in arb.txt and show the first ten items
    # by decreasing frequency

    # copy the current widget text to the label, then clear the widget
    v1.set(text1.get(1.0, tk.END))
    text1.delete(1.0, tk.END)

    # read the whole input file
    infile = open("arb.txt", "r")
    text = infile.read()
    infile.close()
     
    word_freq = {}
     
    word_list = text.split()
     
    for word in word_list:
        # word all lower case
        word = word.lower()
        # strip trailing punctuation (period, comma, slash, quote, brackets, ...)
        word = word.rstrip('.,/"-_;\\[]()')
        # build the dictionary
        count = word_freq.get(word, 0)
        word_freq[word] = count + 1
     
    # create a list of (freq, word) tuples
    freq_list = [(freq, word) for word, freq in word_freq.items()]
     
    # sort the list by the first element in each tuple (default)
    freq_list.sort(reverse=True)
     
    # show the first ten items in the text widget and on the console
    for freq, word in freq_list[:10]:
        text1.insert(tk.INSERT, freq)
        text1.insert(tk.INSERT, word)
        text1.insert(tk.INSERT, "\n")
        print freq, word

root = tk.Tk(className=" most_frequant_word")

# text entry field, width=width chars, height=lines text
text1 = tk.Text(root, width=50, height=20, bg='green')
text1.pack()
# function listed in command will be executed on button click
button1 = tk.Button(root, text='result', command=most_frequant_word)
button1.pack(pady=5)

# define a variable to hold the label text
v1 = tk.StringVar()
# label text will always be the textvariable's value
# width/height in char size
label1 = tk.Label(root, textvariable=v1, width=50, height=20)
label1.pack(pady=5)

# start cursor in text1.
text1.focus()
root.mainloop()

But unfortunately, when I want to search a non-English text, for example an Arabic file, it does not read it properly. It prints text like:
3ÇáäíÇÈÉ
28Ýí
11Úáì
11ÊÜÊÜãÜÉ
10ãä
10Úä
7Ãä
6ÈÓÈÈ
5ÎÈÑ
5ÇáãÓáãæä
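
The garbled words above look like Arabic text stored as Windows-1256 whose bytes were displayed as Latin-1; for example, "Ýí" maps back to the Arabic word "في". The following is only a minimal sketch of that round trip. It assumes arb.txt is saved as Windows-1256 (cp1256), which the post does not state, and `repair_mojibake` is a hypothetical helper name, not part of any library.

```python
# Sketch: undo the mis-decoding seen in the output above.
# ASSUMPTION: the file bytes are Windows-1256 (cp1256) and were shown
# as if they were Latin-1. The post itself does not name the encoding.

def repair_mojibake(garbled, shown_as="latin-1", actual="cp1256"):
    """Turn the garbled text back into raw bytes, then decode them properly."""
    raw_bytes = garbled.encode(shown_as)   # recover the original bytes
    return raw_bytes.decode(actual)        # interpret them as Arabic text

# U+00DD U+00ED are the two characters shown after "28" in the output
garbled_word = u"\u00dd\u00ed"
repaired = repair_mojibake(garbled_word)
# repaired is now u"\u0641\u064a" (Arabic feh + yeh)
```

If the repaired text reads as valid Arabic, that suggests the file itself is fine and only the decoding step is missing.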

The sample file is attached.

I use
    text1.insert(tk.INSERT, freq)
    text1.insert(tk.INSERT, word)
    text1.insert(tk.INSERT, "\n")
to insert the results into the text widget.
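
One possible way to get readable words into the widget is to decode the file to Unicode while reading it, before counting. The sketch below is not the original code: `top_words` and `PUNCTUATION` are hypothetical names, the default encoding cp1256 is an assumption about arb.txt, and `collections.Counter` needs Python 2.7 or later.

```python
import io
from collections import Counter

# same trailing characters the original code strips
PUNCTUATION = u'.,/"-_;\\[]()'


def top_words(path, encoding="cp1256", limit=10):
    """Return up to `limit` (freq, word) pairs, most frequent first.

    ASSUMPTION: `encoding` matches how the file was saved.
    """
    # io.open decodes the bytes into Unicode text while reading
    with io.open(path, "r", encoding=encoding) as handle:
        text = handle.read()

    counts = Counter()
    for word in text.split():
        word = word.lower().rstrip(PUNCTUATION)
        if word:  # skip tokens that were punctuation only
            counts[word] += 1

    # most_common yields (word, freq); flip to (freq, word) like the post
    return [(freq, word) for word, freq in counts.most_common(limit)]
```

Each pair could then be inserted with something like text1.insert(tk.INSERT, u"%d %s\n" % (freq, word)), since the Tkinter Text widget accepts Unicode strings.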
Please, I need your help with this and with the previous one.
Attached Files
File Type: txt arb.txt (7.7 KB, 1 views)