Join Date: Mar 2008
Posts: 16
Reputation:
Rep Power: 1
Solved Threads: 0
I want to find the 10 most frequent words in a specific file. I have written this code for that:
import sys
import string
import re
file = open ( "corpora.txt", "r" )
text = file.read ( )
file.close ( )
word_freq ={ }
word_list = string.split ( text )
for word in word_list:
    count = word_freq.get ( string.lower ( word ), 0 )
    word_freq[string.lower ( word )] = count + 1
keys = word_freq.keys ( )
keys.sort ( )
i=0
while i<10:
    for word in keys:
        print word, word_freq[word]
    i=1+1
But it only gets each word and its frequency.
Sample output:
blanklines 2
blanklines, 1
characters 1
console 1
count 3
blanklines 2
blanklines, 1
characters 1
console 1
count 3
It also seems to read the file again and again, and it does not sort the output, as you can see.
How can I sort the output in decreasing order of frequency, so that I can stop printing after 10 words?
Please help me ASAP.
There are a few mistakes; for example, your last line should be i = i + 1. Also, the string functions have been available as built-in string methods since version 2.2, and the re module is not needed.
Here is one way to do this with Python 2.5:
# count words in a text and show the first ten items
# by decreasing frequency

# sample text for testing
text = """\
My name is Fred Flintstone and I am a famous TV
star. I have as much authority as the Pope, I
just don't have as many people who believe it.
"""

word_freq = {}

word_list = text.split()

for word in word_list:
    # word all lower case
    word = word.lower()
    # strip any trailing period or comma
    word = word.rstrip('.,')
    # build the dictionary
    count = word_freq.get(word, 0)
    word_freq[word] = count + 1

# create a list of (freq, word) tuples
freq_list = [(freq, word) for word, freq in word_freq.items()]

# sort the list by the first element in each tuple (default)
freq_list.sort(reverse=True)

for n, tup in enumerate(freq_list):
    # print the first ten items
    if n < 10:
        freq, word = tup
        print freq, word
        # or
        #print word, freq

"""
my output -->
3 i
3 as
2 have
1 who
1 tv
1 the
1 star
1 pope
1 people
1 name
"""
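As a possible alternative to building and sorting the list by hand, here is a minimal sketch that uses collections.Counter from the standard library. It assumes Python 3 (Counter does not exist in the Python 2.5 used in this thread), and the function name most_common_words and its top_n parameter are made up for illustration.

```python
from collections import Counter


def most_common_words(text, top_n=10):
    """Return up to top_n (word, count) pairs, most frequent first."""
    # lower-case each word and strip trailing periods and commas,
    # mirroring the cleanup done in the reply above
    words = [word.lower().rstrip('.,') for word in text.split()]
    # drop anything that became empty after stripping
    words = [word for word in words if word]
    return Counter(words).most_common(top_n)


sample = ("My name is Fred Flintstone and I am a famous TV star. "
          "I have as much authority as the Pope, I just don't have "
          "as many people who believe it.")

for word, count in most_common_words(sample, top_n=3):
    print(count, word)
```

Words with equal counts may come out in a different order than in the sorted (freq, word) list, because Counter does not break ties alphabetically.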
Last edited by ZZucker : Mar 14th, 2008 at 12:17 pm.
Never argue with idiots, they'll just bring you down to their level and beat you with their experience.
Thank you very much, that was very helpful.
But is there a way to let the user control the number of words?
For example, rather than the 10 most frequent words, the user could ask for the 11, 50, or 44 most frequent words, etc.
Can I also remove marks like (? "" [] ) etc.? I need only the words.
And can Python build a user interface (buttons, text boxes, etc.), and how?
If not, how can I integrate the code into a user interface (buttons, text boxes, etc.)?
... can I remove marks like (? "" [] ) etc.? I need only the words
Instead of
word = word.rstrip('.,')
use
word = word.rstrip('.,?"[]()')
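Note that rstrip only removes characters from the right-hand end of a word, so an opening quote or bracket would stay attached. A small sketch of stripping both ends with str.strip and string.punctuation follows. The helper name clean_word is hypothetical, and the example is written for Python 3 rather than the Python 2.5 used in the thread.

```python
import string


def clean_word(word):
    """Lower-case a word and strip punctuation from both ends."""
    # string.punctuation holds the ASCII punctuation characters;
    # strip() removes them from the start and the end only, so an
    # apostrophe inside a word such as "don't" is kept
    return word.lower().strip(string.punctuation)


for raw in ['"Hello,', '(world)', "don't", '[Python]?']:
    print(clean_word(raw))
```

A word made up entirely of punctuation becomes an empty string, so it may be worth skipping empty results before counting.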
... let the user control the number of words
Where you now have
if n < 10:
use
if n < select:
where the variable select is an integer taken from the user's input.
... can Python build a user interface (buttons, text boxes, etc.), and how?
Python ships with a simple GUI toolkit called Tkinter that can do all of that for you. You will need to study up on it; it's a whole new ball of wax. Here is a typical example:
# a look at the Tkinter Text widget
# use ctrl+c to copy, ctrl+x to cut selected text,
# ctrl+v to paste, and ctrl+/ to select all

import Tkinter as tk

def get_text():
    # get text widget contents between start_index and end_index
    # start_index = "%d.%d" % (line, column) here "1.0"
    # line starts with 1 and column with 0
    # here end_index = tk.END
    # set the label text to the typed-in text
    v1.set(text1.get(1.0, tk.END))
    # clear the text
    text1.delete(1.0, tk.END)
    text1.insert(tk.INSERT, ' new text')
    text1.insert(tk.INSERT, '\n and more text')

# this sets the window title caption too
# without the leading space Text will be text!?
root = tk.Tk(className = " Text, Button, Label ...")

# text entry field, width=width chars, height=lines text
text1 = tk.Text(root, width=50, height=2, bg='yellow')
text1.pack()

# function listed in command will be executed on button click
button1 = tk.Button(root, text='get the text', command=get_text)
button1.pack(pady=5)

# define a variable to hold the label text
v1 = tk.StringVar()
# label text will always be the textvariable's value
# width/height in char size
label1 = tk.Label(root, textvariable=v1, width=50, height=2)
label1.pack(pady=5)

# do some calculation and format result
pi_approx = 355/113.0
str1 = "%.4f" % (pi_approx)  # 3.1416
# show result in text widget
text1.insert(tk.INSERT, str1)

# start cursor in text1
text1.focus()

root.mainloop()
Last edited by ZZucker : Mar 16th, 2008 at 12:48 am.
Your reply was very helpful.
But how can I make an integer from the user's input (select)?
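One way to turn the user's input into an integer is the built-in int(), which raises ValueError when the text is not a whole number. The sketch below is only an illustration: the helper name parse_select and the fallback value of 10 are assumptions, not something from the thread. It is written for Python 3, where input() returns a string (in Python 2 the equivalent is raw_input()).

```python
def parse_select(raw, default=10):
    """Convert user-typed text to a positive int, or fall back to default."""
    try:
        value = int(raw.strip())
    except ValueError:
        # not a whole number, e.g. "ten" or an empty string
        return default
    # zero or negative counts make no sense here
    return value if value > 0 else default


# in a real script the text would come from the user, for example:
#     select = parse_select(input("How many words? "))
for typed in ["44", " 11 ", "ten", "-5"]:
    print(repr(typed), "->", parse_select(typed))
```

The resulting value can then be used in place of the fixed 10, as in if n < select.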
Does Python provide a way to search a directory that contains subdirectories and files?
For example, a directory named cars contains the subdirectories Toyota, Honda and BMW; Toyota contains folders named camry and corola, Honda contains accord, and BMW contains a folder named X5.
Is there a way to enter the name of the parent directory (cars) and search in all of its subdirectories (Toyota, Honda and BMW)?
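The standard library's os.walk visits a directory and everything below it, which seems to match what is being asked. Here is a minimal sketch, written for Python 3. The function name find_files and the throwaway directory layout are hypothetical and only modeled on the cars example.

```python
import os
import tempfile


def find_files(parent):
    """Return sorted paths of all files under parent, relative to parent."""
    found = []
    # os.walk yields (directory path, subdirectory names, file names)
    # for parent and for every directory below it
    for dirpath, dirnames, filenames in os.walk(parent):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, parent))
    return sorted(found)


# build a small throwaway tree modeled on the "cars" example
cars = os.path.join(tempfile.mkdtemp(), "cars")
for sub in [("Toyota", "camry"), ("Honda", "accord"), ("BMW", "X5")]:
    folder = os.path.join(cars, *sub)
    os.makedirs(folder)
    with open(os.path.join(folder, "notes.txt"), "w") as handle:
        handle.write("some words\n")

for path in find_files(cars):
    print(path)
```

Each path found this way could then be opened and fed to the word-counting code.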
Last edited by alivip : Mar 16th, 2008 at 5:19 am.
This is the modified code:
# a look at the Tkinter Text widget
# use ctrl+c to copy, ctrl+x to cut selected text,
# ctrl+v to paste, and ctrl+/ to select all
import Tkinter as tk
def most_frequant_word():
    # count words in a text and show the first ten items
    # by decreasing frequency
    # sample text for testing
    import sys
    import string
    import re
    v1.set(text1.get(1.0, tk.END))
    text1.delete(1.0, tk.END)
    file = open ("arb.txt", "r")
    text = file.read ( )
    file.close ( )
    word_freq = {}
    word_list = text.split()
    for word in word_list:
        # word all lower case
        word = word.lower()
        # strip any trailing period or comma
        word = word.rstrip('.,/"-_;\[]()')
        # build the dictionary
        count = word_freq.get(word, 0)
        word_freq[word] = count + 1
    # create a list of (freq, word) tuples
    freq_list = [(freq, word) for word, freq in word_freq.items()]
    # sort the list by the first element in each tuple (default)
    freq_list.sort(reverse=True)
    for n, tup in enumerate(freq_list):
        # print the first ten items
        if n < 10:
            text1.insert(tk.INSERT, freq)
            text1.insert(tk.INSERT, word)
            text1.insert(tk.INSERT, "\n")
            freq, word = tup
            print freq, word
root = tk.Tk(className = " most_frequant_word")
# text entry field, width=width chars, height=lines text
text1 = tk.Text(root, width=50, height=20, bg='green')
text1.pack()
# function listed in command will be executed on button click
button1 = tk.Button(root, text='result', command=most_frequant_word)
button1.pack(pady=5)
# define a variable to hold the label text
v1 = tk.StringVar()
# label text will always be the textvariable's value
# width/height in char size
label1 = tk.Label(root, textvariable=v1, width=50, height=20)
label1.pack(pady=5)
# start cursor in text1.
text1.focus()
root.mainloop()
But unfortunately, when I want to search a non-English (for example, Arabic) file, it does not read it properly. It prints text like this:
3ÇáäíÇÈÉ
28Ýí
11Úáì
11ÊÜÊÜãÜÉ
10ãä
10Úä
7Ãä
6ÈÓÈÈ
5ÎÈÑ
5ÇáãÓáãæä
The sample file is attached.
I use
text1.insert(tk.INSERT, freq)
text1.insert(tk.INSERT, word)
text1.insert(tk.INSERT, "\n")
to insert the results into the text widget.
Please, I need your help with this and with the previous question.
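The garbled output looks like Arabic text stored in a single-byte encoding such as Windows-1256 whose bytes are being shown as Latin-1 characters. That is a guess from the characters shown, since the attached file is not available here. Under that assumption, decoding the file explicitly when reading it may help. The sketch below is written for Python 3 (io.open also exists in Python 2.6 and later); the function name count_words, the default of cp1256, and the demo file are assumptions for illustration.

```python
import io
import os
import tempfile
from collections import Counter


def count_words(path, encoding="cp1256"):
    """Read a text file with an explicit encoding and count its words."""
    # decoding on read yields real Unicode text instead of raw bytes
    with io.open(path, "r", encoding=encoding) as handle:
        text = handle.read()
    return Counter(text.split())


# write a tiny cp1256-encoded demo file (three Arabic words, one
# of them repeated), then read it back and count the words
demo_path = os.path.join(tempfile.mkdtemp(), "arb_demo.txt")
with io.open(demo_path, "w", encoding="cp1256") as handle:
    handle.write("\u0641\u064a \u0639\u0644\u0649 \u0641\u064a\n")

for word, freq in count_words(demo_path).most_common(10):
    # ascii() keeps the demo printable on consoles without Arabic support
    print(freq, ascii(word))
```

Separately, the posted function inserts freq and word into the text widget before the line freq, word = tup runs, so each row appears to show the values left over from the previous step; moving that line above the three insert calls should put the rows in the expected order.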