Count word frequency using dictionaries
Thread Solved
Join Date: Nov 2009
Posts: 10
Solved Threads: 0
Python Syntax (Toggle Plain Text)
from string import *

def removePunctuation(sentence):
    sentence = lower(sentence)
    new_sentence = ""
    for char in sentence:
        if char not in punctuation:
            new_sentence = new_sentence + char
    return new_sentence

def wordFrequences(sentence):
    wordCounts = {}
    split_sentence = new_sentence.split()
    print split_sentence
    for entry in split_sentence:
        for word in entry:
            wordCounts[entry] = wordCounts.get (entry,0) + 1
    wordCounts.items()
    return wordCounts

sentence = "This is a test sentence, to test the function."
new_sentence = removePunctuation(sentence)
wordFrequences(sentence)
Hi, I am trying to write a program that counts how many times each word appears in a string.
Could someone help me with this? Is there something I should use instead of .get, which seems to be counting the characters in each word?
At the moment I get this output:
{'a': 1, 'function': 8, 'sentence': 8, 'this': 4, 'is': 2, 'to': 2, 'test': 8, 'the': 3}
I am trying to get the following:
{'this': 1, 'a': 1, 'is': 1, 'test': 2, ...}
Last edited by axa121; Jan 29th, 2010 at 2:33 pm.
Join Date: Oct 2009
Posts: 136
Solved Threads: 14
#3 Jan 29th, 2010
If you find a solution to one of your own problems, it is considered polite to post the solution regardless because other people might also have the same problem.
As for my solution to the problem statement?
I'd probably strip the string of any non-alphabetic characters except spaces and newlines, replace all newlines with spaces, split the resulting string on spaces, and iterate over the resulting sequence. For each word, use the word as the dictionary key: if the key is not yet in the dictionary, add it with a count of one; otherwise increment its count.
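The steps described above could be sketched roughly as follows. This is only one reading of the description, written for Python 3; the name count_words is made up for illustration, and lowercasing the text first is an assumption carried over from the code in the first post.

```python
def count_words(text):
    # Keep letters, spaces and newlines; drop everything else.
    # Lowercasing is an assumption so that "This" and "this" count as one word.
    kept = [ch for ch in text.lower() if ch.isalpha() or ch in " \n"]
    # Turn newlines into spaces so there is a single separator to split on.
    cleaned = "".join(kept).replace("\n", " ")

    counts = {}
    for word in cleaned.split(" "):
        if not word:
            # Splitting on a single space yields empty strings
            # wherever two spaces were adjacent; skip those.
            continue
        if word in counts:
            counts[word] += 1   # word seen before: increment its count
        else:
            counts[word] = 1    # first time we see this word
    return counts
```

Calling count_words("This is a test sentence, to test the function.") should give a count of 2 for 'test' and 1 for every other word.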
Join Date: Oct 2009
Posts: 136
Solved Threads: 14
#4 Jan 29th, 2010
Also, building up a string piece by piece with the '+' operator, especially inside a loop, is generally discouraged. Strings are immutable objects, so each concatenation creates a new string object, which takes time and memory.
It is better to store each piece in a list until a concatenated string is needed, then join each piece using a string's "join" method on the list.
Python Syntax (Toggle Plain Text)
string1 = 'This'
string2 = 'is'
string3 = 'worse.'
final_string = string1 + ' ' + string2 + ' ' + string3
# final_string is 'This is worse.'

mylist = ['This', 'is', 'better.']
better_string = ' '.join(mylist)
# better_string is 'This is better.'
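Applied to the removePunctuation function from the first post, the same idea might look something like this. It is a sketch for Python 3, not anyone's posted code; remove_punctuation and kept are illustrative names.

```python
import string

def remove_punctuation(sentence):
    # Collect the characters we want to keep in a list...
    kept = []
    for char in sentence.lower():
        if char not in string.punctuation:
            kept.append(char)
    # ...and build the final string once, at the end.
    return "".join(kept)
```

The loop is the same as before; only the accumulation changes, from repeated string concatenation to a single join.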
Join Date: Nov 2009
Posts: 10
Solved Threads: 0
#5 Jan 30th, 2010
Python Syntax (Toggle Plain Text)
from string import *

def removePunctuation(sentence):
    sentence = lower(sentence)
    new_sentence = ""
    for char in sentence:
        if char not in punctuation:
            new_sentence = new_sentence + char
    return new_sentence

def wordFrequences(sentence):
    wordFreq = {}
    split_sentence = new_sentence.split()
    for word in split_sentence:
        wordFreq[word] = wordFreq.get(word,0) + 1
    wordFreq.items()
    print wordFreq

sentence = "The first test of the function"
new_sentence = removePunctuation(sentence)
wordFrequences(sentence)
Here is the corrected version.
Join Date: Dec 2006
Posts: 1,197
Solved Threads: 341
#6 Jan 30th, 2010
I think you want to pass new_sentence to the function rather than sentence (catching this kind of thing is one of the positive results of posting code).
Python Syntax (Toggle Plain Text)
def wordFrequences(new_sentence):
    wordFreq = {}    ## new_sentence was not defined
    split_sentence = new_sentence.split()
    for word in split_sentence:
        wordFreq[word] = wordFreq.get(word,0) + 1
    wordFreq.items()
    print wordFreq

sentence = "The first test of the function"
new_sentence = removePunctuation(sentence)
wordFrequences(new_sentence)
Last edited by woooee; Jan 30th, 2010 at 12:37 pm.
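For anyone reading this on Python 3, where print is a function and the string module no longer provides lower(), the whole program might be written along these lines. This is a hedged sketch rather than a drop-in replacement for the posted code: the snake_case names are illustrative, and returning the dictionary instead of printing it inside the function is a design choice made here, not something the thread settled on.

```python
import string

def remove_punctuation(sentence):
    # Lowercase the text and drop punctuation characters.
    return "".join(ch for ch in sentence.lower() if ch not in string.punctuation)

def word_frequencies(cleaned_sentence):
    # The function only uses its own parameter, never a global variable.
    word_freq = {}
    for word in cleaned_sentence.split():
        # dict.get returns 0 for a word not seen yet, so this counts words.
        word_freq[word] = word_freq.get(word, 0) + 1
    return word_freq

sentence = "The first test of the function"
result = word_frequencies(remove_punctuation(sentence))
print(result)
```

With this input, 'the' should be counted twice and every other word once.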