Join Date: May 2008
Posts: 35
Reputation:
Rep Power: 1
Solved Threads: 1
string = "hi"
f = open("filename.txt")
contents = f.read()
f.close()
# count() tallies every occurrence of the substring held in `string`
print("Number of '%s' in your file is: %d" % (string, contents.count(string)))
Basically, replace "hi" with the word you want to count.
Feel free to ask any more questions!
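The same idea can be wrapped in a small function. This is only a sketch: the function name `count_substring_in_file` and the temporary demo file are assumptions made for illustration, not part of the original post. It uses a `with` block so the file is closed even if reading fails.

```python
import os
import tempfile


def count_substring_in_file(path, needle):
    """Return how often `needle` occurs in the file at `path`.

    Like the post above, this counts substrings, not whole words.
    """
    with open(path) as f:  # the file is closed automatically
        contents = f.read()
    return contents.count(needle)


# Demo on a throwaway file (hypothetical content, just for illustration).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hi there, this is a history file\n")
    demo_path = tmp.name

demo_count = count_substring_in_file(demo_path, "hi")
os.remove(demo_path)
print("Number of 'hi' in your file is: %d" % demo_count)
```

The demo reports 3 because "this" and "history" each contain "hi" as well.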
Last edited by Shadow14l : Jun 9th, 2008 at 3:20 pm.
There is a problem with count(): it counts substrings, so "hi" inside a longer word is counted too, as shown below:
string = "hi"
# test text
text = "hi, I am a history buff with a hideous hidrosis history"
print("Number of '%s' in your file is: %d" % (string, text.count(string)))
"""
my result -->
Number of 'hi' in your file is: 5
"""
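One possible way around the substring problem is a regular expression with word boundaries. The sketch below is not from the thread; the helper name `count_whole_word` is hypothetical, and it assumes the search word consists of letters only, since `\b` treats letters, digits and underscores as word characters.

```python
import re


def count_whole_word(text, word):
    """Count case-insensitive occurrences of `word` as a whole word."""
    # re.escape guards against characters that are special in a regex
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


text = "hi, I am a history buff with a hideous hidrosis history"
print(text.count("hi"))              # substring matches: 5
print(count_whole_word(text, "hi"))  # whole-word matches: 1
```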
Last edited by Ene Uran : Jun 9th, 2008 at 4:28 pm.
drink her pretty
Join Date: Jun 2008
Posts: 8
Reputation:
Rep Power: 0
Solved Threads: 2
Hi atsuko,
Here is my version of a word finder. It has some serious limitations: it cannot search for words such as "It's" because it strips punctuation, and it cannot search for multiple words at one time. At least in my tests, though, it found the word I was looking for the correct number of times. Here is an example of it working...
paragraph = '''This is a test sentence. We will look for hi in this sentence.
If we find 'hi', we want to keep count of it. Remember, hi
can be Hi or hi. Hi can also have characters before or
after it ie (hi. hi, hi:). There should be a total of 10 'hi'
in this sentence, not any more for words like 'this' or
'hippo' or 'hiccups' '''
word = 'hi'

# strip ASCII 33-63: punctuation and digits
for x in range(33, 64):
    char = chr(x)
    paragraph = paragraph.replace(char, '')
# strip ASCII 91-96: [ \ ] ^ _ `
for x in range(91, 97):
    char = chr(x)
    paragraph = paragraph.replace(char, '')

listParagraph = paragraph.split()
# keep only the words with the same length as the search word
reducedList = [n for n in listParagraph if len(n) == len(word)]
reducedParagraph = ' '.join(reducedList)
reducedParagraph = reducedParagraph.lower()
count = reducedParagraph.count(word.lower())
if count == 0:
    print("The word '%s' does not occur in this section of text." % word)
else:
    print("The word '%s' occurs %d times in this section." % (word, count))
I read an article online about tokenization, but at my current knowledge level I couldn't really do anything with it. From what I read, though, it is a more complicated but more exact way of finding words.
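For what it's worth, a very simple form of tokenization needs only the standard library. The sketch below is an assumption about what such an approach could look like, not the method from the article mentioned above: the `tokenize` helper and the sample sentence are made up for illustration, and it only handles ASCII letters. It keeps apostrophes inside words, so "It's" survives, and `collections.Counter` lets you look up several words at once.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words (ASCII letters only)."""
    raw = re.findall(r"[a-z']+", text.lower())
    # Drop quote marks wrapped around a word ('hi' -> hi) but keep
    # apostrophes inside a word (it's stays it's).
    tokens = [t.strip("'") for t in raw]
    return [t for t in tokens if t]


sample = "It's true: hi can be Hi or 'hi', but not this or hippo. It's fine."
counts = Counter(tokenize(sample))
for w in ("hi", "it's", "hippo"):
    print("The word '%s' occurs %d times" % (w, counts[w]))
```

Because every token is counted in a single pass, checking another word is just one more dictionary lookup on `counts`.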
Last edited by Dunganb : Jun 10th, 2008 at 8:16 pm. Reason: Made the code a little nicer...
Join Date: Dec 2006
Posts: 450
Reputation:
Rep Power: 2
Solved Threads: 62
You could also replace the following:
reducedList = [n for n in listParagraph if len(n) == len(word)]
## replace with this
reduced_list = [n for n in listParagraph if word == n.lower()]
if len(reduced_list):
    print("The word '%s' occurs %d times in this section." % (word, len(reduced_list)))
else:
    print("The word '%s' does not occur in this section of text." % word)
#
# anyway, here is another solution
#
paragraph = '''This is a test sentence. We will look for hi in this sentence.
If we find 'hi', we want to keep count of it. Remember, hi
can be Hi or hi. Hi can also have characters before or
after it ie (hi. hi, hi:). There should be a total of 10 'hi'
in this sentence, not any more for words like 'this' or
'hippo' or 'hiccups' '''
word = 'hi'
p_list = paragraph.split()
ctr = 0
for p_word in p_list:
    p_word = p_word.lower()
    if p_word == word:
        ctr += 1
    elif not p_word.isalpha():  ## has non-alpha characters
        new_word = ""
        for ch in p_word:  ## keep only the letters
            if ch.isalpha():
                new_word += ch
        if new_word == word:
            ctr += 1
print("The word '%s' was found %d times" % (word, ctr))
Last edited by woooee : Jun 10th, 2008 at 10:41 pm.
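The letter-filtering approach above can also be packaged as a reusable function. This is just a sketch; the names `letters_only` and `count_word` and the sample sentence are hypothetical. Note one side effect of dropping every non-letter: "it's" turns into "its" and "hi-fi" into "hifi", so those forms are compared without their punctuation.

```python
def letters_only(token):
    """Return `token` with every non-letter character removed."""
    return "".join(ch for ch in token if ch.isalpha())


def count_word(text, word):
    """Count whitespace-separated tokens equal to `word`, ignoring case
    and any punctuation attached to the token."""
    target = word.lower()
    return sum(1 for token in text.split()
               if letters_only(token).lower() == target)


sample = "Say hi. Hi, (hi:) and 'hi' count; this, hippo and hi-fi do not."
print("The word '%s' was found %d times" % ("hi", count_word(sample, "hi")))
```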