Count given word

Question

atsuko 0 Newbie Poster

17 Years Ago

Hi there,

I am new to python.
Can somebody tell me how I can count a given word in a file?
I found lots of solutions for counting all the words in a file, but none for one particular word.

Thanks in advance

python


All 9 Replies

Shadow14l 0 Light Poster

17 Years Ago

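string = "hi"

# read the whole file into one string
f = open("filename.txt")
contents = f.read()
f.close()

# count the search string itself rather than a hard-coded "hi"
print("Number of '%s' in your file is: %d" % (string, contents.count(string)))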

Replace "hi" with the word you want to count basically.

Feel free to ask any more questions!

Ene Uran 638 Posting Virtuoso

17 Years Ago

There is a problem with count() as shown below:

string = "hi"

# test text
text = "hi, I am a history buff with a hideous hidrosis history"

print("Number of '%s' in your file is: %d" % (string, text.count(string)))

"""
my result -->
Number of 'hi' in your file is: 5
"""

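One way around the substring problem, as a minimal sketch: split the text on whitespace, strip the punctuation stuck to each piece, and compare whole words. The helper name count_word is made up for this illustration, and the sketch assumes words are separated by whitespace.

```python
import string

def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text.

    count_word is a hypothetical helper for illustration only.
    """
    target = word.lower()
    total = 0
    for piece in text.split():
        # drop punctuation attached to either end, e.g. "hi," -> "hi"
        if piece.strip(string.punctuation).lower() == target:
            total += 1
    return total

text = "hi, I am a history buff with a hideous hidrosis history"
print(text.count("hi"))         # substring matches: 5
print(count_word(text, "hi"))   # whole-word matches: 1
```

Punctuation inside a word (as in "It's") is left alone, since strip() only touches the ends.
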
woooee 814 Nearly a Posting Maven

17 Years Ago

You could also replace the following:

reducedList = [n for n in listParagraph if len(n) == len(word)]
## replace with this
reduced_list = [n for n in list_paragraph if word == n.lower()]
if reduced_list:
    print("The word '%s' occurs %d times in this section." % (word, len(reduced_list)))
else:
    print("The word '%s' does not occur in this section of text." % word)
#
# anyway, here is another solution
#
paragraph = '''This is a test sentence.  We will look for hi in this sentence.
If we find 'hi', we want to keep count of it.  Remember, hi
can be Hi or hi.  Hi can also have characters before or
after it ie (hi. hi, hi:).  There should be a total of 10 'hi'
in this sentence, not any more for words like 'this' or
'hippo' or 'hiccups' '''

word = 'hi'
p_list = paragraph.split()
ctr = 0
for p_word in p_list:
    p_word = p_word.lower()
    if p_word == word:
        ctr += 1
    elif not p_word.isalpha():     ## has non-alpha characters
        # keep only the letters, e.g. "(hi." -> "hi"
        new_word = ""
        for ch in p_word:          # 'ch' avoids shadowing the built-in chr()
            if ch.isalpha():
                new_word += ch
        if new_word == word:
            ctr += 1
print("The word '%s' was found %d times" % (word, ctr))

atsuko 0 Newbie Poster · Answer 1 · 2008-06-10T01:27:43+00:00

Thank you so much!!!

Dunganb 0 Newbie Poster · Answer 2 · 2008-06-11T06:04:48+00:00

Hi atsuko,

Here is my version of a word finder. It has some serious limitations: it cannot search for words such as "It's" because it strips out the punctuation, and it cannot search for multiple words at one time. But at least in my tests it found the word I was looking for the correct number of times. Here is an example of it working...

paragraph = '''This is a test sentence.  We will look for hi in this sentence.
If we find 'hi', we want to keep count of it.  Remember, hi
can be Hi or hi.  Hi can also have characters before or
after it ie (hi. hi, hi:).  There should be a total of 10 'hi'
in this sentence, not any more for words like 'this' or
'hippo' or 'hiccups' '''

word = 'hi'

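# strip ASCII codes 33-63: punctuation from ! through ?, which also covers the digits
for x in range(33,64):
    char = chr(x)
    paragraph = paragraph.replace(char, '')

# strip ASCII codes 91-96: [ \ ] ^ _ `
for x in range(91,97):
    char = chr(x)
    paragraph = paragraph.replace(char, '')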

listParagraph = paragraph.split()
reducedList = [n for n in listParagraph if len(n) == len(word)]
reducedParagraph = ' '.join(reducedList)
reducedParagraph = reducedParagraph.lower()

count = reducedParagraph.count(word.lower())
if count == 0:
    print("The word '%s' does not occur in this section of text." % word)
else:
    print("The word '%s' occurs %d times in this section." % (word, count))

I read an article online about tokenization, but at my current knowledge level I couldn't really do anything with it. From what I read, though, it is a more complicated but more exact way of finding words.
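
For anyone curious what tokenization could look like here, this is a rough sketch, not a full tokenizer: a regular expression pulls out runs of letters, keeps apostrophes that sit inside a word, and collections.Counter tallies every word at once. The function name tokenize and the sample sentence are made up for illustration, and the pattern assumes plain ASCII text.

```python
import re
from collections import Counter

def tokenize(text):
    """Return the lowercase words in text; apostrophes inside a word are kept."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

sample = "It's here. Hi, hi! 'hi' is not in this hippo, it's true."
counts = Counter(tokenize(sample))
print(counts["hi"])     # 3
print(counts["it's"])   # 2
```

Because the Counter holds every word, looking up several words costs nothing extra, and "It's" survives as a single token.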

pherro 0 Newbie Poster · Answer 3 · 2010-05-26T17:13:21+00:00

How can I develop a program in Pascal that counts the number of words in a paragraph?

sneekula 969 Nearly a Posting Maven · Answer 4 · 2010-05-26T18:21:00+00:00

How can I develop a program in Pascal that counts the number of words in a paragraph?

Pascal is a very old style language. I don't think it would come even close to Python's modern syntax and features.

You could try the Delphi/Python forum at DaniWeb.

The closest thing you could use to take advantage of modern language concepts is Python for Delphi:
http://mmm-experts.com/Products.aspx?ProductID=3

snippsat 661 Master Poster · Answer 5 · 2010-05-26T19:12:46+00:00

Some fun.

import re
print('Word was found %s times' % len(re.findall(r'\bhi\b', open('my_file.txt').read())))

Or a more readable version.

import re

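search_word = 'hi'
# \b matches a word boundary; re.escape protects any regex metacharacters in the word
comp = r'\b%s\b' % re.escape(search_word)
with open('my_file.txt') as f:     # the file is closed automatically
    my_file = f.read()
find_word = re.findall(comp, my_file)
print('Word was found %s times' % len(find_word))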
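
For a large file, the same word-boundary idea can be applied one line at a time instead of reading everything into memory. The following is only a sketch: count_word_in_lines and its ignore_case parameter are hypothetical names, and it assumes the search word starts and ends with a letter or digit so that \b behaves as expected.

```python
import re

def count_word_in_lines(lines, word, ignore_case=True):
    """Count whole-word matches of word across an iterable of lines."""
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile(r'\b%s\b' % re.escape(word), flags)
    return sum(len(pattern.findall(line)) for line in lines)

# An open file is an iterable of lines, so this also works:
# with open('my_file.txt') as handle:
#     print(count_word_in_lines(handle, 'hi'))
demo = ["Hi there, hi again.", "This hippo says hi: hi!"]
print(count_word_in_lines(demo, 'hi'))   # 4
```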

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 6 · 2010-05-26T22:22:59+00:00

Maybe I could adapt my earlier multimatcher to be more restrictive:

# multiple searches of a string for a substring
# using s.find(sub[ ,start[, end]])
import string

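def multis(search,text,start=0):
    """Yield each index where search occurs in text as a whole word."""
    while start>-1:
        f=text.find(search,start)
        start=f
        if start>-1:
            end=start+len(search)
            # guard both edges: text[-1] would wrap around to the last character,
            # and text[len(text)] would raise IndexError
            before_ok=(start==0 or text[start-1] not in string.ascii_letters)
            after_ok=(end==len(text) or text[end] not in string.ascii_letters)
            if before_ok and after_ok:
                yield f
            start+=1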

paragraph = '''This is a test sentence.  We will look for hi in this sentence.
If we find 'hi', we want to keep count of it.  Remember, hi
can be Hi or hi.  Hi can also have characters before or
after it ie (hi. hi, hi:).  There should be a total of 10 'hi'
in this sentence, not any more for words like 'this' or
'hippo' or 'hiccups' '''

word = 'hi'
print(paragraph)
print(word)

print("Searching %s:" % word)
for i in multis(word,paragraph):
    w,_,_ = paragraph[i:].partition(' ')
    print( "%s found at index %d: %s" % (word, i, w) )

Output:

This is a test sentence.  We will look for hi in this sentence.
If we find 'hi', we want to keep count of it.  Remember, hi
can be Hi or hi.  Hi can also have characters before or
after it ie (hi. hi, hi:).  There should be a total of 10 'hi'
in this sentence, not any more for words like 'this' or
'hippo' or 'hiccups' 
hi
Searching hi:
hi found at index 43: hi
hi found at index 76: hi',
hi found at index 121: hi
can
hi found at index 137: hi.
hi found at index 193: hi.
hi found at index 197: hi,
hi found at index 201: hi:).
hi found at index 239: hi'
in
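
The output above lists eight hits while the paragraph promises ten, because the search is case-sensitive and skips the two capitalized 'Hi'. A possible case-insensitive variant is sketched below with re.finditer; find_word is a hypothetical helper, not part of the multimatcher, and the short sample string is made up for illustration.

```python
import re

def find_word(text, word):
    """Return (index, matched text) for each whole-word, case-insensitive hit."""
    pattern = re.compile(r'\b%s\b' % re.escape(word), re.IGNORECASE)
    return [(m.start(), m.group()) for m in pattern.finditer(text)]

sample = "Hi or hi. This hippo, hi!"
for index, found in find_word(sample, 'hi'):
    print("%s found at index %d" % (found, index))
```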
