Hello everyone. This is my first post, so if something doesn't work, bear with me. I'm a beginner at Python and tried to write a program that calculates average word length. I wrote what I thought should work, but the final answer keeps coming out to an average of one no matter what I enter. Can someone help?

def main():
    print"This program will calculate average word length!"
    import string
    s = raw_input ("Enter a phrase: ")
    words = s.split()
    wordCount = len(words)
    for word in words:
        ch= len(word)
    avg = ch / wordCount
    if avg is 1:
        print "In the sentence", s , ", the average word length is", avg, "letter."
    else:
        print "In the sentence", s , ", the average word length is", avg, "letters."

main()


You are using 'avg is 1', this is like saying 'avg is True', which is True as long as avg is not zero, so you will get the first statement. You have to use 'avg == 1' or better 'avg > 1' ...

def main():
    print"This program will calculate average word length!"
    #s = raw_input ("Enter a phrase: ")
    # use temporary test string instead of input
    s = "this is a test string"
    words = s.split()
    wordCount = len(words)
    # start with zero characters
    ch = 0
    for word in words:
        # add up characters
        ch += len(word)  
    avg = ch / wordCount
    if avg == 1:
        print "In the sentence ", s , ", the average word length is", avg, "letter."
    else:
        print "In the sentence ", s , ", the average word length is", avg, "letters."

main()

result -->
This program will calculate average word length!
In the sentence  this is a test string , the average word length is 3 letters.

I'm confused. I agree with the change; == is almost always better than is. Still and all...

>>> a = 2
>>> a is 1
False
>>> a is 2
True
>>> if a is 1:
    print "a is 1!"
else:
    print "a is not 1!"

a is not 1!

This seems at odds with your analysis. Or am I missing something?



Been there, done that. :)

The reason Vega's code works and the original did not comes down to this line:

# original code
for word in words:
    ch= len(word)  # <<<---

I think, RoadPh, that you wanted to add up the total lengths of words. But because you don't actually do any adding, the effect is just that ch gets clobbered every time with the len(next word). By the time you exit the loop, ch is just the length of the last word.
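
To make the difference concrete, here is a minimal sketch that runs both loops on the test string from the corrected program. The function names are made up for illustration, and it uses print() calls so it also runs on Python 3.

```python
def last_word_length(words):
    # Mirrors the original loop: ch is reassigned on every pass,
    # so only the final word's length survives.
    ch = 0
    for word in words:
        ch = len(word)
    return ch


def total_length(words):
    # Mirrors the corrected loop: ch accumulates every word's length.
    ch = 0
    for word in words:
        ch += len(word)
    return ch


words = "this is a test string".split()
print(last_word_length(words))  # 6, the length of "string" only
print(total_length(words))      # 17, i.e. 4 + 2 + 1 + 4 + 6
```

Dividing 6 by the word count of 5 with Python 2 integer division gives 1, which matches the symptom in the first post.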

Hope it helps,



I am very new to Python and to programming in general. I have been working on a text file, and I want to find the average length of the words in it. Well, to start with:

Let's say I have only one sentence in my text file (we can worry about the multiple sentences later). Here's the text:

"But Buffet wrote that he remains hopeful about the long-term prospects for his company and the nation despite the turmoil shaking the world's economies."

I can find the length of each word in this sentence by using the following code;

>>> myfile = open("c:/test/oneline.txt","r")
>>> for line in myfile:
...     words=line.split()
...     wordcounts = len(words)
...     for word in words:
...         lengthword = len(word)
...         print lengthword

which gives me
....and so on.

My problem is writing the rest of the code i.e. summing this up and dividing it by the total number of words.

Any help?

Thanks in advance.
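
One way the summing could go, as a rough sketch rather than a finished answer: keep two running totals across all lines and divide once at the end. The function name average_word_length and the sample lines are invented here, punctuation is counted as part of a word (as in the loop above), and the commented-out path is simply the one from the post.

```python
def average_word_length(lines):
    """Return the mean word length over an iterable of text lines."""
    total_chars = 0
    total_words = 0
    for line in lines:
        words = line.split()
        total_words += len(words)
        for word in words:
            total_chars += len(word)
    if total_words == 0:
        # No words at all: avoid dividing by zero.
        return 0.0
    # float() keeps the fraction even under Python 2 integer division.
    return float(total_chars) / total_words


sample = ["the quick brown fox", "jumps over"]
print(average_word_length(sample))

# With a real file (path taken from the post above):
# with open("c:/test/oneline.txt") as myfile:
#     print(average_word_length(myfile))
```

Because a file object yields its lines when iterated, the same function should handle multiple sentences without changes.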

This seems at odds with your analysis. Or am I missing something?

Python stores small numbers in a preallocated array, as that is more efficient, so every small number points to the same object in that array, which is why you get True for a=1. The "is" operator, though, tests whether two names refer to the same object, not whether they are equal. So
print a is 1
prints True because they both point to the same object. However, if you use a larger number (the cutoff is system dependent, but can be as small as 257), you will get False for the same code. The point here is to use == for equality, as "is" will not give consistent results. In fact, on my system I get different results with the same code run from the interactive prompt vs. running it in a .py file. From here http://docs.python.org/c-api/int.html

"The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object."

And for those who like to read http://svn.python.org/projects/python/trunk/Objects/intobject.c

"Integers are quite normal objects, to make object handling uniform.
(Using odd pointers to represent integers would save much space
but require extra checks for this special case throughout the code.)
Since a typical Python program spends much of its time allocating
and deallocating integers, these operations should be very fast.
Therefore we use a dedicated allocation scheme with a much lower
overhead (in space and time) than straight malloc(): a simple
dedicated free list, filled when necessary with memory from malloc()."
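
A small sketch of identity versus equality. It uses lists rather than integers because two separately built lists are always distinct objects, whereas whether two equal integers share an object depends on the interpreter, as described above. The variable names are only illustrative.

```python
first = [1, 2, 3]
second = [1, 2, 3]   # equal contents, but a separate object
alias = first        # another name for the same object

print(first == second)  # True: same contents
print(first is second)  # False: two distinct objects
print(first is alias)   # True: one object, two names

# For integers, only == is reliable. Whether "is" happens to agree
# depends on the interpreter's small-integer cache.
avg = 17 // 5
print(avg == 3)  # True
```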


woooee, you only missed this:
ararik pulled up an old thread to attach a homework question.

I already told him not to double post.
