Hi,

I am very new to Python and to programming in general. I want to find the average length of the words in a text file. To start with:

Let's say I have only one sentence in my text file (we can worry about the multiple sentences later). Here's the text:

"But Buffet wrote that he remains hopeful about the long-term prospects for his company and the nation despite the turmoil shaking the world's economies."

I can find the length of each word in this sentence by using the following code:

>>> myfile = open("c:/test/oneline.txt","r")
>>> for line in myfile:
... words=line.split()
... wordcounts = len(words)
... for word in words:
... lengthword = len(word)
... print lengthword

which gives me
3
6
5
4
2
7
7
....and so on.

My problem is writing the rest of the code, i.e. summing these lengths up and dividing the result by the total number of words.

Any help?

Thanks in advance.


First of all, please read this. It is at the very top of the forum. The indentation cannot be reconstructed from your post, so your code can only be guessed at.

I don't think your problem is that you are new to Python. That solution would have the same problem in any language.

How would you describe your algorithm in plain words?
You want to divide the sum of the word lengths by the sum of the word pieces (the number of words).

sum_of_wordlengths = 0
sum_of_wordpieces = 0
for each line in the file:
   sum_of_wordlengths = sum_of_wordlengths + total length of the words in the line
   sum_of_wordpieces = sum_of_wordpieces + number of words in the line
print sum_of_wordlengths / sum_of_wordpieces

Your code does not sum up the pieces and the lengths; it just stores and prints the last length.
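
A minimal Python sketch of the pseudocode above, assuming the lines come from any iterable (a list here; an open file would work the same way). The function name average_word_length is made up for illustration, and the sample line is the sentence from the question. Note that split() only breaks on whitespace, so punctuation attached to a word counts toward its length.

```python
def average_word_length(lines):
    """Return the mean word length over all lines (hypothetical helper)."""
    sum_of_wordlengths = 0
    sum_of_wordpieces = 0
    for line in lines:
        words = line.split()
        # add the lengths of all words in this line
        sum_of_wordlengths += sum(len(word) for word in words)
        # add the number of words in this line
        sum_of_wordpieces += len(words)
    # float() forces true division on Python 2 as well as Python 3
    return sum_of_wordlengths / float(sum_of_wordpieces)


sample = ["But Buffet wrote that he remains hopeful about the long-term "
          "prospects for his company and the nation despite the turmoil "
          "shaking the world's economies."]
print(average_word_length(sample))
```

The two running totals are kept outside the loop, so they survive from one line to the next; that is the piece missing from the code in the question.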

ararik, please do not double post!

You are almost there. As slate says, simply sum up the word counts and the word lengths:

filename = "c:/test/oneline.txt"  # path taken from the question
wordcount_sum = 0
wordlength_sum = 0
# the with block closes the file automatically when the loop is done
with open(filename, "r") as myfile:
    for line in myfile:
        words = line.split()
        # sum up the word counts
        wordcount_sum += len(words)
        for word in words:
            # sum up the word lengths
            wordlength_sum += len(word)

# invoke floating point division for Python versions < 3.0
wordlength_average = wordlength_sum/float(wordcount_sum)
print(wordlength_average)
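
One caveat with the division above: if the file is empty or holds only whitespace, wordcount_sum stays 0 and the division raises ZeroDivisionError. Here is a hedged sketch of one way to guard against that. It reads from an in-memory io.StringIO instead of a real file so it runs anywhere; the function name safe_average and the choice to return 0.0 for an empty input are assumptions for illustration, not part of the answer above.

```python
import io


def safe_average(myfile):
    """Average word length, or 0.0 when there are no words (assumed convention)."""
    wordcount_sum = 0
    wordlength_sum = 0
    for line in myfile:
        words = line.split()
        wordcount_sum += len(words)
        wordlength_sum += sum(len(word) for word in words)
    if wordcount_sum == 0:
        # nothing to average, so avoid dividing by zero
        return 0.0
    return wordlength_sum / float(wordcount_sum)


# "one", "three", "five": 3 + 5 + 4 = 12 characters over 3 words
print(safe_average(io.StringIO(u"one three\nfive\n")))
# empty input falls back to 0.0 instead of raising
print(safe_average(io.StringIO(u"")))
```

Whether 0.0, None, or an exception is the right answer for an empty file depends on what the caller expects, so treat the fallback as a placeholder.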