Hi,

I am very new to Python and to programming in general. I am working with a text file and want to find the average length of the words in it. To start with:

Let's say I have only one sentence in my text file (we can worry about the multiple sentences later). Here's the text:

"But Buffet wrote that he remains hopeful about the long-term prospects for his company and the nation despite the turmoil shaking the world's economies."

I can find the length of each word in this sentence by using the following code:

>>> myfile = open("c:/test/oneline.txt","r")
>>> for line in myfile:
... words=line.split()
... wordcounts = len(words)
... for word in words:
... lengthword = len(word)
... print lengthword

which gives me
3
6
5
4
2
7
7
....and so on.

My problem is writing the rest of the code, i.e. summing these lengths and dividing by the total number of words.

Any help?

Thanks in advance.

All 3 Replies

First of all, please read this. It is at the very top of the forum. Your indentation cannot be reconstructed from your post, so your code can only be guessed at.

I do not think your problem is that you are new to Python. That solution would have the same problem in any language.

How would you describe your algorithm in plain words?
You want to divide the sum of the word lengths by the total count of words.

sum_of_wordlengths = 0
sum_of_wordpieces = 0
for each line in the file:
   sum_of_wordlengths = sum_of_wordlengths + total length of the words in the line
   sum_of_wordpieces = sum_of_wordpieces + number of words in the line
print sum_of_wordlengths / sum_of_wordpieces

Your code does not sum up the counts and the lengths; it just stores and prints the last length.
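
To make the difference concrete, here is a minimal sketch that uses a hard-coded list of words instead of the file. The names `last_length` and `total_length` are made up for illustration. Plain assignment keeps only the most recent value, while `+=` builds a running total:

```python
# Hypothetical sample data: the first three words of the sentence in the question.
words = ["But", "Buffet", "wrote"]

last_length = 0
total_length = 0
for word in words:
    last_length = len(word)    # overwritten on every pass through the loop
    total_length += len(word)  # accumulates across passes

print(last_length)   # length of the final word only
print(total_length)  # sum of all the word lengths
```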

ararik, please do not double post!

You are almost there. As slate says, simply sum up the word counts and the word lengths:

filename = "c:/test/oneline.txt"  # path taken from the question
wordcount_sum = 0
wordlength_sum = 0
# the with block closes the file automatically
with open(filename, "r") as myfile:
    for line in myfile:
        words = line.split()
        # sum up the word counts
        wordcount_sum += len(words)
        for word in words:
            # sum up the word lengths
            wordlength_sum += len(word)

# invoke floating point division for Python versions < 3.0
wordlength_average = wordlength_sum/float(wordcount_sum)
print(wordlength_average)
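
The same loop can also be wrapped in a small function so it can be tried without a file. This is only a sketch, assuming Python 3 (where `/` is already floating-point division). The name `average_word_length` and the choice to return 0.0 when there are no words are assumptions for illustration, not something from the posts above. Like the code above, it counts punctuation attached to a word as part of that word.

```python
def average_word_length(lines):
    """Return the mean word length over an iterable of text lines."""
    wordcount_sum = 0
    wordlength_sum = 0
    for line in lines:
        words = line.split()  # split on any run of whitespace
        wordcount_sum += len(words)
        wordlength_sum += sum(len(word) for word in words)
    if wordcount_sum == 0:
        return 0.0  # assumption: treat "no words" as an average of 0.0
    return wordlength_sum / wordcount_sum


# Hypothetical two-line input built from the start of the sample sentence.
print(average_word_length(["But Buffet wrote", "that he"]))
```

Since iterating over an open file yields its lines, a file object opened with `open("c:/test/oneline.txt")` could be passed to the function directly in place of the list.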

Thank you slate and sneekula!
