Hi everybody,

I'm new to Python and this forum has been a great help so far, so I hope someone is willing to help me with the following problem.
I can count the number of sentences and words in a text, but after I have split a text into sentences, I do not know how to count the words in each sentence separately so that I can calculate the standard deviation.

Grts

You can use a regex to count the words:

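import re

# A word is any run of ASCII letters (digits and apostrophes split words).
word_re = re.compile(r"[A-Za-z]+")

def count_words(sentence):
    # subn() returns (new_string, number_of_substitutions); the number
    # of substitutions equals the number of words matched.
    return word_re.subn('', sentence)[1]

print(count_words("Give me bacon and eggs, said the other man."))  # prints 9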

To extract the sentences of a text into a list, you have to establish some rules. A simple rule might be that every sentence ends with one of the characters '.', '?' or '!'.
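
As a rough sketch of that rule (not the code from the linked post), one could split the text wherever '.', '?' or '!' is followed by whitespace. The names sentence_end_re and split_sentences are made up for this illustration, and the rule will mis-split abbreviations such as "e.g. this".

```python
import re

# Assumed rule: a sentence ends at '.', '?' or '!' followed by whitespace.
# The lookbehind keeps the punctuation attached to its sentence.
sentence_end_re = re.compile(r"(?<=[.?!])\s+")

def split_sentences(text):
    """Return the non-empty sentences of text (hypothetical helper)."""
    return [s.strip() for s in sentence_end_re.split(text) if s.strip()]

text = "Give me bacon and eggs, said the other man. Why? Because I am hungry!"
for sentence in split_sentences(text):
    print(sentence)
```

Text with quoted speech or decimal numbers would need a more careful rule.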

Now you can extract the text's sentences. For a typical example, see:
http://www.daniweb.com/forums/showthread.php?p=1175950#post1175950
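
To get from there to the standard deviation asked about in the question, here is a minimal sketch that counts the words in each sentence and passes the counts to the standard library's statistics module. It assumes the same simple rules as above (words are runs of ASCII letters, sentences end in '.', '?' or '!'); the function name words_per_sentence is only for illustration.

```python
import re
import statistics

word_re = re.compile(r"[A-Za-z]+")
# Assumed rule: split after '.', '?' or '!' when followed by whitespace.
sentence_end_re = re.compile(r"(?<=[.?!])\s+")

def words_per_sentence(text):
    """Return a list with the word count of each sentence (hypothetical helper)."""
    sentences = [s for s in sentence_end_re.split(text.strip()) if s]
    return [len(word_re.findall(s)) for s in sentences]

text = "Give me bacon and eggs, said the other man. Why? Because I am hungry!"
counts = words_per_sentence(text)
print(counts)                      # prints [9, 1, 4]
print(statistics.mean(counts))     # average sentence length in words
print(statistics.pstdev(counts))   # population standard deviation
```

statistics.pstdev treats the sentences as the whole population; if they are a sample of a larger text, statistics.stdev (which divides by n - 1) is probably the one you want.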

Thanks for your help! I have been able to calculate the average word and sentence length and their standard deviations, but now I am stuck on calculating the standard deviation of the type-token ratio :-s

Grts
