I am trying to compute some lexical statistics from a given text: for instance, the number of lines, sentences, and words, and how many times a given character occurs. Can anyone help me with this?

Yes, this is easy to do in Python. A quick way to get the number of lines is to read the file with readlines(), which returns the contents as a list. Each element of the list is a single line of the file, so the number of lines is simply the len() of that list.
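A minimal sketch of the readlines approach might look like the following. The function name `count_lines` and the throwaway sample file are assumptions made for illustration only; point the function at your own file path instead.

```python
import os
import tempfile


def count_lines(path):
    """Return the number of lines in the text file at `path`."""
    with open(path, encoding="utf-8") as handle:
        # readlines() returns a list with one element per line
        return len(handle.readlines())


# Write a small sample file so the sketch runs as-is (illustrative data only).
sample = "First line.\nSecond line.\nThird line, the last.\n"
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

line_count = count_lines(sample_path)
os.remove(sample_path)  # clean up the temporary file
print(line_count)  # 3
```

Note that readlines() loads the whole file into memory. For very large files, iterating over the file object and counting as you go should give the same result with less memory.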

You can also look into the code snippet called "Story Statistics", which creates dictionaries (Python containers) of the words and characters in a story text. It is just amazing what you can do with those dictionaries.

See: http://www.daniweb.com/code/snippet228125.html
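For a rough idea of the dictionary approach, here is a short sketch. It is not the code from the linked snippet: the function name `text_statistics`, the sample text, and the simple rules for what counts as a word or a sentence are all assumptions. It uses collections.Counter, a dictionary subclass from the standard library that maps each item to its count.

```python
import re
from collections import Counter


def text_statistics(text):
    """Collect simple lexical statistics for `text` (naive rules, see comments)."""
    # Assumption: a word is a run of letters or apostrophes, case-insensitive.
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Assumption: a sentence ends at one or more of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "lines": len(text.splitlines()),
        "sentences": len(sentences),
        "words": len(words),
        "word_freq": Counter(words),  # word -> number of occurrences
        "char_freq": Counter(text),   # character -> number of occurrences
    }


story = "The cat sat. The cat ran!\nDid the dog see the cat?"
stats = text_statistics(story)

print(stats["lines"], stats["sentences"], stats["words"])  # 2 3 12
print(stats["word_freq"]["the"])  # 4
print(stats["char_freq"]["t"])    # 6 (character counts are case-sensitive)
```

The sentence rule is deliberately naive: abbreviations such as "e.g." or decimal numbers would be split incorrectly, so treat the sentence count as an approximation.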
