Hello everyone, I am currently teaching myself language processing using the NLTK book (found at http://www.nltk.org/book), and I have a problem.
The following code retrieves every sentence in Shakespeare's Macbeth as a list of sentences, where each sentence is itself a list of word tokens (so a list of lists of strings):
from nltk.corpus import gutenberg
mySents = gutenberg.sents('shakespeare-macbeth.txt')
For example, mySents[0:5] results in:
[['[', 'The', 'Tragedie', 'of', 'Macbeth', 'by', 'William', 'Shakespeare', '1603', ']'], ['Actus', 'Primus', '.'], ['Scoena', 'Prima', '.'], ['Thunder', 'and', 'Lightning', '.'], ['Enter', 'three', 'Witches', '.']]
The output above contains the first 5 sentences of Macbeth.
My problem is that I want to convert the sentences of books in Project Gutenberg to uppercase so I can perform a search without worrying about case sensitivity. So I should be able to get output like the following:
[['[', 'THE', 'TRAGEDIE', 'OF', 'MACBETH', 'BY', 'WILLIAM'... and so on
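To make the goal concrete, here is a minimal sketch of the transformation I have in mind, run on a small sample copied by hand from the output above instead of the full corpus. The names upper_sents and find_sents are hypothetical helpers made up for illustration; they are not part of NLTK. I am assuming the same approach would carry over to mySents, since it iterates as lists of strings.

```python
def upper_sents(sents):
    """Return a new list of sentences with every token uppercased."""
    return [[token.upper() for token in sent] for sent in sents]


def find_sents(sents, word):
    """Return the sentences containing `word`, ignoring case."""
    target = word.upper()
    return [sent for sent in sents
            if any(token.upper() == target for token in sent)]


# Sample copied from the mySents[0:5] output (not the full corpus).
sample = [
    ['[', 'The', 'Tragedie', 'of', 'Macbeth', 'by', 'William',
     'Shakespeare', '1603', ']'],
    ['Actus', 'Primus', '.'],
    ['Enter', 'three', 'Witches', '.'],
]

upper = upper_sents(sample)
print(upper[0][:7])
# ['[', 'THE', 'TRAGEDIE', 'OF', 'MACBETH', 'BY', 'WILLIAM']

# Case-insensitive search without modifying the original tokens.
print(find_sents(sample, 'witches'))
# [['Enter', 'three', 'Witches', '.']]
```

The second helper compares uppercased tokens on the fly, so the original capitalization is kept in whatever the search returns.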
Any help would be appreciated, thanks.