
Code flow doesn't work?

Hello everyone, I am trying to process various texts with regular expressions and Python's NLTK (documented at http://www.nltk.org/book). I am trying to create a random text generator and I am having a slight problem. First, here is my code flow:

1) Enter a sentence as input. This is called the trigger string and is assigned to a variable.

2) Get the longest word in the trigger string.

3) Search the whole Project Gutenberg corpus for sentences that contain this word, ignoring upper and lower case.

4) Return the longest sentence that contains the word from step 3.

5) Append the sentence from step 4 to the sentence from step 1.

6) Assign the sentence from step 4 as the new trigger sentence and repeat the process. Note that I then have to get the longest word in this second sentence, and so on.

So far, I have been able to do this only once. When I try to keep it going, the program just keeps printing the first sentence my search yields. It should instead look for the longest word in this new sentence and keep applying the code flow described above. Below is my code along with a sample input/output:

import nltk
from nltk.corpus import gutenberg
triggerSentence = raw_input("Please enter the trigger sentence: ")#get input str
split_str = triggerSentence.split()#split the sentence into words
longestLength = 0
longestString = ""

montyPython = 1

while montyPython:

    #code to find the longest word in the trigger sentence input
    for piece in split_str:
        if len(piece) > longestLength:
            longestString = piece
            longestLength = len(piece)


    listOfSents = gutenberg.sents() #all sentences of gutenberg are assigned -list of list format-
    
    listOfWords = gutenberg.words()# all words in gutenberg books -list format-
    
    lt = longestString.lower() #lowercase the longest word so the comparison below is case-insensitive

    longestSentence = max((listOfWords for listOfWords in listOfSents if any(lt == word.lower() for word in listOfWords)), key = len)
    #get longest sentence -list format with every word of sentence being an actual element-

    longestSent=[longestSentence]

    for word in longestSent:#convert the list longestSentence to an actual string
        sstr = " ".join(word)
    print triggerSentence + " "+ sstr
    triggerSentence = sstr


Sample input: "Thane of code"
Sample output:"Thane of code Norway himselfe , with terrible numbers , Assisted by that most disloyall Traytor , The Thane of Cawdor , began a dismall Conflict , Till that Bellona ' s Bridegroome , lapt in proofe , Confronted him with selfe - comparisons , Point against Point , rebellious Arme ' gainst Arme , Curbing his lauish spirit : and to conclude , The Victorie fell on vs"

Now this should actually take the sentence that starts with 'Norway himselfe....', look for the longest word in it, repeat the steps above, and so on, but it doesn't. Any suggestions? Thanks.

koveras vehcna
Newbie Poster
24 posts since Aug 2010
Reputation Points: 10
Solved Threads: 0
 

Line 10 (while montyPython:) is a problem: How do you ever quit? It should probably look something like while split_str: (Or: What is your halt condition?)

Line 33 (triggerSentence = sstr) is also a problem: split_str is created only once, before your while loop starts, so it never changes. You need to add a line 34 right after it: split_str = sstr.split()


I have not paid any attention to whether this code would do what it should other than spotting the obvious problems, so no guarantees.
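A minimal sketch of both fixes, not a drop-in replacement for the code above. It assumes the corpus is passed in as a list of tokenized sentences (the shape gutenberg.sents() returns), and it uses a small hypothetical SAMPLE_SENTS list so it runs without downloading anything. The helper names and the max_rounds cap are illustrative. The sketch also finds the longest word from scratch each round, because in the posted code longestLength is never reset, so a shorter word in a later sentence could never replace the old one.

```python
def longest_word(words):
    """Return the longest word in a list of words (first one wins on ties)."""
    return max(words, key=len) if words else ""


def longest_sentence_with(word, sentences):
    """Return the longest tokenized sentence containing word (case-insensitive), or None."""
    target = word.lower()
    matches = [s for s in sentences if any(target == w.lower() for w in s)]
    return max(matches, key=len) if matches else None


def chain(trigger, sentences, max_rounds=5):
    """Repeat the trigger -> longest word -> longest sentence flow.

    Halt conditions (assumed, the original post does not define any):
    max_rounds reached, no sentence matches, or a sentence repeats.
    """
    output = [trigger]
    seen = set()
    split_str = trigger.split()
    for _ in range(max_rounds):
        found = longest_sentence_with(longest_word(split_str), sentences)
        if found is None:
            break
        sstr = " ".join(found)
        if sstr in seen:
            break  # we would only print the same sentence again
        seen.add(sstr)
        output.append(sstr)
        split_str = found  # the missing step: search the NEW sentence next round
    return output


# Hypothetical stand-in for gutenberg.sents()
SAMPLE_SENTS = [
    ["The", "Thane", "of", "Cawdor", "began", "a", "dismall", "Conflict"],
    ["A", "Conflict", "is", "here"],
    ["Thane", "speaks"],
]

result = chain("Thane of code", SAMPLE_SENTS)
```

With this sample data the chain stops after one round: the longest word of the sentence it finds is "Conflict", and the longest sentence containing "Conflict" is that same sentence. This can happen with the real corpus too, which is why some halt condition is needed.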

griswolf
Veteran Poster
1,165 posts since Apr 2010
Reputation Points: 344
Solved Threads: 256
 
longestSentence = max((listOfWords for listOfWords in listOfSents if any(lt == word.lower() for word in listOfWords)), key = len)
    #get longest sentence -list format with every word of sentence being an actual element-
 
    longestSent=[longestSentence]
 
    for word in longestSent:#convert the list longestSentence to an actual string


Break down the list comprehension into something readable. Perhaps split on the sentence break, ". ", and send each sentence to a function that checks for the trigger word and returns the length of the sentence if the trigger is found, or zero if it is not.
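One way that suggestion could look, as a rough sketch. It assumes the sentences are already lists of words, as gutenberg.sents() provides them, so the splitting on ". " is skipped here. The names sentence_score and find_longest_sentence are made up for illustration, and the sample list is a hypothetical stand-in for the corpus.

```python
def sentence_score(sentence, trigger_word):
    """Return the number of tokens in sentence if it contains trigger_word
    (compared case-insensitively), otherwise 0."""
    target = trigger_word.lower()
    for word in sentence:
        if word.lower() == target:
            return len(sentence)
    return 0


def find_longest_sentence(sentences, trigger_word):
    """Return the highest-scoring sentence, or None if no sentence has the word."""
    best, best_score = None, 0
    for sentence in sentences:
        score = sentence_score(sentence, trigger_word)
        if score > best_score:
            best, best_score = sentence, score
    return best


# Hypothetical stand-in for gutenberg.sents()
sample = [
    ["Thane", "speaks"],
    ["The", "Thane", "of", "Cawdor", "began", "a", "dismall", "Conflict"],
    ["Nothing", "relevant", "in", "this", "rather", "long", "sentence", "at", "all"],
]

longest = find_longest_sentence(sample, "thane")
```

Unlike calling max() on a generator, this version returns None instead of raising ValueError when the word does not occur anywhere, which gives the outer loop an easy halt condition to test.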

woooee
Nearly a Posting Maven
2,454 posts since Dec 2006
Reputation Points: 777
Solved Threads: 714
 

That, I will work on. Thanks for the tip.

koveras vehcna
Newbie Poster
24 posts since Aug 2010
Reputation Points: 10
Solved Threads: 0
 

This question has already been solved
