The project is about a debate. There are 4 candidates in the debate, and we need to group what they said into 4 different dicts and count the frequency of the words each one said. The debate transcript is line by line, and we should group the sentences by name tag, like PAUL:, and then all lines after it need to go into his dict until another name tag appears.

I have just finished the file-opening part, and I've been stuck on the grouping for hours.

import string
debate_line=open('debate.txt','rU')
big_list=[]
stop_word=open('stopWords.txt','rU')
word_set=set()
word_list=[]

name=['romney:','santorum:','gingrich:','paul:']

for line in debate_line:
    line=line.lower()
    line=line.strip(string.punctuation)
    big_list.append(line.split())

for item in big_list:
    if len(item)<1:
        big_list.remove(item)

for item in big_list:
    
    if item[0]=='paul':
        for word in item:
            if word in paul_dict:
                paul_dict[word]+=1
            else:
                paul_dict[word]=1

The last part just doesn't work at all.

All 2 Replies

Start with the simple basics: print big_list and make sure it contains what you think it does and is in the form you expect it to be in. You can also add a print statement after
for item in big_list:
to print each item, but it is basically the same thing.
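
As a rough illustration of that check, the sketch below runs the same lower/strip/split steps from the question on a few made-up lines. The sample text and the helper name `preprocess` are assumptions for the example; the real debate.txt may look different.

```python
import string

# Hypothetical sample lines standing in for debate.txt
sample_lines = [
    "PAUL: We should cut spending.\n",
    "\n",
    "And bring the troops home!\n",
]

def preprocess(lines):
    """Apply the same lower/strip/split steps as the question's first loop."""
    result = []
    for line in lines:
        line = line.lower()
        # strip() only removes punctuation at the two ends of the whole line
        line = line.strip(string.punctuation)
        result.append(line.split())
    return result

big_list = preprocess(sample_lines)
for item in big_list:
    print(item)
```

If the file looks anything like this sample, the printout shows two things: the first token of a tagged line is 'paul:' with the colon still attached, so comparing it with 'paul' never matches, and blank lines turn into empty lists, so item[0] would fail on them.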

Assuming the data is correct (and I'm not sure it is), try something along these lines

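name_list=['romney','santorum','gingrich','paul']
paul_dict={}    ## must exist before any word is counted

found=False
for item in big_list:
    if not item:    ## blank line, item[0] would raise an IndexError
        continue

    ## the tag may still carry its colon ("paul:"), so drop it before comparing
    tag=item[0].rstrip(':')

    if tag == "paul":
        found=True
        item=item[1:]    ## don't count the name tag itself

    ## if another name then stop processing
    elif tag in name_list:
        found=False

    if found:
        for word in item:
            if word in paul_dict:
                paul_dict[word]+=1
            else:
                paul_dict[word]=1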
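
The same flag idea can be stretched to cover all four candidates by remembering the current speaker instead of a single found flag. This is only a sketch under some assumptions: every turn starts with a first token ending in a colon, any such token counts as a speaker tag (so an unknown speaker such as a moderator ends the previous turn), and the names `count_words`, `counts`, the sample lines and the stop-word set are all made up for the example.

```python
import string
from collections import Counter

def count_words(lines, names, stop_words=()):
    """Return {name: Counter} with the words spoken by each candidate."""
    counts = {name: Counter() for name in names}
    current = None  # speaker whose turn we are in; None before the first tag
    for line in lines:
        words = line.lower().split()
        if not words:
            continue  # blank line
        if words[0].endswith(':'):
            # Assumption: a first token ending in ':' is always a name tag.
            tag = words[0].rstrip(':')
            # Unknown speakers (e.g. a moderator) are not counted.
            current = tag if tag in counts else None
            words = words[1:]
        if current is None:
            continue
        for word in words:
            # Strip punctuation per word, not per line
            word = word.strip(string.punctuation)
            if word and word not in stop_words:
                counts[current][word] += 1
    return counts

# Hypothetical sample data standing in for debate.txt and stopWords.txt
lines = [
    "PAUL: We should cut spending.",
    "",
    "Cut it now!",
    "ROMNEY: I agree with the cut.",
    "MODERATOR: Thank you.",
    "Next question.",
]
stop_words = {'we', 'should', 'it', 'i', 'with', 'the'}
counts = count_words(lines, ['romney', 'santorum', 'gingrich', 'paul'], stop_words)
print(counts['paul'])
print(counts['romney'])
```

Keeping one Counter per name in a single dict avoids writing the same loop four times, and counts['paul'].most_common(10) would then give his ten most frequent words.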