Counting Phrases?

Question

heyday21c 0 Newbie Poster

14 Years Ago

Hello,

I want to count phrases in a text file using Python. For instance, if the text is "I love you very much", I want to build a dictionary of two-word phrases like: "I love": 1, "love you": 1, "you very": 1, and "very much": 1. I also want to do the same for three-word phrases: "I love you": 1, "love you very": 1, "you very much": 1.

Could you suggest a good approach (with sample code)?

Thank you!!!

python

All 4 Replies

vegaseat 1,735 DaniWeb's Hypocrite

14 Years Ago

One of the ways to do this would be to use slicing.

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 1 · 2010-06-24T23:46:37+00:00

This is a loop that generates the phrases as lists. Add the counting and join each list back into a string yourself, and add phrase storage and input/output according to your requirements.

sentence = "I love you very much"
words = sentence.split()
words_in_sentence = len(words)
phrases = []

for phraselength in range(2, words_in_sentence):
    # + 1 so the phrase ending on the last word is not skipped
    for startword in range(words_in_sentence - phraselength + 1):
        # slice out one phrase as a list of words
        print(words[startword:startword + phraselength])
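
The counting step left as an exercise above could be sketched with a plain dictionary. This is only one possible approach, not the poster's own code; the function name count_phrases and its parameters are made up for illustration.

```python
def count_phrases(sentence, phrase_length):
    """Return a dict mapping each phrase of phrase_length words to its count."""
    words = sentence.split()
    counts = {}
    # + 1 so the phrase ending on the last word is included
    for start in range(len(words) - phrase_length + 1):
        # join the slice back into a string so it can be a dictionary key
        phrase = " ".join(words[start:start + phrase_length])
        counts[phrase] = counts.get(phrase, 0) + 1
    return counts


print(count_phrases("I love you very much", 2))
print(count_phrases("I love you very much", 3))
```

Calling it once per phrase length gives one dictionary for two-word phrases and another for three-word phrases, which matches the layout asked for in the question.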

heyday21c 0 Newbie Poster · Answer 2 · 2010-06-30T02:42:53+00:00

I have tried to solve this for several days, but I could not figure it out. I am a novice at programming.

My ultimate goal is to count phrases in text files, for instance, how many times the same (or similar) phrases are used in a text.

From your code, I got the phrases, but I have not managed to count them. Could you show me how to count them and save them?

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 3 · 2010-07-01T02:44:21+00:00

This may not be exactly what you want, but you can develop this rather basic code further to suit your needs ...

# group a text into groups of words
# written with Python 3.1, should also work with Python 2.6

def group_text(text, group_size):
    """
    groups a text into text groups set by group_size
    returns a list of grouped strings
    """
    word_list = text.split()
    group_list = []
    for k in range(len(word_list)):
        start = k
        end = k + group_size
        group_slice = word_list[start: end]
        # append only groups of proper length/size
        if len(group_slice) == group_size:
            group_list.append(" ".join(group_slice))
    return group_list
        

text = "I love you very much so very much"

group_size = 2
group_list = group_text(text, group_size)
# convert list to set to avoid duplicates
group_set = set(group_list)

print(group_set)

"""result (word_groups are in hash order in the set) >>>
{'very much', 'you very', 'love you', 'so very', 'I love', 'much so'}
"""

# optionally take the word_groups in the set
# and count them in the list of groups
# (text.count(group) would also match inside longer words)
for group in group_set:
    count = group_list.count(group)
    sf = "'%s' appears %d times in the text"
    print(sf % (group, count))

"""result >>>
'very much' appears 2 times in the text
'you very' appears 1 times in the text
'love you' appears 1 times in the text
'so very' appears 1 times in the text
'I love' appears 1 times in the text
'much so' appears 1 times in the text
"""
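
For the stated goal of counting phrases in a text file and saving the result, a minimal sketch using collections.Counter might look like the following. It assumes Python 2.7 or 3.1 and later (Counter is not available in Python 2.6), splits on whitespace only, and treats punctuation and letter case as part of each word. The names phrase_counter, save_counts, and count_file, and the tab-separated output format, are invented for this example.

```python
from collections import Counter


def phrase_counter(text, phrase_length):
    """Count every run of phrase_length consecutive words in text."""
    words = text.split()
    phrases = (
        " ".join(words[i:i + phrase_length])
        for i in range(len(words) - phrase_length + 1)
    )
    return Counter(phrases)


def save_counts(counts, path):
    """Write one 'phrase<TAB>count' line per phrase, most frequent first."""
    with open(path, "w") as outfile:
        for phrase, count in counts.most_common():
            outfile.write("%s\t%d\n" % (phrase, count))


def count_file(in_path, out_path, phrase_length):
    """Read in_path, count its phrases, and save the counts to out_path."""
    with open(in_path) as infile:
        counts = phrase_counter(infile.read(), phrase_length)
    save_counts(counts, out_path)
    return counts


counts = phrase_counter("I love you very much so very much", 2)
print(counts.most_common(1))  # [('very much', 2)]
```

A call such as count_file("input.txt", "counts.txt", 3) would then write the three-word phrase counts to a file; both file names are placeholders.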
