tuples exercise, most frequent letters

Question

vlady 0 Junior Poster in Training

13 Years Ago

Hello,

I am working on the following exercise:
Write a function called most_frequent that takes a string and prints the letters in decreasing order of frequency. Find text samples from several different languages and see how letter frequency varies between languages. Compare your results with the tables at wikipedia.org/wiki/Letter_frequencies.
I tried my best, but I am not sure whether my output matches the required result, since the exercise provides no answer to check against (so I cannot tell whether my method deviates much from the intended one).
The result might look like the list below, and if so, I don't know how to remove the duplicate tuples. Can you please help me?
[(2, 'o'), (2, 'n'), (2, 'i'), (2, 'a'), (1, 'v'), (1, 's'), (1, 'p'), ...]
My result looks like this:
['o', 'n', 'i', 'a', 'v', 's', 'p', 'k', 'j', 'e']
but it does not show the frequencies...

What would be the most appropriate result?

Thank you very much!

Vlady

def most_frequent(s):
    t=s.split()
    delimiter= ''
    s=delimiter.join(t)
    l=list(s) #['j', 'a', 'n', 'k', 'o', 's', 'i', 'e', 'n', 'a', 'p', 'i', 'v', 'o']
    f=[]
    for i in l:
        f.append(l.count(i)) # [1, 2, 2, 1, 2, 1, 2, 1, 2, 2, 1, 2, 1, 2]
    tup=zip(f,l)
    tup.sort(reverse=True)
    res=[]
    for freq,letter in tup:
        if letter not in res:
            res.append(letter)
    print res # ['o', 'n', 'i', 'a', 'v', 's', 'p', 'k', 'j', 'e']


def main():
    s='janko sie na pivo'
    most_frequent(s)

if __name__ == '__main__':
    main()

python

vegaseat 1,735 DaniWeb's Hypocrite

13 Years Ago

The nice thing about using a list of (char_frequency, char) tuples is that they also sort the characters that have matching frequencies ...

# create a list of (char_frequency, char) tuples

import pprint

text = "supercalifragilisticexpialidocious"

# create a character list of the text
ch_list = list(text)

# create a list of (letter_freq, letter) tuples
# set(ch_list) creates a set of unique characters
# c.isalpha() is True for letters only
ltc = [(ch_list.count(c), c) for c in set(ch_list) if c.isalpha()]

# sort by increasing frequency
# also sorts the letters with matching frequencies
ltc.sort()

# pretty print the result
pprint.pprint(ltc)

''' my result ...
[(1, 'd'),
 (1, 'f'),
 (1, 'g'),
 (1, 't'),
 (1, 'x'),
 (2, 'e'),
 (2, 'o'),
 (2, 'p'),
 (2, 'r'),
 (2, 'u'),
 (3, 'a'),
 (3, 'c'),
 (3, 'l'),
 (3, 's'),
 (7, 'i')]
'''
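The exercise asks for decreasing order, while the snippet above sorts in increasing order. A minimal sketch of one way to adapt it, assuming Python 3; the function wrapper, the lowercasing step, and the alphabetical tie-break are assumptions and not part of the snippet above:

```python
def most_frequent(text):
    """Print the letters of text in decreasing order of frequency."""
    # assumption: fold case so 'A' and 'a' are counted together
    letters = [c for c in text.lower() if c.isalpha()]
    # one (count, letter) tuple per distinct letter
    pairs = [(letters.count(c), c) for c in set(letters)]
    # highest count first; letters with equal counts in alphabetical order
    pairs.sort(key=lambda pair: (-pair[0], pair[1]))
    for count, letter in pairs:
        print(letter, count)
    return pairs


result = most_frequent("supercalifragilisticexpialidocious")
```

Negating the count in the sort key keeps the letters within each frequency group in ascending order, which a plain `sort(reverse=True)` would flip.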

vlady commented: difficult one! :-) +3

snippsat 661 Master Poster

13 Years Ago

Just a note that may help.
New in Python 2.7 is collections.Counter.

>>> from collections import Counter
>>> text = ['j', 'a', 'n', 'k', 'o', 's', 'i', 'e', 'n', 'a', 'p', 'i', 'v', 'o']
>>> Counter(text)
Counter({'a': 2, 'i': 2, 'o': 2, 'n': 2, 'e': 1, 'k': 1, 'j': 1, 'p': 1, 's': 1, 'v': 1})

It also has a most_common feature.

>>> Counter(text).most_common(5)
[('a', 2), ('i', 2), ('o', 2), ('n', 2), ('e', 1)]
>>>
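A minimal sketch of the whole exercise built on `collections.Counter`, assuming Python 3. The `isalpha()` filter, the lowercasing, and the explicit sort are assumptions added for illustration. The order in which `most_common` returns letters with equal counts can differ between Python versions, so this sketch sorts ties alphabetically itself:

```python
from collections import Counter


def most_frequent(text):
    """Return (letter, count) pairs in decreasing order of frequency."""
    # count letters only, ignoring spaces, digits and punctuation
    counts = Counter(c for c in text.lower() if c.isalpha())
    # highest count first; equal counts in alphabetical order
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


for letter, count in most_frequent('janko sie na pivo'):
    print(letter, count)
```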

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 1 · 2011-06-20T21:35:29+00:00

I would include the letter in the tuple appended to the list at line 8 (the f.append(...) call), or loop over the (count, letter) tuples prepared by the zip function. I would definitely not use l as a variable name; use full, understandable words.
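A minimal sketch of the first suggestion, assuming Python 3: the letter is stored together with its count at the moment of appending, and a pair is skipped if it is already in the list. The names `letter_counts`, `letters`, and `pairs` are hypothetical and chosen for illustration only:

```python
def letter_counts(text):
    """Return unique (count, letter) tuples, highest count first."""
    # drop all whitespace, then split into single characters
    letters = list(''.join(text.split()))
    pairs = []
    for letter in letters:
        pair = (letters.count(letter), letter)
        # append each (count, letter) pair only once
        if pair not in pairs:
            pairs.append(pair)
    pairs.sort(reverse=True)
    return pairs


print(letter_counts('janko sie na pivo'))
```

For the sample sentence this produces the list of tuples the original question asked for, starting with (2, 'o'), (2, 'n'), (2, 'i'), (2, 'a').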

woooee 814 Nearly a Posting Maven · Answer 2 · 2011-06-20T21:58:10+00:00

Why are you going through the following gyrations instead of using "t"? Also, please do not use "i", "l", or "O" as variable names, as they are easily confused with the digits 1 and 0.

t=s.split()
delimiter= ''
s=delimiter.join(t)
l=list(s)
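One possible reading of this remark, sketched under the assumption that "using t" means iterating over the words in t directly rather than joining them back into a string first. The name `letters` is hypothetical:

```python
s = 'janko sie na pivo'
t = s.split()   # ['janko', 'sie', 'na', 'pivo']

# walk each word in t and collect its characters,
# with no intermediate join back into a string
letters = [letter for word in t for letter in word]

print(letters)
```

Both routes give the same list of characters; the comprehension just avoids the intermediate delimiter and rejoined string.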

woooee 814 Nearly a Posting Maven · Answer 3 · 2011-06-21T22:21:12+00:00

The original problem is that the OP is iterating through the list

l=list(s) #['j', 'a', 'n', 'k', 'o', 's', 'i', 'e', 'n', 'a', 'p', 'i', 'v', 'o']
f=[]
for i in l:   ## <-- iterating through the input list
    f.append(l.count(i)) # [1, 2, 2, 1, 2, 1

instead of through a list of letters, which would eliminate the duplicate problem.

import string

l=list(s) #['j', 'a', 'n', 'k', 'o', 's', 'i', 'e', 'n', 'a', 'p', 'i', 'v', 'o']
f=[]
##for i in l:
for i in string.lowercase:   ## "abcdef...etc" --> count each letter once (string.ascii_lowercase in Python 3)
    f.append(l.count(i)) # [2, 0, 0, 0, 1, 0, ...

But since this is homework, no one wants to give out a complete solution.
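A minimal sketch of the alphabet-driven loop, assuming Python 3, where the constant is `string.ascii_lowercase`. Skipping zero counts and pairing each count with its letter are assumptions added so the output stays readable:

```python
import string


def letter_counts(text):
    """Count each letter of the ASCII alphabet once; return (count, letter) tuples."""
    letters = list(text.lower())
    pairs = []
    for letter in string.ascii_lowercase:   # every letter is visited exactly once
        count = letters.count(letter)
        if count:                           # leave out letters that never occur
            pairs.append((count, letter))
    pairs.sort(reverse=True)                # highest count first
    return pairs


print(letter_counts('janko sie na pivo'))
```

Because the loop only covers a to z, accented letters such as those in Slovak or French text would be ignored, which matters when comparing languages as the exercise suggests.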

vlady 0 Junior Poster in Training · Answer 4 · 2011-06-22T15:32:04+00:00

Thank you, all of you! Here is my version.

def most_frequent(s):
    t=s.split()
    delimiter= ''
    s=delimiter.join(t) # jankosielnapivo
    slabiky=list(s) #['j', 'a', 'n', 'k', 'o', 's', 'i', 'e', 'l', 'n', 'a', 'p', 'i', 'v', 'o']
    frekvencia=[]
    for i in slabiky:
        frekvencia.append(slabiky.count(i)) # [1, 2, 2, 1, 2, 1, 2, 1, 1, 2, 2, 1, 2, 1, 2]
    tup=zip(slabiky,frekvencia)

    vysledok=[]
    for i in tup:
        if i not in vysledok:
            vysledok.append(i)
    vysledok.sort(reverse=False)
    return vysledok

def main():
    s='janko siel na pivo'
    print most_frequent(s)

if __name__ == '__main__':
    main()
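The version above returns unique (letter, count) tuples sorted alphabetically by letter. A minimal sketch, assuming Python 3, of two follow-up steps: re-sorting by decreasing frequency as the exercise asks, and converting counts to percentages for comparison with the Wikipedia tables. The helper names `by_frequency` and `as_percentages` are hypothetical, and `pairs` is the list the function above returns for 'janko siel na pivo':

```python
def by_frequency(pairs):
    """Sort (letter, count) pairs by decreasing count, then alphabetically."""
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


def as_percentages(pairs):
    """Convert (letter, count) pairs to (letter, percent of all letters)."""
    total = sum(count for letter, count in pairs)
    return [(letter, 100.0 * count / total) for letter, count in pairs]


pairs = [('a', 2), ('e', 1), ('i', 2), ('j', 1), ('k', 1), ('l', 1),
         ('n', 2), ('o', 2), ('p', 1), ('s', 1), ('v', 1)]

for letter, percent in as_percentages(by_frequency(pairs)):
    print('%s %.1f%%' % (letter, percent))
```

A sample this short says little about a language; the percentages only become comparable with published tables on much longer texts.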