How do I do it?

I tried sorting the keys:

keys = dictionary.keys()
keys.sort()
return map(dictionary.get, keys)

But it didn't work.

Here is an example of my code that uses this approach.
Every tutorial I find online tells me this is how I am supposed to sort a dictionary, but it doesn't work. :'(

import string

def read_book():
    f = open("alice_in_wonderland.txt", "r")
    word_list = []
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        word_freq = {}
        for w in book_line.split(" "):
            if w != "":
                word_list.append(w.lower())
                for  w in word_list:
                    word_freq[w] = word_freq.get(w, 0) + 1
                    keys = dictionary.keys()
                    keys.sort()
                    return map(dictionary.get, keys)

Your logic is not right. Write out the algorithm first:

def read_book():
    create word frequency dictionary (only once, not in a loop)
    open the file
    for each line in the file:
        split the line into words
        for each word in the line:
            increment the word's frequency (we don't need a word list)
    sort the frequency dict items (again only once, not in a loop)

The implementation must follow the same logic and the same indentation scheme.

Following your instructions to the best of my ability, here is my new code.
It just returns what looks like a bunch of random numbers:

import string

def read_book():
    word_freq = {}
    f = open("alice_in_wonderland.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        for w in book_line.split(" "):
            if w != "":
                word_freq[w] = word_freq.get(w, 0) + 1
                keys = word_freq.keys()
    keys.sort()
    return map(word_freq.get, keys)
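
A likely reason for the "random numbers": map(word_freq.get, keys) looks up the count for each key and then discards the key, so what comes back is a list of counts ordered alphabetically by words that are no longer shown. Here is a minimal sketch of that effect. The three-word dictionary is made-up sample data, and the code assumes Python 3, where keys() has no .sort() method and sorted() is used instead.

```python
# Hypothetical sample data standing in for the real word counts.
word_freq = {"rabbit": 3, "alice": 7, "queen": 2}

keys = sorted(word_freq)                      # alphabetical order of the words
counts_only = list(map(word_freq.get, keys))  # counts, with the words thrown away
print(counts_only)

pairs = [(k, word_freq[k]) for k in keys]     # keep each word next to its count
print(pairs)
```

Returning the (word, count) pairs instead of the bare counts makes the result readable, and the pairs can then be sorted by count rather than by word.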

Here we sort by frequency in descending order and show the ten most common words:

import string

def read_book():
    word_freq = {}
    f = open("alice.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation).lower()
        for w in book_line.split(" "):
            if w != "":
                word_freq[w] = word_freq.get(w, 0) + 1
    return sorted(word_freq.items(), reverse=True, key=lambda x: x[1])

print read_book()[:10]
"""Output:
[('the', 1591), ('and', 827), ('to', 713), ('a', 624), ('she', 529), ('it', 526), ('of', 492), ('said', 462), ('i', 400), ('alice', 385)]
"""
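
The same counting can also be sketched with collections.Counter from the standard library, whose most_common(n) method returns the n highest counts in descending order. This is an alternative, not what the post above does. The function name count_words and the sample lines are hypothetical, and the code assumes Python 3, where str.translate(None, ...) is not available and a table built with str.maketrans is used instead.

```python
import string
from collections import Counter

# Translation table that deletes every ASCII punctuation character.
STRIP_PUNCT = str.maketrans("", "", string.punctuation)

def count_words(lines):
    """Count lowercased words in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        cleaned = line.translate(STRIP_PUNCT).lower()
        counts.update(cleaned.split())  # split() with no argument skips empty strings
    return counts

# Made-up sample text, not taken from the book.
sample = ["The cat saw the dog.", "The dog ran!"]
print(count_words(sample).most_common(2))
```

Because split() is called without an argument, the explicit check for empty strings is not needed in this version.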

Thanks Tony.
This is my code using a modified version of your return sorted() statement.

import string

def read_book():
    word_freq = {}
    f = open("alice_in_wonderland.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        for w in book_line.split(" "):
            if w != "":
                w.lower()
                word_freq[w] = word_freq.get(w, 0) + 1
    return sorted(word_freq.items(), key=lambda x: x[1])[-20:-1]

It works... sort of. Mine says "and" is the most frequently occurring word (774 times).
That is impossible since we are using the same book, but I think I know why.
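
One thing that contributes to this is the slice. The list is sorted in ascending order, and [-20:-1] stops before the last element, so the single most frequent word is left out of the result. A small sketch with made-up counts (the ranked list is hypothetical sample data):

```python
# Hypothetical counts, already sorted in ascending order.
ranked = [("of", 4), ("to", 5), ("and", 8), ("the", 15)]

print(ranked[-3:-1])     # stops before the last item, so ("the", 15) is dropped
print(ranked[-3:])       # last three items, still in ascending order
print(ranked[::-1][:3])  # reversed first, then the top three
```

Leaving the end of the slice empty, as in [-3:], includes the final element.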

Are you sure you do not need the .lower() at line 7? It does change the result.

Without lower:

[('the', 1476), ('and', 757), ('to', 709), ('a', 607), ('she', 490), ('it', 482), ('of', 477), ('said', 456), ('I', 400), ('Alice', 385)]

I put it in line 9 because that's right before I use w in word_freq. It made sense to me.
I think it worked.

Here is my final code:

import string

def read_book():
    word_freq = {}
    f = open("alice_in_wonderland.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        for w in book_line.split(" "):
            if w != "":
                w.lower()
                word_freq[w] = word_freq.get(w, 0) + 1
    return sorted(word_freq.items(), key=lambda x: x[1])[::-1][0:20]

Oh, yes, you have w.lower() at line 10, but it does nothing because the result is not saved anywhere, so you still have a case-sensitive count.
I would also write line 9 as

if w:
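
Both points can be checked in a few lines. Python strings are immutable, so lower() returns a new string and leaves the original alone, and an empty string is falsy, which is why a bare truth test can replace the comparison with "". The variable names below are only for illustration.

```python
w = "Alice"

w.lower()            # builds a new lowercase string and discards it
unchanged = w        # w still holds the original text
print(unchanged)

w = w.lower()        # rebinding the name keeps the lowercase version
print(w)

# An empty string is falsy, so `if w:` behaves like `if w != "":` for strings.
print(bool(""), bool("alice"))
```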

Ahh yes you are right.

I put .lower() in line 10.
For some reason I get higher counts than you.

import string

def read_book():
    word_freq = {}
    f = open("alice_in_wonderland.txt", "r")
    for l in f.readlines()[10:3340]:
        book_line = l.strip().translate(None, string.punctuation)
        for w in book_line.split(" "):
            if w != "":
                word_freq[w.lower()] = word_freq.get(w.lower(), 0) + 1
    return sorted(word_freq.items(), key=lambda x: x[1])[::-1][0:20]

Output:

[('the', 1629), ('and', 844), ('to', 721), ('a', 627), ('she', 537), ('it', 526), ('of', 508), ('said', 462), ('i', 399), ('alice', 385), ('in', 365), ('you', 360), ('was', 357), ('that', 276), ('as', 262), ('her', 248), ('at', 209), ('on', 193), ('with', 180), ('all', 179)]

Your file must be missing some of the license text. If I remove all of it and count every word in the whole file, I get the same frequencies as you.

You can't sort a dictionary itself, because it is in no particular order, but you can read from it in sorted order.

# The name dict is avoided here because it would shadow the built-in type.
numbers = {3: "three", 1: "one", 2: "two"}
for key in sorted(numbers):  # iterating a dict yields its keys
    print(numbers[key])
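
If the keys and values are both needed, sorted() can be applied to the items instead. This is a minimal sketch with a hypothetical sample dictionary. It relies on tuples comparing element by element, so plain sorted() orders the pairs by key, while a key function picks the value.

```python
# Hypothetical sample dictionary.
numbers = {3: "three", 1: "one", 2: "two"}

by_key = sorted(numbers.items())                          # pairs ordered by key
by_value = sorted(numbers.items(), key=lambda kv: kv[1])  # pairs ordered by value

print(by_key)
print(by_value)
```

Either call returns a new list of (key, value) tuples and leaves the dictionary untouched.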

This thread is off the hook. Is this about sorting a dict, or about finding some words and their locations, then printing and ordering them ????

???? ;)

So is this solved?
