Hi guys, I'm trying to convert the words that contain capital letters to lowercase, then count the number of times each word appears in the text.

I managed to count the number of times a word appears, but now I'm having trouble converting the uppercase letters to lowercase in all words.

Here is my code:

words = {}
file = open ('input.txt', 'r')                    
for line in file:                             
    wordlist = line.split()

    for word in wordlist:

        if words.upper(word):
            words[word]=words.lower(word)   # <<< not working for conversion, but not sure if it's correct

        if words.has_key(word):
            words[word]=words[word]+1
        else:
            words[word] = 1       
for word in words:
        print word, words[word]

Thank you.

Last Post by jvignacio

How about this:

from collections import defaultdict
words = defaultdict(int)
for word in (w.lower() for w in open("input.txt").read().split()):
    words[word] += 1
for word, cnt in sorted(words.items()):
    print word, cnt

Thanks, mate! There's an error on the first line,
"from collections import defaultdict":

", line 1, in ?
from collections import defaultdict
ImportError: cannot import name defaultdict

Any ideas?


Well, defaultdict is new in Python 2.5. Maybe your Python is 2.4?
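
If upgrading isn't an option, the same count can probably be done with a plain dict, which avoids the defaultdict import altogether. This is only a minimal sketch: `count_words` is a hypothetical helper name (not something from this thread), and the sample text is inlined so the snippet runs without input.txt.

```python
def count_words(text):
    """Return a dict mapping each lowercased word to its number of occurrences."""
    counts = {}
    for word in text.split():
        # lower() is a string method, so it is called on the word, not on the dict
        word = word.lower()
        # dict.get returns 0 when the word has not been seen yet
        counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    # Inline sample text (an assumption for the demo) instead of reading input.txt
    sample = "The cat saw the Cat"
    for word, cnt in sorted(count_words(sample).items()):
        print("%s %d" % (word, cnt))
```

Calling lower() on every word also means no separate "is it uppercase?" check is needed, since lowercasing an already lowercase word leaves it unchanged.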

Yes, that, and it also says that "print word, cnt" on the last line is invalid syntax? Very weird.


Sorry, I was using 3.0. I went down to 2.5 and it worked. Silly versions always change things.
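
For anyone staying on Python 3: there `print` is a function, which is why the statement form `print word, cnt` is rejected as invalid syntax. Below is a minimal sketch of the same count written for Python 3. The helper name `count_words_py3` and the inline sample text are assumptions made for illustration.

```python
from collections import defaultdict


def count_words_py3(text):
    """Count lowercased, whitespace-separated words in text."""
    words = defaultdict(int)
    for word in (w.lower() for w in text.split()):
        words[word] += 1
    return words


if __name__ == "__main__":
    # Inline sample text (an assumption) so the sketch runs without input.txt
    counts = count_words_py3("Spam spam eggs")
    for word, cnt in sorted(counts.items()):
        print(word, cnt)  # print is called like a function in Python 3
```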

Are you able to explain, with little comments beside each line, what they do? If you can't, I understand. Much appreciated.


Ok

from collections import defaultdict
# create a defaultdict (a dictionary with a default constructor for missing keys)
# see http://docs.python.org/dev/library/collections.html#id3
words = defaultdict(int)
# open("input.txt").read() returns the whole content of the file as a single string
# the .split() method cuts this string on whitespace, returning a list of the
# non-whitespace blocks (the words)
# w.lower() returns the word w with all letters in lowercase
# (w.lower() for w in ...) is an iterator over all words in the file, in lowercase
for word in (w.lower() for w in open("input.txt").read().split()):
    # add 1 to the count of this word in the dictionary words
    # if the word isn't already there, defaultdict creates an initial value by
    # calling int() which returns 0
    words[word] += 1
# words.items() is a list of all pairs (key, value) in the dictionary
# sorted( theList) returns a new list with the same items, but sorted
for word, cnt in sorted(words.items()):
    print word, cnt

I hope it explains a little :)
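
The defaultdict behaviour described in the comments can be checked in isolation. This is just a small sketch in Python 3 syntax, and the keys are made up for the demonstration.

```python
from collections import defaultdict

counts = defaultdict(int)

# Reading a missing key calls int(), stores the result (0) and returns it.
first_lookup = counts["apple"]

# += on a missing key therefore starts from 0.
counts["pear"] += 1
counts["pear"] += 1

# A plain dict raises KeyError for the same kind of access.
plain = {}
try:
    plain["apple"]
    plain_raised = False
except KeyError:
    plain_raised = True

print(first_lookup, dict(counts), plain_raised)
```

Note that merely looking up a missing key inserts it, so "apple" ends up in the dictionary with a count of 0.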


Appreciate it, man! I understand it now.
I'm trying to view it on an HTML page now.

This is my code so far:

from collections import defaultdict
outfile = open("test.html", "w")
words = defaultdict(int)

print >>outfile, """<html>
<head>
<title>Words &amp; frequency table</title>
</head>
<body>
<table border="1">"""


print >>outfile, "<tr><th>Words</th><th>Frequency</th></tr>"


for word in (wordz.lower() for wordz in open("input.txt").read().split()):
    words[word] = words[word] + 1
for word, cnt in sorted(words.items()):
    print >> outfile, "<tr><td>",word,"</td><td>", cnt,"</td></tr>"

print >>outfile, "</table></body></html>"
# close the file so everything is written to disk
outfile.close()

But it's outputting word and cnt several times. Any ideas? Cheers.
EDIT: I just fixed it! It's the code above. Thanks anyway :)
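
One thing the script above does not handle is a word containing characters such as & or <, which would end up as broken markup. Here is a hedged sketch in Python 3 that builds the table as a string and escapes each word with the standard library's `html.escape`. `build_table` is a hypothetical helper name, and `Counter` is swapped in for defaultdict purely for brevity.

```python
from collections import Counter
from html import escape


def build_table(text):
    """Return an HTML table of lowercased word frequencies, sorted by word."""
    counts = Counter(w.lower() for w in text.split())
    rows = ["<tr><th>Words</th><th>Frequency</th></tr>"]
    for word, cnt in sorted(counts.items()):
        # escape() turns &, < and > into entities so a word cannot break the table
        rows.append("<tr><td>%s</td><td>%d</td></tr>" % (escape(word), cnt))
    return '<table border="1">\n%s\n</table>' % "\n".join(rows)


if __name__ == "__main__":
    # Inline sample text (an assumption) instead of reading input.txt
    print(build_table("Fish & chips & FISH"))
```

Returning the markup as a string keeps the counting and the file writing separate, so the result can be written to test.html or inspected directly.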
