Hi everyone,

I'm working on a homework assignment and I'm brand new to Python (I'm a better C++ coder). Anyway, I've used help from classmates and Google to put together this code chunk, and I need to know how I can turn it into a loop so I don't have 26 code chunks.

This code chunk reads a bunch of text from a file, looks at each word, determines if it begins with a given letter (a, b, etc.), and then prints out how many distinct words do.

So basically I'm looking for the number of words in the text that begin with each letter.

Thanks for your help!

import sys
result = []
f = open("sample_text.txt")
for line in f:
	for word in line.split():
		if word.startswith('a'):
			result.append(word)
result_length = len(set(result))
print "\nTotal DISTINCT words starting with 'a': ", result_length

import sys
result = []
f = open("sample_text.txt")
for line in f:
	for word in line.split():
		if word.startswith('b'):
			result.append(word)
result_length = len(set(result))
print "Total DISTINCT words starting with 'b': ", result_length

And so on down to 'z'...

import sys
result = []
f = open("sample_text.txt")
for line in f:
	for word in line.split():
		if word.startswith('z'):
			result.append(word)
result_length = len(set(result))
print "Total DISTINCT words starting with 'z': ", result_length

I have this down 26 times - but obviously that just doesn't work and is bad coding. Any help is greatly appreciated!

Thanks!

Indentation should be 4 spaces. It works with 2 or 8, but never use anything other than 4.

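A simple fix: move line 8 (the result_length line) so it gets its input from inside the loop, and remove set. Note that without set the count includes repeated words.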

import sys
result = []
f = open("sample_text.txt")
for line in f:
	for word in line.split():
		if word.startswith('h'):
			result.append(word)
			result_length = len(result)
print "Total DISTINCT words starting with 'h': ", result_length

Here is the script with the right indentation, and some print statements to see what happens.

My sample_text.txt:

hi this is an test.
hi again why not.
hi number 3.

result = []
f = open("sample_text.txt")
for line in f:
    for word in line.split():
        if word.startswith('h'):
            result.append(word)
            print result  # print as a debugging aid, to see what happens
            print len(result)  # print as a debugging aid, to see what happens
print "Total DISTINCT words starting with '%s' is %d" % (result[0][0], len(result))

'''
my output-->
['hi']
1
['hi', 'hi']
2
['hi', 'hi', 'hi']
3
Total DISTINCT words starting with 'h' is 3
'''
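To see what removing set changes, here is a small sketch that prints both counts side by side. The list named lines is only a stand-in I made up for the sample_text.txt shown above, and the print call is written so it should run the same on Python 2 and Python 3.

```python
# Assumption: this list stands in for the lines of sample_text.txt above.
lines = ["hi this is an test.", "hi again why not.", "hi number 3."]

result = []
for line in lines:
    for word in line.split():
        if word.startswith('h'):
            result.append(word)

total_count = len(result)          # counts every occurrence
distinct_count = len(set(result))  # counts each different word once
print("total: %d, distinct: %d" % (total_count, distinct_count))
```

With this sample the total is 3 and the distinct count is 1, since 'hi' is the only word starting with 'h' and it appears three times.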

Edited 7 Years Ago by snippsat: n/a

Excellent! Thanks!

But is there any way to work the whole alphabet into the

if word.startswith('x')

statement (replacing the x with a through z) so I don't have to have 26 separate statements? I need to find the number of words that begin with every letter.

How would I incorporate that?

A dictionary with the letter as the key is the straightforward way to do this. Otherwise, write a function that takes the letter and the text, and call it from a for loop over the alphabet.

import string

letter_dict = {}
for letter in string.ascii_uppercase:
    letter_dict[letter] = 0

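# A list of strings stands in for the open file here
f = [ "abc def\n",
      "ghi abc def\n",
      "mno stuvw abc\n" ]

for line in f:
    for word in line.strip().split():
        letter = word[0].upper()
        # skip words that start with a digit or punctuation
        if letter in letter_dict:
            letter_dict[letter] += 1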

for letter in string.ascii_uppercase:
    print "letter %s = %d" % (letter, letter_dict[letter])
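The other option mentioned above, a function called from a for loop, could look roughly like this. It is only a sketch: the function name count_starting_with and the variable lines are made up for illustration, the list again stands in for the file, and it returns both the total and the distinct count since the original question asked for distinct words.

```python
import string

def count_starting_with(letter, lines):
    """Return (total, distinct) counts of words that start with letter."""
    matches = []
    for line in lines:
        for word in line.split():
            # compare in lower case so 'Abc' and 'abc' count as the same word
            if word.lower().startswith(letter):
                matches.append(word.lower())
    return len(matches), len(set(matches))

# Assumption: a list of strings stands in for the open file.
lines = ["abc def\n", "ghi abc def\n", "mno stuvw abc\n"]

for letter in string.ascii_lowercase:
    total, distinct = count_starting_with(letter, lines)
    if total:  # only report letters that actually occur
        print("letter %s: total = %d, distinct = %d" % (letter, total, distinct))
```

This rescans the text once per letter, so the dictionary version is the better fit for a big file, but the function is easy to follow if you are coming from the 26 copied chunks.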

Edited 7 Years Ago by woooee: n/a

Thanks buddy!

That is some code I would never have been able to write in 10 years. Very nice. Thanks so much for your help :)
