I know that this is an easy one but it is hard for me to figure out.
I need to make a program that can:
* Open the file and read through all of the lines in the file. (got this)
* If a line starts with "Subject:", skip the line. (got this)
* For all the other lines, split the strings into a list of words
using the white space and find the first word. After you have
found the first word, split that string again into a *new* list of
words using the "/" character. (got this I think with the words=line.split())
* Look at the second word of the *new* list. If it does not contain
"trunk" or "branches", skip the line. (stuck on this)
* For the lines we still have left, look at the first word of the
*new* list, and use that word to index your dictionary of running
counts for each module. (stuck on this)
* At the end of the program, print out your dictionary of counts

What I don't understand is how to split the line a second time to look at the second word that the above guideline is referring to. That is stopping me from even beginning the search for the words 'branches' and 'trunk'. Here is the code that I have written so far:

    file = raw_input('Enter a file name: ')
    try:
        fhand = open(file)
    except:
        'file cannot be opened:', file

    counts=dict()
    for line in fhand:
        if line.startswith('Subject:'):
            continue
        else:
            words=line.split()
            modified=words[0]
            if len(words) == 0 : continue
            modified=modified.split('/')
            x=modified[1]
            if len(modified) == 0 : continue
            if x not in counts:
                counts[x]=1
            else:
                counts[x] = counts[x]+1
               
    lst = list()
    for val, key in counts.items():
        lst.append( (val, key) )
       
    lst.sort()

    for val, key in lst[:] :
        print val, key

I would greatly appreciate any advice. Thank you in advance!

Also, if you can't offer any advice but might know a place where I could read information that explicitly deals with a problem like this, I would really appreciate that too. Thanks again.

I only have a minute, so I can only offer a bit of advice right now.
Don't use "file" as a variable name, because it shadows the built-in "file" type in Python 2. Use something like "file1" instead.

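This line does nothing, because split('/') always returns at least one item, and modified[1] has already been indexed on the line before it: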

if len(modified) == 0 : continue
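
As a rough illustration of that point, here is a small sketch (Python 3 syntax; the sample path is made up and not taken from the real input file). It shows that split with an explicit separator never gives back an empty list, and that a length check only helps if it runs before the index:

```python
# str.split with an explicit separator never returns an empty list,
# so len(...) == 0 can never be true after split('/').
print("".split("/"))         # ['']
print("gnome".split("/"))    # ['gnome']

# Hypothetical sample path, used only for illustration.
modified = "gtk/trunk/README".split("/")

# Check the length first, then index; the other order can raise IndexError.
if len(modified) < 2:
    second = None
else:
    second = modified[1]
print(second)                # trunk
```

The same ordering applies to words = line.split(): with no separator, split() does return an empty list for a blank line, so that length check needs to come before words[0].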

Look at the second word of the *new* list. If it does not contain
"trunk" or "branches", skip the line. (stuck on this)

Use the "in" keyword.

**---  will return a positive for words like "strunk"
found = False
for word in ["trunk", "branches"] :
    if word in words[1]:
        found = True
if not found:
    print "processing this"
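
In case it helps to see where a check like this could sit, below is a minimal sketch of the whole loop, not a definitive solution. It is written in Python 3 syntax, and it assumes the test is meant for the second item of the list produced by split('/') rather than words[1]. The function name count_modules and the sample lines are made up for illustration; the real file format may differ.

```python
def count_modules(lines):
    """Count lines per module, following the steps listed in the question."""
    counts = {}
    for line in lines:
        if line.startswith("Subject:"):
            continue
        words = line.split()
        if not words:                  # blank line: there is no first word
            continue
        parts = words[0].split("/")    # the *new* list
        if len(parts) < 2:             # no second word to test
            continue
        # Substring test, so a word like "strunk" would also pass.
        if "trunk" not in parts[1] and "branches" not in parts[1]:
            continue
        module = parts[0]
        counts[module] = counts.get(module, 0) + 1
    return counts


# Made-up sample lines; the real input is an assumption here.
sample = [
    "Subject: commit log",
    "gtk/trunk/README updated",
    "gtk/branches/2.0/Makefile",
    "glib/tags/1.0/NEWS",
    "",
    "glib/trunk/configure.in",
]
print(count_modules(sample))
```

Taking an iterable of lines instead of a file name keeps the sketch easy to try out; an open file object can be passed in the same way.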

Edited 6 Years Ago by woooee: n/a


woooee's help is really good, but I don't see where the "in" code is supposed to be inserted. I don't see anywhere in my code where I can use a loop that starts with

**--- will return a positive for words like "strunk"
found = False

I am not sure what **--- does and I don't understand where he got the word "strunk". Are these lines that he commented out but I'm not seeing the '#' symbol for some reason?

Thank you.

Edited 6 Years Ago by doctorjo5: n/a

Use the "in" keyword.

**---  will return a positive for words like "strunk"
found = False
for word in ["trunk", "branches"] :
    if word in words[1]:
        found = True
if not found:
    print "processing this"

Wouldn't it make sense to do this the opposite way? i.e.

if words[1] in ["trunk", "branches"]:
    found = True
else:
    found = False

It depends on whether you want to find sub-words or not. The simple example below shows the difference.

words_master = [ "the", "there", "theres"]

##--- look for word in list
word = "there"
if word in words_master:
    print(word, "found")
else:
    print(word, "Not found")

##--- compare each word in the list to the original word
print("-"*30)
for m_word in words_master:
    if m_word in word:
        print("%s found in %s" % (m_word, word))
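
Applied to the words from this thread, the same difference can be sketched as follows. The helper names exact_match and substring_match are made up for illustration, and the test words are invented:

```python
targets = ["trunk", "branches"]


def exact_match(word):
    # True only when the whole word equals one of the targets.
    return word in targets


def substring_match(word):
    # True when any target appears anywhere inside the word.
    return any(t in word for t in targets)


for word in ["trunk", "strunk", "branches-old", "tags"]:
    print(word, exact_match(word), substring_match(word))
```

With the exact test, "strunk" is rejected; with the substring test it is accepted, which is the false positive mentioned earlier in the thread.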