
searching 2nd item in each sublist - so close!

Hey guys,

I'm working on a basic search engine and am really close to completion.

I currently have a function that takes a string and compares each word and its synonyms to a webpage.

My output at the moment is a list of pairs: [(closeness percentage of the terms to the webpage, webpage contents), (x, y), (x, y), ..., (x, y)]

I am almost there, but I now need to remove the items that have no match to a site (i.e., where x = 0).

I found that operator.itemgetter(0) pulls out just the first item of each pair (the percentage), and I then filtered the zeros out of that list with this code:

import operator

def Google_search(string):
    # Internet (the list of pages) and closeness() are assumed to be
    # defined elsewhere in the attached file
    percentage_list = []

    for x in range(0, len(Internet)):
        closeness_percentage = closeness(string, Internet[x])
        percentage_list.append([closeness_percentage, Internet[x]])

    # note: itemgetter(1) sorts by page contents; itemgetter(0) would sort by percentage
    sorted_list = sorted(percentage_list, key=operator.itemgetter(1), reverse=True)

    ## now to delete the ones with zero percentage
    get_percentages = operator.itemgetter(0)
    percentages = list(map(get_percentages, sorted_list))
    print(percentages)
    # compare by value with != ("is not" tests object identity, not equality)
    no_zeros = [x for x in percentages if x != 0]
    print(no_zeros)
    print(sorted_list)


So an example of the output would be
[13, 0, 3, 2, 0, 0, 4, 0, 0, 6, 2, 3, 0, 0]
[13, 3, 2, 4, 6, 2, 3]

This is good. However, deleting the zeros from the percentage-only list does not delete them from the list that holds the webpages, obviously, as it's a new list!

I have been straining my brain for hours about how to get around this! I think I need to make a loop that compares the 2nd value in each SUBLIST to the values of the original list, then if it's a match return true, then filter the results! But I don't know how to do something like

for x in range(0, length):
     for y in range(0, no_zeros_length):
           # the percentage is item 0 of each sublist, not item 1
           if sorted_list[x][0] == no_zeros[y]:
                   return True


Do you guys get what I mean? Or is there a much easier way to omit the zeros from the original list?

Thanks heaps in advance!

P.S. I've attached the file (rename it to .py if you want to use it) so it's easier to understand what's going on, as this is part 4 and each part is dependent on the ones before it (I thought it would be too much code for a post)!


Attachments ASSIGNMENT_almost_done.txt (25.05KB)
marceta
Posting Whiz in Training
217 posts since May 2004
Reputation Points: 13
Solved Threads: 0
 

I think you want a dictionary, if I understand correctly. The dictionary is the standard way of mapping one set of items to another.

So you have

mydict = {URL1: 13, URL2: 0, URL3: 3, URL4: 2, ...}

And then you run this bit of code:

for URL in mydict.copy():
    if mydict[URL] == 0:
        mydict.pop(URL)

and then your list of hot URLs is simply mydict.keys().
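
To make that concrete, here is a minimal sketch of both routes: filtering the list of [percentage, page] pairs directly so each page stays attached to its score, and building the dictionary with the zero entries already left out. The sample data and the function names drop_zero_matches and to_nonzero_dict are made up for illustration and are not from the attached file. It also assumes the page contents are strings (so they can be dictionary keys) and that no two pages are identical.

```python
# Hypothetical sample data in the same shape as the entries appended
# to percentage_list: [closeness percentage, page contents].
results = [
    [13, "page about python lists"],
    [0, "page about cooking"],
    [3, "page about sorting"],
    [0, "page about gardening"],
]

def drop_zero_matches(pairs):
    """Keep only the entries whose percentage (item 0) is non-zero."""
    return [pair for pair in pairs if pair[0] != 0]

def to_nonzero_dict(pairs):
    """Map page -> percentage, leaving out pages with a zero percentage."""
    return {page: pct for pct, page in pairs if pct != 0}

filtered = drop_zero_matches(results)
hot_pages = to_nonzero_dict(results)

print(filtered)
# Pages ordered from the highest percentage to the lowest.
print(sorted(hot_pages, key=hot_pages.get, reverse=True))
```

Filtering the pairs keeps everything in one list, so there is nothing to match up afterwards. The dictionary version is handy if you later want to look up a page's score directly.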

Jeff

jrcagle
Practically a Master Poster
608 posts since Jul 2006
Reputation Points: 92
Solved Threads: 156
 

This question has already been solved
