Apr 17th, 2008

searching 2nd item in each sublist - so close!

Hey guys,

I'm working on a basic search engine and am really close to completion.

I currently have a function that takes a string and compares each word and its synonyms to a webpage.

My output at the moment is [("closeness" percentage of terms to webpage, webpage contents), (x, y), (x, y), ... (x, y)]

I am almost there, but I now need to remove the items that have no match to a site (i.e., where x = 0).

I have found that operator.itemgetter() can isolate just the first item of each sublist, and I then filtered the zeros out of that list with this code:

    import operator

    def Google_search(string):
        # Internet and closeness() come from the earlier parts of the assignment
        percentage_list = []

        for page in Internet:
            closeness_percentage = closeness(string, page)
            percentage_list.append([closeness_percentage, page])

        # itemgetter(1) sorts on the page contents; itemgetter(0) would sort on the percentage
        sorted_list = sorted(percentage_list, key=operator.itemgetter(1), reverse=True)

        # now to delete the ones with zero percentage
        get_percentages = operator.itemgetter(0)
        percentages = [get_percentages(item) for item in sorted_list]
        print(percentages)
        # compare by value (!=), not by identity (is not)
        no_zeros = [x for x in percentages if x != 0]
        print(no_zeros)
        print(sorted_list)

So an example of the output would be
[13, 0, 3, 2, 0, 0, 4, 0, 0, 6, 2, 3, 0, 0]
[13, 3, 2, 4, 6, 2, 3]

This is good. However, deleting the zeros from the percentage-only list does not delete them from the list with the webpages, obviously, as it's a new list!

I have been straining my brain for hours about how to get around this! I think I need to make a loop that compares the 2nd value in each SUBLIST to the values of the original list, then if it's a match return true, then filter the results! But I don't know how to do something like

    for x in range(0, length):
        for y in range(0, no_zeros_length):
            if sorted_list[x].itemgetter(1) == no_zeros:
                return true

Do you guys get what I mean? Or is there a much easier way to omit the zeros from the original list?

Thanks heaps in advance!

P.S. I've attached the file (rename it to .py if you want to use it) so it's easier to understand what's going on, as this is part 4 and each part is dependent on the others before it (I thought it would be too much code for a post)!

or get them here

Python File

As .txt
Attached file: ASSIGNMENT_almost_done.txt (25.0 KB)
Last edited by marceta; Apr 17th, 2008 at 8:53 pm.
Apr 18th, 2008

Re: searching 2nd item in each sublist - so close!

I think you want a dictionary, if I understand correctly. The dictionary is the standard way of mapping one set of items to another.

So you have

mydict = {URL1: 13, URL2: 0, URL3: 3, URL4: 2, ...}

And then you run this bit of code:

    for URL in mydict.copy():
        if mydict[URL] == 0:
            mydict.pop(URL)

and then your list of hot URLs is simply mydict.keys().
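
A minimal sketch of the same idea, assuming made-up URLs and scores in place of the real search results (the names `scores` and `hot_urls` are hypothetical). On Python 2.7 or later, building a new dictionary with a comprehension avoids having to copy the dictionary before removing entries:

```python
# Hypothetical data: page URL -> closeness percentage
scores = {"URL1": 13, "URL2": 0, "URL3": 3, "URL4": 2}

# Keep only the pages that matched at least one search term
hot_urls = {url: pct for url, pct in scores.items() if pct != 0}

# The keys of the filtered dictionary are the matching pages
print(sorted(hot_urls))
```

The original dictionary is left untouched, which can be handy if the full set of scores is needed again later.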

Jeff
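
If the order of the results matters, another option (not from the reply above, just a sketch with hypothetical data) is to filter the [percentage, page] sublists themselves, so each score stays attached to its page. The names `matches` and `ranked` are assumptions for illustration:

```python
# Hypothetical stand-in for sorted_list: [closeness percentage, page contents]
sorted_list = [[3, "page a"], [0, "page b"], [13, "page c"], [0, "page d"]]

# Test the first item of each sublist, but keep the whole sublist
matches = [pair for pair in sorted_list if pair[0] != 0]

# Optionally rank the surviving pages, best match first
ranked = sorted(matches, key=lambda pair: pair[0], reverse=True)
print(ranked)
```

Because the filter works on the pairs directly, there is no second list that has to be matched back against the first.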
jrcagle

This thread is solved


© 2011 DaniWeb® LLC