searching 2nd item in each sublist - so close!
Thread Solved |
Join Date: May 2004
Posts: 217
Reputation:
Solved Threads: 0
Hey guys,
I'm working on a basic search engine and am really close to completion.
I currently have a function that takes a string and compares each word and its synonyms to a webpage.
My output at the moment is [("closeness" percentage of terms to webpage, webpage contents), (x, y), (x, y), ... (x, y)].
I am almost there, but I now need to remove the items that have no match to a site (i.e., where x = 0).
I found that the itemgetter() function isolates just the first values; I then filtered the zeros out of that list with this code:
    import operator

    # Internet and closeness() are defined in the earlier parts (see the attached file)
    def Google_search(string):
        internet_length = len(Internet)
        percentage_list = []
        for x in range(0, internet_length):
            position = x
            closeness_percentage = closeness(string, Internet[x])
            percentage_list.append([closeness_percentage, Internet[position]])
        sorted_list = sorted(percentage_list, key=operator.itemgetter(1), reverse = True)
        ## print sorted_list
        ## now to delete the ones with zero percentage
        get_percentages = operator.itemgetter(0)
        percentages = map(get_percentages, sorted_list)
        print percentages
        # compare by value (!=); "is not 0" tests object identity
        no_zeros = [x for x in percentages if x != 0]
        print no_zeros
        print sorted_list
So an example of the output would be
[13, 0, 3, 2, 0, 0, 4, 0, 0, 6, 2, 3, 0, 0]
[13, 3, 2, 4, 6, 2, 3]
This is good; however, deleting the zeros from the percentage-only list does not delete them from the list with the webpages, obviously, as it's a new list!
I have been straining my brain for hours about how to get around this! I think I need to make a loop that compares the 2nd value in each SUBLIST to the values of the original list, then returns true if it's a match, and then filter the results! But I don't know how to do something like
    for x in range(0, length):
        for y in range(0, no_zeros_length):
            if sorted_list[x].itemgetter(1) == no_zeros:
                return true
Do you guys get what I mean? Or is there a much easier way to omit the zeros from the original list?
Thanks heaps in advance!
P.S. I've attached the file (rename to .py if you want to use it) so it's easier to understand what's going on, as this is part 4 and each part is dependent on the ones before it (I thought it would be too much code for a post)!
or get them here
Python File
As .txt
Last edited by marceta; Apr 17th, 2008 at 8:53 pm.
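For what it's worth, the zeros can also be dropped from the paired list directly, so the percentages and the webpages never get separated. This is only a minimal sketch, assuming each entry of sorted_list is a [percentage, page] pair as in the post; the sample data below is made up for illustration.

```python
# Hypothetical sample data in the same [percentage, page] shape as sorted_list.
sorted_list = [
    [13, "page A"],
    [0, "page B"],
    [3, "page C"],
    [0, "page D"],
    [2, "page E"],
]

# Keep only the pairs whose percentage (item 0) is non-zero.
# Filtering whole pairs keeps each percentage attached to its page.
no_zero_pairs = [pair for pair in sorted_list if pair[0] != 0]

print(no_zero_pairs)
```

Because the filter runs over the pairs themselves, there is no second list to match back against the original.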
Join Date: Jul 2006
Posts: 608
Reputation:
Solved Threads: 150
I think you want a dictionary, if I understand correctly. The dictionary is the standard way of mapping one set of items to another.
So you have
mydict = {URL1: 13, URL2: 0, URL3: 3, URL4: 2, ...}
And then you run this bit of code:
    for URL in mydict.copy():
        if mydict[URL] == 0:
            mydict.pop(URL)
and then your list of hot URLs is simply mydict.keys().
Jeff
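The dictionary idea above could be sketched as follows. This is one possible variant, not necessarily what Jeff had in mind: it builds a new dictionary instead of popping from a copy, and it adds a sort step because the original search ranks pages by percentage. The URL strings and the names hot and ranked are placeholders introduced for illustration.

```python
# Hypothetical URL -> percentage mapping; the URL strings are placeholders.
mydict = {"URL1": 13, "URL2": 0, "URL3": 3, "URL4": 2}

# Build a new dict holding only the non-zero entries; mydict is left untouched.
hot = {url: pct for url, pct in mydict.items() if pct != 0}

# Rank the remaining URLs from highest to lowest percentage.
ranked = sorted(hot, key=hot.get, reverse=True)

print(hot)
print(ranked)
```

The explicit sort is there so the ranking does not depend on the order in which the dictionary happens to store its keys.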