I'm trying to extract the top 10 links from a Yahoo search results page. I can get all the links using the code below, but that could be 70 links.
Any idea how I could get just the top 10 ranked ones, and not the adverts etc.?
i.e. for this page..
Here's the main lump of my code, which returns ALL the links on that page.
Is there even anything that distinguishes the top ten results, so that I could try to extract just those?
    # URLLister (my HTML parser class that collects hrefs into .urls) is defined elsewhere.
    if __name__ == "__main__":
        import urllib

        # Fetch the results page and collect every link on it.
        usock = urllib.urlopen("http://uk.search.yahoo.com/search?p=python&fr=yfp-t-501&ei=UTF-8&meta=vc%3D")
        parser = URLLister()
        parser.feed(usock.read())
        parser.close()
        usock.close()

        path = u"c:\\Users\\admin\\Desktop\\"
        # Download each linked page and save it as test<i>.txt.
        for i, url in enumerate(parser.urls):
            print i
            print url
            page = urllib.urlopen(url).read()
            f = open(path + u"test" + str(i) + u".txt", "w+")
            print >> f, page
            f.close()
            print "Html file successfully printed to file!"
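To make the question more concrete, this is roughly the kind of filtering I have in mind. It is only a sketch, and it assumes the real results are marked by some class attribute on their <a> tags. The class name "result-link", the TopResultLister class and the sample HTML are all made up for illustration; I don't know what Yahoo's markup actually uses. It is written against Python 3's html.parser (on Python 2 the import would be "from HTMLParser import HTMLParser").

```python
from html.parser import HTMLParser


class TopResultLister(HTMLParser):
    """Collect hrefs only from <a> tags that carry a given CSS class."""

    def __init__(self, result_class, limit=10):
        super().__init__()
        self.result_class = result_class  # hypothetical marker class, not Yahoo's real one
        self.limit = limit                # stop collecting after this many links
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a" or len(self.urls) >= self.limit:
            return
        attrs = dict(attrs)
        # A class attribute may hold several space-separated names.
        classes = (attrs.get("class") or "").split()
        href = attrs.get("href")
        if href and self.result_class in classes:
            self.urls.append(href)


# Made-up sample page: one advert, two "real" results, one navigation link.
sample_html = """
<a class="ad" href="http://ads.example.com/">Sponsored</a>
<a class="result-link" href="http://www.python.org/">Python</a>
<a href="http://example.com/help">Help</a>
<a class="result-link title" href="http://docs.python.org/">Docs</a>
"""

parser = TopResultLister("result-link", limit=10)
parser.feed(sample_html)
parser.close()
print(parser.urls)
```

If the results page really does mark its ranked links in some consistent way, the same idea should work with whatever attribute that turns out to be. That is the part I can't figure out.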
Any help appreciated,
thanks guys :)