I'm new to programming and am learning with Python. I am trying to write a program that fetches information from a website and displays it. Specifically, I'm writing a stock portfolio program that looks up a given ticker and returns information about it. Reading through the Python documentation, I've figured out the following:
I can open the page of interest using this method:
    import urllib2

    ticker = 'GOOG'
    tickerurl = 'http://finance.yahoo.com/q?s=' + ticker
    pagehtml = urllib2.urlopen(tickerurl)
This returns a file-like object; reading it gives one big string containing the HTML code for the page. I don't know what to do at this point, though, to get the info I want from the page. I could do something like:
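    import urllib2

    ticker = raw_input('Enter the ticker symbol: ')
    # Keep the URL in its own variable so the ticker symbol is not overwritten
    tickerurl = 'http://finance.yahoo.com/q?s=' + ticker
    tickerhtml = urllib2.urlopen(tickerurl)

    htmltext = ''
    for i in tickerhtml.readlines():
        if 'Last' in i:
            htmltext = i  # remember the line that mentions the last trade

    # Drop everything before 'Last'
    quoteindex = htmltext.find('Last')
    htmltext = htmltext[quoteindex:]
    print htmltext[:100]

    # Skip to just past ticker + '">', which immediately precedes the quote
    marker = ticker + '">'
    quoteindex = htmltext.find(marker)
    htmltext = htmltext[quoteindex + len(marker):]
    price = htmltext[:4]  # assumes the quote is exactly four characters long
    print htmltext[:100]
    print ticker, "last traded at", price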
I came up with this by looking at the source code that urlopen returns: I search for the word 'Last' (from 'Last Trade'), and then for the first instance after that of ticker+'">', because those are the characters that immediately precede the quote in the page source. I am sure there must be a better, easier way to do this, though, particularly because the source code probably changes from page to page, so this method may not work for different stocks. And what if I want to pull up other information, like last trade time, volume, market cap, etc.?
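To make the idea concrete, here is a minimal sketch of that search-for-markers approach wrapped in a function. It runs against a made-up HTML snippet rather than the live page, so `sample_html`, the `quote_goog` id, and the helper name `extract_after` are all assumptions for illustration; the real page source may look quite different. It is written so that it should run on both Python 2 and Python 3.

```python
def extract_after(html, label, marker):
    """Return the text that follows `marker`, searching only after `label`.

    Returns None if either the label or the marker is missing.
    """
    label_pos = html.find(label)
    if label_pos == -1:
        return None
    marker_pos = html.find(marker, label_pos)
    if marker_pos == -1:
        return None
    start = marker_pos + len(marker)
    end = html.find('<', start)  # the value runs up to the next tag
    if end == -1:
        end = len(html)
    return html[start:end].strip()


# Made-up markup for illustration only; not copied from the real page.
sample_html = (
    '<tr><th>Last Trade:</th>'
    '<td><span id="quote_goog">529.46</span></td></tr>'
)

ticker = 'GOOG'
price = extract_after(sample_html, 'Last Trade', ticker.lower() + '">')
print(ticker + ' last traded at ' + str(price))
```

Reading up to the next `<` instead of taking a fixed four characters at least copes with quotes of different lengths, but it still depends on the exact markers appearing in the page, which is the fragility I'm worried about.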
Any advice appreciated.