Hi guys,

I'm new to programming and am learning with Python. I am trying to write a program that will get information from a website and display it. Specifically, I'm writing a stock portfolio program that will look up a specific ticker and return information about it. Reading through the Python documentation, I've figured out the following:

I can open the page of interest using this method:

import urllib2
ticker='GOOG'
tickerurl='http://finance.yahoo.com/q?s='+ticker
pagehtml=urllib2.urlopen(tickerurl).read()  # urlopen gives a file-like object; read() returns the HTML as a string

This returns a big string with the HTML code for the page. I don't know what to do at this point, though, to get the info I want from the page. I could do something like:

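import urllib2

ticker=raw_input('Enter the ticker symbol: ')
# keep the URL in its own variable so ticker still holds the symbol
tickerurl='http://finance.yahoo.com/q?s='+ticker
tickerhtml=urllib2.urlopen(tickerurl)
htmltext=''
for i in tickerhtml.readlines():
        print i  # debug: show every line of the page
        if 'Last' in i:
                htmltext=i  # remember the last line that mentions 'Last'

# cut the line down so it starts at 'Last'
quoteindex=htmltext.find('Last')
htmltext=htmltext[quoteindex:]
print htmltext[:100]

# skip past ticker+'">', which sits right before the quote
marker=ticker+'">'
quoteindex=htmltext.find(marker)+len(marker)
htmltext=htmltext[quoteindex:]
price=htmltext[:4]
print htmltext[:100]

print ticker, " last traded at ",price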

I came up with this by looking at the source code that urlopen returns, and essentially searching for the words 'Last Trade' (the code just looks for 'Last'), and then searching for the first instance after that of ticker+'">', because those are the characters that immediately precede the quote on the source page. I am sure that there must be a better, easier way to do this, though, particularly because the source code probably changes from page to page and this method may not work for different stocks. And what if I want to pull up other information, like last trade time, volume, market cap, etc.?
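For what it's worth, the same find-and-slice idea can be wrapped in a small helper that reads up to the next '<' instead of taking a fixed 4 characters, so prices of different lengths still come out whole. This is only a sketch: extract_after is a made-up name, and the sample string is invented for illustration, not the real markup of the Yahoo page.

```python
def extract_after(html, marker, end='<'):
    """Return the text between `marker` and the next `end`, or None if not found."""
    start = html.find(marker)
    if start == -1:
        return None              # marker is not on the page
    start += len(marker)         # move past the marker itself
    stop = html.find(end, start)
    if stop == -1:
        return None              # no closing delimiter after the marker
    return html[start:stop].strip()


# Invented snippet standing in for one line of the downloaded page.
sample = '<td>Last Trade:</td><td><span id="quote_GOOG">523.40</span></td>'

price = extract_after(sample, 'GOOG">')
missing = extract_after(sample, 'MSFT">')
```

It is still tied to the exact characters around the value, so it breaks whenever the page layout changes.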

Any advice appreciated.

Thanks,
Luke

Thanks. I downloaded BeautifulSoup and used it to parse a sample page, but now all I have is a slightly nicer-looking version of the huge string of HTML code from the page in question. I still don't know how to extract specific items of interest (like the price) from this huge string.

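BeautifulSoup has find, findAll, and related methods which should help you: they search the parsed page for specific tags and attributes, so you don't have to scan the whole string yourself. Try to learn how to use them.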
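As a rough sketch of what that looks like, here is a lookup on a small made-up table. The HTML, the id 'last_price', and the row labels are all assumptions for illustration; the real page will use different tags and ids, so look at its source first. This uses the bs4 package, where findAll is spelled find_all (the old name is still accepted).

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the downloaded page.
sample_html = """
<table>
  <tr><th>Last Trade:</th><td><span id="last_price">523.40</span></td></tr>
  <tr><th>Volume:</th><td>1,234,567</td></tr>
  <tr><th>Market Cap:</th><td>165.2B</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, 'html.parser')

# find returns the first tag that matches the name and attributes.
price = soup.find('span', id='last_price').get_text()

# find_all returns every match; here, each table row becomes a label/value pair.
fields = {}
for row in soup.find_all('tr'):
    label = row.find('th').get_text().strip().rstrip(':')
    value = row.find('td').get_text().strip()
    fields[label] = value
```

After this, fields['Volume'] and fields['Market Cap'] hold the other values, so adding another item is just another dictionary lookup rather than another round of string slicing.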
