Getting information from web page with Python

DOWNLOAD ALLOWANCE STATUS
Usage within allowance - no download restrictions
100% / 0% (usage bar labels)
Plan Allowance (MB): 625
Allowance Remaining (MB): 625
Allowance Remaining (%): 100
Time Until Allowance Refill: 22:18:01

aframe
Newbie Poster
22 posts since May 2010
Reputation Points: 11
Solved Threads: 0
 

I think what you need is the re module.

Tech B
Posting Whiz in Training
268 posts since May 2009
Reputation Points: 59
Solved Threads: 33
 

Kodos, the Python regex debugger, suggests the following ways to find the Plan Allowance value.

#!/usr/bin/env python
import re
rawstr = r"""Plan Allowance \(MB\)</td><td style="border-width:0px;">\s*(?P<Plan_Allowance>\d+)\D"""
embedded_rawstr = r"""Plan Allowance \(MB\)</td><td style="border-width:0px;">\s*(?P<Plan_Allowance>\d+)\D"""
matchstr = """<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head>
<meta http-equiv="refresh" Content="60;url=/stlui/user/allowance_request.html">
<title>Download Allowance Status</title></head><body><h3 style="font-size: 150%; color:blue; text-align:center; font-weight:bold">DOWNLOAD ALLOWANCE STATUS</h3><TABLE width='100%' cellpadding='10'><tr><td style="text-align:center;"><h3>Usage within allowance - no download restrictions</h3></td><td><span style="display:block;text-align:center">100%</span><table style="font-size:1; background-color:white; border:1px solid black;" width='15%' cellspacing='0' cellpadding='0' align='center'><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr 
style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr 
style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr 
style="background-color:green"><td>&nbsp;</td></tr></table><span style="display:block;text-align:center">0%</span></td></tr><tr><td>&nbsp;</td><td><table style="font-size:18; border-width:0px;" width=90% cellspacing=0 cellpadding=0 ALIGN=CENTER><tr><td>&nbsp;</td><td style="border-width:0px;">Plan Allowance (MB)</td><td style="border-width:0px;"> 625</td></tr><tr><td>&nbsp;</td><td style="border-width:0px;">Allowance Remaining (MB)</td><td style="border-width:0px;"> 625</td></tr><tr><td><span style="background-color: green;">&nbsp;&nbsp;&nbsp;</span>&nbsp;</td><td style="border-width:0px;">Allowance Remaining (%)</td><td style="border-width:0px;"> 100</td></tr><tr><td>&nbsp;</td><td style="border-width:0px;">Time Until Allowance Refill</td><td style="border-width:0px;"> 22:18:01</td></tr></table></td></tr></TABLE>
</body>
</html>"""

# method 1: using a compile object
compile_obj = re.compile(rawstr)
match_obj = compile_obj.search(matchstr)

# method 2: using search function (w/ external flags)
match_obj = re.search(rawstr, matchstr)

# method 3: using search function (w/ embedded flags)
match_obj = re.search(embedded_rawstr, matchstr)

# Retrieve group(s) from match_obj
all_groups = match_obj.groups()

# Retrieve group(s) by index
group_1 = match_obj.group(1)

# Retrieve group(s) by name
Plan_Allowance = match_obj.group('Plan_Allowance')

print "Plan Allowance is: " + Plan_Allowance
d5e5
Practically a Posting Shark
810 posts since Sep 2009
Reputation Points: 159
Solved Threads: 159
 

Thanks. This will certainly keep me busy for a few days trying to digest what the code is doing.
Thanks again for your time.
Frank

aframe
Newbie Poster
22 posts since May 2010
Reputation Points: 11
Solved Threads: 0
 

You could also use the built-in string functions.

# This would be the HTML page, but the real one is too much for a simple example
s = """This is the html page you're going to look through to find
numbers like Plan Allowance (MB)</td><td style="border-width:0px;"> 625</td>"""

# The string you're looking for: always up to, but not including, the term you're
# trying to find.
planAllow = """Plan Allowance (MB)</td><td style="border-width:0px;"> """

# Slice out the value you're looking for using the index function
value = s[s.index(planAllow)+len(planAllow):s.index(planAllow)+len(planAllow)+3]


Adding the length to the index leaves out the constant text, i.e. the text that never changes, like the HTML tags.

Adding the 3 at the end of the second index takes the three characters after the constant text, giving us a slice of the value only, assuming it's a three-digit number.

I would add a larger number like 4 or 5 and strip off the trailing tags from the number to ensure I get the entire value.
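
One way to avoid guessing the width at all is to slice up to the next '<' instead of a fixed number of characters. This is only a minimal sketch: the helper name value_after is made up for illustration, it assumes the value is always followed directly by a tag, and it is written to run on Python 3 (print as a function).

```python
def value_after(page, marker):
    """Return the text between `marker` and the next '<', without surrounding spaces."""
    start = page.index(marker) + len(marker)   # first character after the constant text
    end = page.index('<', start)               # the tag that closes the table cell
    return page[start:end].strip()

s = 'Plan Allowance (MB)</td><td style="border-width:0px;"> 625</td>'
plan_marker = 'Plan Allowance (MB)</td><td style="border-width:0px;">'
print(value_after(s, plan_marker))
```

Because the end of the slice is found rather than assumed, the same call should work whether the number has two digits or five.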

Tech B
Posting Whiz in Training
268 posts since May 2009
Reputation Points: 59
Solved Threads: 33
 
You could also use the built-in string functions.


Hey all right. This did the job and it was simple enough that even this South Texas "hick" could understand it.
Thanks
Frank

aframe
Newbie Poster
22 posts since May 2010
Reputation Points: 11
Solved Threads: 0
 

Don't feel like a hick. I was born and raised in the redneck hills of WV. I've done IR tracking, face recognition, and developed a hat that quadriplegics can put on to use a computer; based on an accelerometer.

It doesn't depend on location, just on how hard you work and how willing you are to learn.

Anyway, sorry about the ramblings lol.

Tech B
Posting Whiz in Training
268 posts since May 2009
Reputation Points: 59
Solved Threads: 33
 

Seconded about not feeling like a Hick. In Hendersonville, TN here.

Mensa180
Light Poster
31 posts since Oct 2009
Reputation Points: 13
Solved Threads: 7
 

Hope I didn't offend anyone with my comment. It was directed entirely at myself. Anyway, here is the code I have come up with to this point. I am sure it can be improved. If someone cares to offer suggestions, that would be great. I have yet to figure out how to strip the trailing tags.
Thanks for your time and help.

#!/usr/bin/env python
import time
import os
from urllib import urlopen
while True:
    print " " 
    print "Updating Plan Status........" 
    time.sleep(60)
    os.system("clear")
    myfile = urlopen('http://192.168.0.1/stlui/user/allowance_request.html%20target=%22allowance%22').read()
    s = myfile

    # The string you're looking for: always up to, but not including, the term you're
    # trying to find.
    planAllow = """Plan Allowance (MB)</td><td style="border-width:0px;"> """
    allowRemain = """Allowance Remaining (MB)</td><td style="border-width:0px;">"""
    percentRemain = """Allowance Remaining (%)</td><td style="border-width:0px;">"""
    refillTime = """Time Until Allowance Refill</td><td style="border-width:0px;">"""

    # Slice out the value you're looking for using the index function
    planAllowance = s[s.index(planAllow)+len(planAllow):s.index(planAllow)+len(planAllow)+4]
    allowanceRemain = s[s.index(allowRemain)+len(allowRemain):s.index(allowRemain)+len(allowRemain)+5]
    percentRemaining = s[s.index(percentRemain)+len(percentRemain):s.index(percentRemain)+len(percentRemain)+4]
    timeTorefill = s[s.index(refillTime)+len(refillTime):s.index(refillTime)+len(refillTime)+10]

    print "Plan Allowance (MB):        " + planAllowance
    print "Allowance Remaining (MB):  " + allowanceRemain
    print "Percentage Remaining:      " + percentRemaining+"%"
    print "Time Until Refill :        " + timeTorefill
aframe
Newbie Poster
22 posts since May 2010
Reputation Points: 11
Solved Threads: 0
 
rawstr = r"""Time Until Allowance Refill</td><td style="border-width:0px;">\s*(?P<Refill_Time>\d+)\D"""

I have the above line that is searching for a time (10:00:05) in a document. It returns only "10". What do I need to change so that it returns the complete time from the file, i.e. 10:00:05?
Thanks
Frank

aframe
Newbie Poster
22 posts since May 2010
Reputation Points: 11
Solved Threads: 0
 

I do not know re so well, but I prefer partition for this kind of thing. I use _ as the variable name for the parts I do not care about:

#!/usr/bin/env python
import time
import os
myfile = """<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head>
<meta http-equiv="refresh" Content="60;url=/stlui/user/allowance_request.html">
<title>Download Allowance Status</title></head><body><h3 style="font-size: 150%; color:blue; text-align:center; font-weight:bold">DOWNLOAD ALLOWANCE STATUS</h3><TABLE width='100%' cellpadding='10'><tr><td style="text-align:center;"><h3>Usage within allowance - no download restrictions</h3></td><td><span style="display:block;text-align:center">100%</span><table style="font-size:1; background-color:white; border:1px solid black;" width='15%' cellspacing='0' cellpadding='0' align='center'><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr 
style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr 
style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr style="background-color:green"><td>&nbsp;</td></tr><tr 
style="background-color:green"><td>&nbsp;</td></tr></table><span style="display:block;text-align:center">0%</span></td></tr><tr><td>&nbsp;</td><td><table style="font-size:18; border-width:0px;" width=90% cellspacing=0 cellpadding=0 ALIGN=CENTER><tr><td>&nbsp;</td><td style="border-width:0px;">Plan Allowance (MB)</td><td style="border-width:0px;"> 625</td></tr><tr><td>&nbsp;</td><td style="border-width:0px;">Allowance Remaining (MB)</td><td style="border-width:0px;"> 625</td></tr><tr><td><span style="background-color: green;">&nbsp;&nbsp;&nbsp;</span>&nbsp;</td><td style="border-width:0px;">Allowance Remaining (%)</td><td style="border-width:0px;"> 100</td></tr><tr><td>&nbsp;</td><td style="border-width:0px;">Time Until Allowance Refill</td><td style="border-width:0px;"> 22:18:01</td></tr></table></td></tr></TABLE>
</body>
</html>"""
s = myfile

# The string you're looking for: always up to, but not including, the term you're
# trying to find.
planAllow = """Plan Allowance (MB)</td><td style="border-width:0px;"> """
allowRemain = """Allowance Remaining (MB)</td><td style="border-width:0px;">"""
percentRemain = """Allowance Remaining (%)</td><td style="border-width:0px;">"""
refillTime = """Time Until Allowance Refill</td><td style="border-width:0px;">"""

# Cut out the value you're looking for using partition
_,_,planAllowance = s.partition(planAllow)
planAllowance,_,_ = planAllowance.partition('</td>')
_,_,allowanceRemain = s.partition(allowRemain)
allowanceRemain,_,_  = allowanceRemain.partition('</td>')
_,_,percentRemaining = s.partition(percentRemain)
percentRemaining,_,_  = percentRemaining.partition('</td>')
_,_,timeTorefill = s.partition(refillTime)
timeTorefill,_,_  = timeTorefill.partition('</td>')

print "Plan Allowance (MB):        " + planAllowance
print "Allowance Remaining (MB):  " + allowanceRemain
print "Percentage Remaining:      " + percentRemaining+"%"
print "Time Until Refill :        " + timeTorefill
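
The repeated pairs of partition calls above could be wrapped in one small helper. The following is a rough sketch only: the name between and its default right-hand separator are assumptions made for illustration, and it is written for Python 3.

```python
def between(text, left, right='</td>'):
    """Return the stripped text found after `left` and before the next `right`."""
    _, found, rest = text.partition(left)
    if not found:
        return None                      # marker not present in the page
    value, _, _ = rest.partition(right)
    return value.strip()

page = ('<td style="border-width:0px;">Time Until Allowance Refill</td>'
        '<td style="border-width:0px;"> 22:18:01</td>')
refill_marker = 'Time Until Allowance Refill</td><td style="border-width:0px;">'
print(between(page, refill_marker))
```

Returning None when the marker is missing lets the caller skip an update instead of printing an empty value.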
pyTony
pyMod
Moderator
5,359 posts since Apr 2010
Reputation Points: 782
Solved Threads: 852
 

If you stick with the regular expressions approach, the following should return the full time (hh:mm:ss).

rawstr = r"""Time Until Allowance Refill</td><td style="border-width:0px;">\s*(?P<Refill_Time>\d+:\d+:\d+)\D"""


In regular expressions \d+ means "one or more consecutive digits", so (\d+:\d+:\d+)\D means "one or more digits, a colon, one or more digits, a colon, one or more digits", and the trailing \D matches any single character that is not a digit. Only the portion matched by the pattern between the parentheses belongs to the group, which gives you the data you want (hopefully).
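
To see the difference side by side, here is a short sketch that runs both patterns against a one-line sample. The sample string and the variable names are made up for illustration; it is written for Python 3.

```python
import re

cell = 'Time Until Allowance Refill</td><td style="border-width:0px;"> 10:00:05</td>'
prefix = r'Time Until Allowance Refill</td><td style="border-width:0px;">\s*'

# \d+ stops at the first colon, so the group holds only the hours.
digits_only = re.search(prefix + r'(?P<Refill_Time>\d+)\D', cell)

# Spelling out the colons inside the group captures the whole hh:mm:ss value.
full_time = re.search(prefix + r'(?P<Refill_Time>\d+:\d+:\d+)\D', cell)

print(digits_only.group('Refill_Time'))
print(full_time.group('Refill_Time'))
```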

d5e5
Practically a Posting Shark
810 posts since Sep 2009
Reputation Points: 159
Solved Threads: 159
 

d5e5 and tonyjv,
Thanks to both of you. I was working on two different scripts and neither one would work correctly. Now both of them work. Life is good.
If I wasn't almost 70 years old I would certainly go back to school and start over. :)
Frank

aframe
Newbie Poster
22 posts since May 2010
Reputation Points: 11
Solved Threads: 0
 

There was no offense taken lol. When I tell people where I'm from, they're usually shocked that I'm smart enough to write code lol.

Anyway:
Seems like the problem has been solved.
I would have submitted this earlier, but this site had some downtime. Anyone else notice this?

I added some comments at the top; good idea to keep it documented, even on personal projects.

I used strip to get rid of the trailing '<'.
That works as long as it's always a three-digit number you're looking for.

You could replace it with a for loop:

planAllowance = ''
for i in pA:
    if i == '<':
        break
    else:
        planAllowance += i

Where pA is the data you sliced out.

Complete code w/out for loop:

#!/usr/bin/env python
#Plan stat update
#Written: aframe
#Date: 
#Version:
import time
import os
from urllib import urlopen
while True:
    print " " 
    print "Updating Plan Status........" 
    time.sleep(60)
    os.system("clear")
    
    myfile = open('t.html','r').read()#Change back to url address like you had before

    s = myfile

    # The string you're looking for: always up to, but not including, the term you're
    # trying to find.
    planAllow = """Plan Allowance (MB)</td><td style="border-width:0px;"> """
    allowRemain = """Allowance Remaining (MB)</td><td style="border-width:0px;">"""
    percentRemain = """Allowance Remaining (%)</td><td style="border-width:0px;">"""
    refillTime = """Time Until Allowance Refill</td><td style="border-width:0px;">"""

    # Slice out the value you're looking for using the index function
    planAllowance = s[s.index(planAllow)+len(planAllow):s.index(planAllow)+len(planAllow)+4]
    allowanceRemain = s[s.index(allowRemain)+len(allowRemain):s.index(allowRemain)+len(allowRemain)+5]
    percentRemaining = s[s.index(percentRemain)+len(percentRemain):s.index(percentRemain)+len(percentRemain)+4]
    timeTorefill = s[s.index(refillTime)+len(refillTime):s.index(refillTime)+len(refillTime)+10]

    print "Plan Allowance (MB):        " + planAllowance.strip('<')
    print "Allowance Remaining (MB):  " + allowanceRemain.strip('<')
    print "Percentage Remaining:      " + percentRemaining+"%"
    print "Time Until Refill :        " + timeTorefill.strip('<')
    #break /*used for testing


hellboundhackers.org offers programming challenges that deal with web pages and parsing them for data. That's where I learned most of the online programming I know.

Tech B
Posting Whiz in Training
268 posts since May 2009
Reputation Points: 59
Solved Threads: 33
 

Yes, this site was down part of yesterday afternoon. Since this thread hasn't been marked solved yet, here's one more approach:

#!/usr/bin/env python
from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):
    labels = []
    values = []
    save_next = False
    def handle_data(self, *args):
        s, = args
        if MyHTMLParser.save_next:
            MyHTMLParser.values.append(s)
            MyHTMLParser.save_next = False

        if str(args).find("Allowance") > 0:
            MyHTMLParser.labels.append(s)
            MyHTMLParser.save_next = True

#Assign web page contents to htmlstring (Snipped from post to save space)        
htmlstring = """<Please paste webpage content here>"""

h = MyHTMLParser()
h.feed(htmlstring)
h.close()

for i in range(1, 5):
    print "%30s:%15s" % (h.labels[i], h.values[i])
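
For anyone on Python 3, the module is html.parser rather than HTMLParser. Below is a rough sketch of the same idea that keeps its results on the instance instead of on the class. The class name AllowanceParser, the pairs attribute and the shortened sample page are all made up for illustration.

```python
from html.parser import HTMLParser

class AllowanceParser(HTMLParser):
    """Collect (label, value) pairs from neighbouring table cells."""
    def __init__(self):
        HTMLParser.__init__(self)
        self.pairs = []        # per-instance, so two parsers do not share results
        self.pending = None    # label whose value has not been seen yet

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return             # skip cells that hold only whitespace or &nbsp;
        if self.pending is not None:
            self.pairs.append((self.pending, text))
            self.pending = None
        elif 'Allowance' in text:
            self.pending = text

sample = ('<table><tr><td>Plan Allowance (MB)</td><td> 625</td></tr>'
          '<tr><td>&nbsp;</td><td>Time Until Allowance Refill</td>'
          '<td> 22:18:01</td></tr></table>')
parser = AllowanceParser()
parser.feed(sample)
parser.close()
for label, value in parser.pairs:
    print("%30s:%15s" % (label, value))
```

On the full page the title also contains "Allowance", so the first pair would be the title and the heading, which is why the loop in the post above starts at index 1.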
d5e5
Practically a Posting Shark
810 posts since Sep 2009
Reputation Points: 159
Solved Threads: 159
 

This is the code I finally came up with, thanks everyone for your help. This runs and outputs to a Tkinter window. The problem I am having now is with the while loop and its sleep of 60 seconds. It freezes the Tkinter window and I cannot do anything with it while it is asleep. Is there an alternative or better way to do the loop so it does not freeze everything?
Frank

#!/usr/bin/env python
#HnFapMeter
#written: Frank
#Date: 06/10/2010
#Version 1.0.0

from Tkinter import *
import time
import os
import re
from urllib import urlopen
#######Build a Tkinter window
root = Tk()
listbox = Listbox(root, bg = 'green')
listbox.pack()
label = Label(root, text = 'HnFapMeter')
label.pack()
##################Create a loop to update information every minute
while True:
    time.sleep(60)
################## Get information from Modem
    #
    matchstr = urlopen('http://192.168.0.1/stlui/user/allowance_request.html%20target=%22allowance%22').read()
    ##################Get Total MB for Plan
    #
    rawstr = r"""Plan Allowance \(MB\)</td><td style="border-width:0px;">\s*(?P<Plan_Allowance>\d+)\D"""
    # method 1: using a compile object
    compile_obj = re.compile(rawstr)
    match_obj = compile_obj.search(matchstr)
    # Retrieve group(s) from match_obj
    all_groups = match_obj.groups()
    # Retrieve group(s) by name
    Plan_Allowance = match_obj.group('Plan_Allowance')
    #
    ###############Get MB remaining Information 
    #
    rawstr = r"""Allowance Remaining \(MB\)</td><td style="border-width:0px;">\s*(?P<Allowance_Remaining>\d+)\D"""
    # method 1: using a compile object
    compile_obj = re.compile(rawstr)
    match_obj = compile_obj.search(matchstr)
    # Retrieve group(s) from match_obj
    all_groups = match_obj.groups()
    # Retrieve group(s) by name
    Allowance_Remaining = match_obj.group('Allowance_Remaining')
    #
    ################Get Percentage remaining information
    #
    rawstr = r"""Allowance Remaining \(%\)</td><td style="border-width:0px;">\s*(?P<Percentage_Remaining>\d+)\D"""
    # method 1: using a compile object
    compile_obj = re.compile(rawstr)
    match_obj = compile_obj.search(matchstr)
    # Retrieve group(s) from match_obj
    all_groups = match_obj.groups()
    # Retrieve group(s) by name
    Percentage_Remaining = match_obj.group('Percentage_Remaining')
    #
    ################Get Time untill bucket refills
    #
    rawstr = r"""Time Until Allowance Refill</td><td style="border-width:0px;">\s*(?P<Refill_Time>\d+:\d+:\d+)\D"""
    # method 1: using a compile object
    compile_obj = re.compile(rawstr)
    match_obj = compile_obj.search(matchstr)
    # Retrieve group(s) from match_obj
    all_groups = match_obj.groups()
    # Retrieve group(s) by name
    Refill_Time = match_obj.group('Refill_Time')
    #########################################################################
    #print "Plan Allowance is:       " + Plan_Allowance+"(MB)"
    #print "Allowance Remaining is:  " + Allowance_Remaining+"(MB)"
    #print "Percentage Remaining is: " + Percentage_Remaining+"%"
    #print "Time Untill Refill is    " +  Refill_Time
    #########################################################################
    #
    ################# Insert information into the Tkinter Window 
    
    root.title(Allowance_Remaining+' (MB) Remaining')
    listbox.delete(0, END)
    listbox.insert(END, Plan_Allowance+' MB Allowed')
    listbox.insert(END, Allowance_Remaining+ ' MB Remaining')
    listbox.insert(END, Percentage_Remaining+' % Remaining')
    listbox.insert(END, Refill_Time+' Until Refill')
    #
    root.update()
aframe
Newbie Poster
22 posts since May 2010
Reputation Points: 11
Solved Threads: 0
 

I haven't figured out how to do this yet, but the solution may involve something called fork, which is a way of starting a copy of your program that knows it is a copy (i.e. not the parent), can do something independently, and can somehow communicate back to the parent that the task has been accomplished. It couldn't hurt to read a short Python fork example.
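
As a related idea (this uses a background thread and a queue rather than fork, so it is a swapped-in technique, not the fork approach itself), the slow work can run in the background and hand its result back to the main program. This is only a sketch for Python 3: fetch_status is a hypothetical stand-in for the real urlopen call, and the timings are arbitrary.

```python
import threading
import queue
import time

def fetch_status():
    """Stand-in for the slow urlopen(...).read() call (hypothetical)."""
    time.sleep(0.1)
    return "625 MB Remaining"

def worker(results):
    # Runs in the background; the main program stays free while this sleeps.
    results.put(fetch_status())

results = queue.Queue()
thread = threading.Thread(target=worker, args=(results,))
thread.daemon = True
thread.start()

# The main program keeps running and checks for a result now and then.
status = None
while status is None:
    try:
        status = results.get(timeout=0.02)
    except queue.Empty:
        pass                   # a GUI would redraw itself here instead
print(status)
```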

d5e5
Practically a Posting Shark
810 posts since Sep 2009
Reputation Points: 159
Solved Threads: 159
 

Nice to see you making progress.

By the way, did you see my function for picking up the piece between separators, which I posted recently?

I would use the after method of the label you already have, together with root.mainloop():

#!/usr/bin/env python
#HnFapMeter
#written: Frank
#Date: 06/10/2010
#Version 1.0.0

from Tkinter import *
import time
import os
import re
from urllib import urlopen

def action(n=0):
    print n, ## debug
    matchstr = urlopen('http://192.168.0.1/stlui/user/allowance_request.html%20target=%22allowance%22').read()
    ## This box-drawing comment style is not recommended! Use shorter comments and empty lines instead

    ##################Get Total MB for Plan
    #
    if 'Allowance' in matchstr:
        rawstr = r"""Plan Allowance \(MB\)</td><td style="border-width:0px;">\s*(?P<Plan_Allowance>\d+)\D"""
        # method 1: using a compile object
        compile_obj = re.compile(rawstr)
        match_obj = compile_obj.search(matchstr)
        # Retrieve group(s) from match_obj
        all_groups = match_obj.groups()
        # Retrieve group(s) by name
        Plan_Allowance = match_obj.group('Plan_Allowance')
        #
        ###############Get MB remaining Information 
        #
        rawstr = r"""Allowance Remaining \(MB\)</td><td style="border-width:0px;">\s*(?P<Allowance_Remaining>\d+)\D"""
        # method 1: using a compile object
        compile_obj = re.compile(rawstr)
        match_obj = compile_obj.search(matchstr)
        # Retrieve group(s) from match_obj
        all_groups = match_obj.groups()
        # Retrieve group(s) by name
        Allowance_Remaining = match_obj.group('Allowance_Remaining')
        #
        ################Get Percentage remaining information
        #
        rawstr = r"""Allowance Remaining \(%\)</td><td style="border-width:0px;">\s*(?P<Percentage_Remaining>\d+)\D"""
        # method 1: using a compile object
        compile_obj = re.compile(rawstr)
        match_obj = compile_obj.search(matchstr)
        # Retrieve group(s) from match_obj
        all_groups = match_obj.groups()
        # Retrieve group(s) by name
        Percentage_Remaining = match_obj.group('Percentage_Remaining')
        #
        ################Get Time untill bucket refills
        #
        rawstr = r"""Time Until Allowance Refill</td><td style="border-width:0px;">\s*(?P<Refill_Time>\d+:\d+:\d+)\D"""
        # method 1: using a compile object
        compile_obj = re.compile(rawstr)
        match_obj = compile_obj.search(matchstr)
        # Retrieve group(s) from match_obj
        all_groups = match_obj.groups()
        # Retrieve group(s) by name
        Refill_Time = match_obj.group('Refill_Time')
    
        root.title(Allowance_Remaining+' (MB) Remaining')
        listbox.delete(0, END)
        listbox.insert(END, Plan_Allowance+' MB Allowed')
        listbox.insert(END, Allowance_Remaining+ ' MB Remaining')
        listbox.insert(END, Percentage_Remaining+' % Remaining')
        listbox.insert(END, Refill_Time+' Until Refill')

    label.after(6000,action,n+1) ## next loop; 6000 ms = 6 seconds, use 60000 for one minute
        

#######Build a Tkinter window
root = Tk()
listbox = Listbox(root, bg = 'green')
listbox.pack()
label = Label(root, text = 'HnFapMeter')
label.pack()
action()
##################Create a loop to update information every minute
##while True:                ### Change to root.mainloop(). That is your main loop
##    time.sleep(60)

root.mainloop()
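
The four near-identical regex blocks in action() could probably be folded into one loop over a table of patterns. This is a sketch under assumptions, not a drop-in replacement: the names CELL, PATTERNS and parse_status are invented here, plain groups are used instead of named ones, and it is written for Python 3.

```python
import re

CELL = r'</td><td style="border-width:0px;">\s*'
PATTERNS = {
    'Plan_Allowance':       r'Plan Allowance \(MB\)' + CELL + r'(\d+)',
    'Allowance_Remaining':  r'Allowance Remaining \(MB\)' + CELL + r'(\d+)',
    'Percentage_Remaining': r'Allowance Remaining \(%\)' + CELL + r'(\d+)',
    'Refill_Time':          r'Time Until Allowance Refill' + CELL + r'(\d+:\d+:\d+)',
}

def parse_status(page):
    """Return a dict of the four values, or None if any of them is missing."""
    status = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, page)
        if match is None:
            return None
        status[name] = match.group(1)
    return status

TD = '<td style="border-width:0px;">'
sample = (TD + 'Plan Allowance (MB)</td>' + TD + ' 625</td>' +
          TD + 'Allowance Remaining (MB)</td>' + TD + ' 625</td>' +
          TD + 'Allowance Remaining (%)</td>' + TD + ' 100</td>' +
          TD + 'Time Until Allowance Refill</td>' + TD + ' 22:18:01</td>')
print(parse_status(sample))
```

action() would then only need to call parse_status(matchstr) and update the listbox when the result is not None, which also covers the case where the modem returns an unexpected page.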
pyTony
pyMod
Moderator
5,359 posts since Apr 2010
Reputation Points: 782
Solved Threads: 852
 

Wow! This works great!! Thanks.
How hard would it be to change the background color from green to red based on MB remaining in Allowance_Remaining?
Thanks for everyone's help.
Frank

aframe
Newbie Poster
22 posts since May 2010
Reputation Points: 11
Solved Threads: 0
 

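Define a global warning limit, allowance_limit (in MB), and add this inside the if block, before label.after(..
Allowance_Remaining is a string, so convert it with int() before comparing:

        listbox.config(bg = 'green' if int(Allowance_Remaining) > allowance_limit else 'red')
        listbox.update()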
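
The colour decision can also be pulled out into a small function so it can be tried without the modem or a window. A minimal sketch for Python 3, where background_for and the limit of 100 MB are made-up names and values:

```python
def background_for(remaining_text, limit_mb):
    """Pick a listbox colour from the scraped 'MB remaining' string."""
    remaining = int(remaining_text.strip())   # ' 625' -> 625, so the comparison is numeric
    return 'green' if remaining > limit_mb else 'red'

allowance_limit = 100   # hypothetical warning limit in MB
print(background_for(' 625', allowance_limit))
print(background_for('42', allowance_limit))
```

In the Tkinter script this would be used as listbox.config(bg = background_for(Allowance_Remaining, allowance_limit)).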
pyTony
pyMod
Moderator
5,359 posts since Apr 2010
Reputation Points: 782
Solved Threads: 852
 

This question has already been solved
