Hi,
I have a script that uses HTMLParser to reach a web page where an exe file is located. The script then downloads the file and installs it on the local machine.

I need a solution for the following problem:
HTMLParser reaches a web page that has links to clients_test2, clients_test5 and clients_test8. Each of these contains the same exe file. When the script reaches this page, I need to let the user choose the test environment (clients_test2, clients_test5 or clients_test8) from which to get the exe file.

Currently the script uses a regex to search for clients_test, stops at the very first matching link, and gets the exe from there.

Does anybody have any ideas?

Do you need to interact with the user after the code is running, or do you want to accept the specification on the command line?

Interactive: use user_answer = raw_input(prompt).
Command line: use sys.argv. Beware that argv[0] is the name of the Python script, at least on Unix-like OSs.
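A minimal sketch of how the two options could be combined, in case it helps: take the environment from the command line when one is given, otherwise prompt for it. The function name get_environment and the prompt text are made up for illustration. The prompt function is passed in as ask so the same code runs on Python 2 (pass raw_input) and Python 3 (pass input).

```python
def get_environment(argv, ask):
    """Return the environment name from the command line, or prompt for it.

    argv -- a list shaped like sys.argv (argv[0] is the script name)
    ask  -- the prompt function: raw_input on Python 2, input on Python 3
    """
    if len(argv) > 1:
        # First real argument, e.g. "python install.py clients_test5"
        return argv[1].strip().lower()
    # No argument given: fall back to asking interactively
    return ask("Test environment: ").strip().lower()
```

In the script this would be called as get_environment(sys.argv, raw_input) after import sys.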

Hi griswolf,
I need to interact with the user while the code is running. Here is the code:

URL = "http://11.12.13.27:8080/cruisecontrol"
BRANCH_FILE = "branch.txt"

from urllib2 import urlopen
from HTMLParser import HTMLParser

import subprocess as sp
import re

client_re = re.compile(r"clients_test")

DOWNLOAD_DST = "C:/bosstest.exe"

#read the branch file for the branch name
def read_branch():
    return open(BRANCH_FILE).read().strip()

#Fetching links using HTMLParser
def get_links(url):
    parser = MyHTMLParser()
    parser.feed(urlopen(url).read())
    parser.close()
    return parser.links

#Build url for Artifacts page
def get_artifacts_url():
    branch = read_branch().lower()
    url = URL + "/buildresults/Poker-TTM_%s_nightly_build" % branch
    for link in get_links(url):
        if link["href"].startswith("artifacts/"):
            return "%s/%s" % (URL, link["href"])

#Build url for Clients page
def get_client_url():
    url = get_artifacts_url()
    for link in get_links(url):
        x = 0
        while x == 0 and client_re.search(link["href"]):  #Trying to do this
            print link["href"]
            userinput = raw_input('Please select \nA) xxx\nB) yyy\n').lower()
            x = 1
                        
#download bosspokercode.exe on local machine
def download(url, dst_file):
    content = urlopen(url).read()
    outfile = open(dst_file, "wb")
    outfile.write(content)
    outfile.close()

#start installation of the downloaded exe file
def install(prog):
    process = sp.Popen(prog, shell=True)
    process.wait()

#automatically downloads and installs bosspokercode.exe for a branch
def main():
    download(FINAL_URL, DOWNLOAD_DST)
    install(DOWNLOAD_DST)

#Parsing HTML pages 
class MyHTMLParser(HTMLParser):
    def __init__(self, *args, **kwd):
        HTMLParser.__init__(self, *args, **kwd)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = dict(attrs)
            if "href" in attrs:
                self.links.append(dict(attrs))

    def handle_endtag(self, tag):
        pass


if __name__ == "__main__":
    url = get_client_url()
    if url is None:
        print "Could not find client"
    else:
        print get_links(url)
        FINAL_URL = url + "/bosspokercode.exe"
        print FINAL_URL
        main()

Currently, for the branch, I am getting two links, clients_test5 and clients_test8, but they only exist as link["href"] values inside the loop. How do I present them to the user?

This is what I have in the get_client_url function now, but it just takes the first clients_test link.

#Build url for Clients page
def get_client_url():
    url = get_artifacts_url()
    for link in get_links(url):
        #if link["href"].find("/clients_test") > 0:
        if client_re.search(link["href"]):
            return "http://10.47.42.27:8080" + link["href"]

Hi,
Does anyone have any ideas?
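One possible approach, sketched under a few assumptions: instead of returning from inside the loop at the first match, collect every clients_test link first, then show a numbered menu and return the href the user picked. The helper names find_client_links and choose_client are hypothetical, and the pattern clients_test\d+ assumes each environment name ends in digits, as in clients_test5. The prompt function is passed in as ask (raw_input on Python 2, input on Python 3), which also makes the logic easy to try without a live page.

```python
import re

# Assumption: environment names look like "clients_test" followed by digits
client_re = re.compile(r"clients_test\d+")

def find_client_links(links):
    """Return (environment name, href) pairs for every match, in page order."""
    found = []
    seen = set()
    for link in links:
        href = link.get("href", "")
        match = client_re.search(href)
        # Skip non-matching links and repeated environments
        if match and match.group(0) not in seen:
            seen.add(match.group(0))
            found.append((match.group(0), href))
    return found

def choose_client(found, ask, max_tries=3):
    """Show a numbered menu and return the chosen href, or None."""
    if not found:
        return None
    menu = "\n".join("%d) %s" % (i, name)
                     for i, (name, href) in enumerate(found, 1))
    for attempt in range(max_tries):
        answer = ask("Please select a test environment:\n%s\n> " % menu).strip()
        # Accept only a number that is actually on the menu
        if answer.isdigit() and 1 <= int(answer) <= len(found):
            return found[int(answer) - 1][1]
    return None
```

Inside get_client_url the loop could then become href = choose_client(find_client_links(get_links(url)), raw_input), returning "http://10.47.42.27:8080" + href when href is not None. Returning None otherwise keeps the existing "Could not find client" check working.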