Hi,

I work in QA. I have a piece of Python code that parses HTML pages to find a specific link on a web page. The code below gets me the URL of the link that needs to be clicked. What code should I write to simulate clicking that link?

URL = "http://11.12.13.27:8080/cruisecontrol"
BRANCH_FILE = "b.txt"

from urllib2 import urlopen
from HTMLParser import HTMLParser

import re
destination_re = re.compile(r"Test 5")

def read_branch():
    return open(BRANCH_FILE).read().strip()

def get_links(url):
    parser = MyHTMLParser()
    parser.feed(urlopen(url).read())
    parser.close()
    return parser.links

# Build url for Deploy page
def get_deploy_url():
    branch = read_branch().lower()
    url = URL + "/buildresults/abc_%s_nightly_build" % branch
    for link in get_links(url):
        if link["href"].startswith("Deploy"):
            return "%s/%s" % (URL, link["href"])

# Build url for Destination page
def get_destination_url():
    url = get_deploy_url()
    print url
    for link in get_links(url):
        if destination_re.search(link["href"]):
            # Reuse the URL constant instead of repeating the host name
            return "%s/%s" % (URL, link["href"])


class MyHTMLParser(HTMLParser):
    def __init__(self, *args, **kwd):
        HTMLParser.__init__(self, *args, **kwd)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = dict(attrs)
            if "href" in attrs:
                self.links.append(dict(attrs))

    def handle_endtag(self, tag):
        pass

if __name__ == "__main__":
    final_url = get_destination_url()
    if final_url is None:
        print "Could not find a destination to deploy"
    else:
        print final_url

Below is the output I get:
http://11.12.13.27:8080/cruisecontrol/DeploySelect.jsp?ArtifactsUrl=artifacts/abc_b_nightly_build/20100506124231

http://11.12.13.27:8080/cruisecontrol/DeployConfig.jsp?name=Test 5&scriptPath=xxxxxxxxxxx


What I actually need is to simulate clicking the Test 5 link on the deployment page. Please help.


Your links only give me an error in my browser, but this

from urllib2 import urlopen

already looks right to me.

Does it not work the way you expect?

Check the module's documentation:
http://docs.python.org/library/urllib2.html

Did you read the manual page?

urllib2.urlopen(url[, data][, timeout])

    Open the URL url, which can be either a string or a Request object.
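To make the signature above a little more concrete, here is a minimal sketch of how the optional data argument changes the request. It is written with Python 3 module names (urllib.request and urllib.parse); in Python 2, as used in this thread, the equivalents live in urllib2 and urllib. The helper name make_request and the form field in the usage note are hypothetical and are not taken from the deployment page.

```python
from urllib.parse import urlencode
from urllib.request import Request


def make_request(url, form_fields=None):
    """Build a Request object: GET without data, POST when data is given."""
    # urlencode turns a dict into "key=value&key2=value2"; the body of a
    # POST request must be bytes, hence the encode() call.
    data = urlencode(form_fields).encode("ascii") if form_fields else None
    return Request(url, data=data)
```

A plain link click corresponds to the GET case, so urlopen(make_request(final_url), timeout=10) would be enough. The POST case, for example make_request(final_url, {"name": "Test 5"}), only matters if the page turns out to submit a form rather than follow a link.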

Hi tonyvj,
Sorry, I am from QA and do not have much programming knowledge. I already have this piece of code. Could you please explain and suggest a possible solution?

I think what he means is that opening the page is the same as clicking the link.

What happens when you run, for example:

print urlopen(final_url).read()
open('fromnet.dat','w').write(urlopen(final_url).read())

(This is just a guess, as I have not done it this way before; the object returned by urlopen should be usable in the same way as a file object.)
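The idea that requesting the URL is the "click" can be sketched as a small helper. This is only an illustration under assumptions: it uses Python 3's urllib.request (urllib2 in Python 2), the name click_link is hypothetical, and the opener parameter exists purely so the function can be tried without a network connection.

```python
from urllib.request import urlopen


def click_link(url, opener=urlopen, timeout=30):
    """Simulate clicking a link by requesting its URL.

    Returns a (status_code, body_bytes) tuple. `opener` defaults to
    urlopen but can be swapped for a stand-in when testing offline.
    """
    response = opener(url, timeout=timeout)
    try:
        # Read the whole response once so it can be both checked and saved.
        return response.getcode(), response.read()
    finally:
        response.close()
```

With the script from the question, status, body = click_link(final_url) would fetch the page once, and body could then be printed or written to a file opened in "wb" mode. Whether the deployment actually starts on a plain request depends on how the server handles that page, which the thread does not say.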

Please tell us what you tried and what errors you got. I have no idea about Questions and Answers (QA).

I think the final URLs you gave are not OK, as they have '...' in them.
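One more thing that may make the second URL in the posted output fail is the literal space in "Test 5", since spaces are not valid in a URL and need to be percent-encoded. A minimal sketch of building the link URL with encoding follows, using Python 3's urllib.parse (in Python 2, quote is in urllib and urljoin is in urlparse). The helper name build_link_url is hypothetical, and the set of characters kept unencoded is an assumption that fits the href shown in the question.

```python
from urllib.parse import quote, urljoin

# Base address from the question; the trailing slash matters, because
# urljoin would otherwise replace the last path segment ("cruisecontrol").
BASE_URL = "http://11.12.13.27:8080/cruisecontrol/"


def build_link_url(base_url, href):
    """Resolve href against base_url and percent-encode unsafe characters."""
    # Keep the usual URL delimiters (and existing % escapes) intact so that
    # only characters such as spaces are encoded.
    safe_href = quote(href, safe="/?&=:%#")
    return urljoin(base_url, safe_href)
```

For the href in the question, build_link_url(BASE_URL, "DeployConfig.jsp?name=Test 5&scriptPath=xxxxxxxxxxx") gives the same address with the space replaced by %20, which can then be passed to urlopen.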
