Hello, thanks for reading my post. My trouble is with using a Python script to gather information from a web page. I'm using the 'IEC' (Internet Explorer Controller) module to handle the COM details for me. The script looks like this:

import IEC
from BeautifulSoup import BeautifulSoup
website = 'http://www.mywebsite.com'
ie = IEC.IEController()
ie.Navigate(website)
ie.PollWhileBusy()
html = ie.GetDocumentHTML()
ie.PollWhileBusy()
soup = BeautifulSoup(html)
#...etc etc etc....

This code works fine on the computer I wrote it on, a P4 2.4GHz, but as soon as I tried it on some faster computers I got blank or incomplete HTML for the page unless I added sleep statements of several seconds between navigating and getting the HTML. This also happened on the slower computer when the internet/site was especially slow one day. Has anyone else had this trouble with the module, or am I perhaps not using PollWhileBusy properly? Looking at its code, it's just a function that gets a boolean back from IE saying whether it's busy, and loops/waits until IE says it's not busy anymore. I don't want to use arbitrary-length time.sleep statements to make this work under varying conditions. Does anybody have experience or advice with this module? Thanks for reading and for any advice you can offer!
--John

All 2 Replies

Have you tried it without the ie.PollWhileBusy() calls?

None of the examples I've seen use it, including the example in the module documentation:

import IEC

ie = IEC.IEController()                # Creates a new IE Window
ie.Navigate('http://www.google.com/')  # Navigate to a website.
html = ie.GetDocumentHTML()            # Get the HTML

and it also says that:

PollWhileBusy(self):

This method is mostly used by other methods in the class, but is made
public, in case a user needs it. This method polls the Internet
Explorer instance and returns if IE is NOT busy doing anything.
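If you do end up polling yourself, it may help to put a time limit on the loop so the script can't hang forever. Here is a minimal sketch of a generic polling helper. Note that wait_until and its parameters are names I made up for illustration and are not part of IEC; the only IEC call referenced (in the commented usage) is GetDocumentHTML from the example above, and the marker text is a placeholder.

```python
import time


def wait_until(condition, timeout=30.0, interval=0.1):
    """Poll condition() until it returns a true value or the timeout expires.

    Returns True if the condition was met, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        # Short pause between checks so the loop doesn't spin the CPU.
        time.sleep(interval)


# Hypothetical usage with IEC (assumes some text you expect on the final page):
# ok = wait_until(lambda: 'some marker text' in ie.GetDocumentHTML(), timeout=15)
```

The condition is checked once before the first sleep, so a page that is already ready costs no extra delay.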

Thanks for the suggestion. I tried that but it still fails. Your idea got me thinking though: I think my code was failing without the sleep time because the page redirects. Internet Explorer says it's done and my script immediately proceeds, while Internet Explorer is still loading the redirect location, and this is probably why I needed to give it extra time. At least, that's my theory right now. Suggestions on handling this would be greatly appreciated. For now I think I will try cutting down the wait and putting another PollWhileBusy() call after a short wait of 1 second or so, hopefully shaving some time off the delay (currently a 4-second sleep) each time it runs. In the real program this sits in a loop that follows hyperlinks in a tree-like fashion and could easily run 100 or more iterations, so any seconds I can shave off each iteration will be pretty significant. Again, thanks for the suggestion. Any input is appreciated.
--John
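If the redirect theory is right, one way to avoid a fixed sleep might be to keep reading the HTML until it stops changing. The sketch below is only an illustration under that assumption: wait_for_stable and its parameters are hypothetical names, not part of IEC, and it only relies on being handed a function that returns the current HTML (such as ie.GetDocumentHTML from the script above).

```python
import time


def wait_for_stable(fetch, checks=3, interval=0.5, timeout=30.0):
    """Call fetch() until it returns the same non-empty value `checks`
    times in a row, then return that value.

    Raises RuntimeError if that does not happen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    last = None
    same = 0
    while time.monotonic() < deadline:
        current = fetch()
        if current and current == last:
            same += 1                      # unchanged since the previous read
        else:
            same = 1 if current else 0     # new content (or blank): restart count
        if same >= checks:
            return current
        last = current
        time.sleep(interval)
    raise RuntimeError("page did not settle within %s seconds" % timeout)


# Hypothetical usage with the script above:
# ie.Navigate(website)
# html = wait_for_stable(ie.GetDocumentHTML)
```

With the defaults this returns roughly one second after the page stops changing, rather than always waiting the full 4 seconds. It won't help on pages whose HTML keeps changing (tickers, timestamps and so on), so treat it as a starting point rather than a fix.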
