
PyQt QtWebKit frame bug?

I'm using Python, PyQt4, and QtWebKit to load a web page into a bare-bones browser to examine the data.

However, there is a small issue. I'm trying to get the contents and src of every iframe on the loaded page, and I'm using webView.page().mainFrame().childFrames() to get the frames. The problem is that childFrames() returns the frames ONLY if they're visible in the browser. For example, when the browser is positioned at the top of the page, childFrames() will not include the iframes in the footer of the page. Is there a setting I could tweak so that I get all of the ad iframes? I've attached the source of my "browser". Try scrolling down once the page finishes loading. Watch the console and you will see that the iframes load dynamically. Please help.
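One thing that might be worth trying, offered only as an untested sketch: if the frames appear only once they scroll into view, enlarging the page's viewport to the full size of the document may make them all count as visible. QWebPage.setViewportSize() and QWebFrame.contentsSize() are real Qt calls, but the helper name expand_viewport_to_contents is made up for illustration, and whether this actually makes the footer iframes load without scrolling is an assumption.

```python
def expand_viewport_to_contents(page):
    """Resize a QWebPage-like object's viewport to its main frame's full contents.

    Hypothetical helper: `page` only needs mainFrame() and setViewportSize(),
    and the main frame only needs contentsSize(), as QWebPage/QWebFrame provide.
    """
    size = page.mainFrame().contentsSize()  # size of the whole laid-out document
    page.setViewportSize(size)              # make the whole document "visible"
    return size
```

It could be called as expand_viewport_to_contents(self.webView.page()) at the start of the loadFinished handler. Frames that load afterwards would still have to be collected in a later pass, and the QWebView may reset the viewport when the window is resized.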

from PyQt4 import QtGui, QtCore, QtWebKit
import sys
import unicodedata


class Sp():
    def Main(self):
        # Create the view, start loading the page, and call Load when it finishes.
        self.webView = QtWebKit.QWebView()
        self.webView.load(QtCore.QUrl("http://www.msnbc.msn.com/id/41197838/ns/us_news-environment/"))
        self.webView.show()
        QtCore.QObject.connect(self.webView,QtCore.SIGNAL("loadFinished(bool)"),self.Load)

    def Load(self, ok=True):
        # Slot for loadFinished(bool); ok is False when the load failed.
        frame = self.webView.page().mainFrame()
        children = frame.childFrames()
        fT = []  # one [url, html, nested frames] entry per top-level iframe

        # First pass: print and record every direct child frame.
        for x in children:
            print "=========================================="
            print unicodedata.normalize('NFKD', unicode(x.url().toString())).encode('ascii','ignore')
            print "=========================================="
            fT.append([unicode(x.url().toString()),unicode(x.toHtml()),[]])

        # Second pass: print and record the frames nested inside each child frame.
        for x in range(len(fT)):
            f = children[x]
            tl = []
            for fx in f.childFrames():
                print "___________________________________________"
                print unicodedata.normalize('NFKD', unicode(fx.url().toString())).encode('ascii','ignore')
                print "___________________________________________"
                tl.append([unicode(fx.url().toString()),unicode(fx.toHtml()),[]])
            fT[x][2] = tl


app = QtGui.QApplication(sys.argv)
s = Sp()
s.Main()
app.exec_()
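
The Load method above walks exactly two levels of frames. A recursive version would pick up iframes nested at any depth. The following is a minimal sketch, not part of the original program: collect_frames is a hypothetical name, and it assumes only that each frame offers url().toString(), toHtml() and childFrames(), as QWebFrame does. It does nothing about frames that have not been created yet.

```python
# Text type that works on both Python 2 (unicode) and Python 3 (str).
to_text = type(u"")


def collect_frames(frame):
    """Return [url, html, children] for `frame`, recursing into nested frames.

    Each entry in `children` has the same three-element shape, which matches
    the layout of the fT list built in Load.
    """
    return [
        to_text(frame.url().toString()),
        to_text(frame.toHtml()),
        [collect_frames(child) for child in frame.childFrames()],
    ]
```

Under those assumptions, collect_frames(self.webView.page().mainFrame())[2] would hold the same kind of data as fT, without the two-level limit.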
GatorAlli
1 post since Jan 2011