GatorAlli 0 Newbie Poster

I'm using Python, PyQt4, and QtWebKit to load a web page into a bare-bones browser to examine the data.

However, there is a small issue. I'm trying to get the contents and src of every iframe on the loaded page, and I'm using webView.page().mainFrame().childFrames() to get the frames. The problem is that childFrames() loads the frames ONLY if they're visible in the browser. For example, when the browser is positioned at the top of the page, childFrames() will not load the iframes that are in the footer of the page. Is there a setting I could tweak so that I get all of the ad iframes, not just the visible ones? I've attached the source of my "browser". Try scrolling down once the page finishes loading. Watch the console and you will see that the iframes load dynamically. Please help.

from PyQt4 import QtGui, QtCore, QtWebKit
import sys
import unicodedata


class Sp():
    def Main(self):
        self.webView = QtWebKit.QWebView()
        # Connect the handler before load() so the loadFinished signal is not missed.
        QtCore.QObject.connect(self.webView, QtCore.SIGNAL("loadFinished(bool)"), self.Load)
        self.webView.load(QtCore.QUrl("http://www.msnbc.msn.com/id/41197838/ns/us_news-environment/"))
        self.webView.show()

        
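    def Load(self, ok=True):
        # Called when the page finishes loading; `ok` is the bool sent by loadFinished.
        frame = self.webView.page().mainFrame()
        children = frame.childFrames()
        # One [url, html, nested_frames] entry per top-level iframe.
        fT = []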


        for x in children:
            print "=========================================="
            print unicodedata.normalize('NFKD', unicode(x.url().toString())).encode('ascii','ignore')
            print "=========================================="
            fT.append([unicode(x.url().toString()),unicode(x.toHtml()),[]])


        for x in range(len(fT)):
            f = children[x]
            tl = []
            for fx in f.childFrames():
                print "___________________________________________"
                print unicodedata.normalize('NFKD', unicode(fx.url().toString())).encode('ascii','ignore')
                print "___________________________________________"
                tl.append([unicode(fx.url().toString()),unicode(fx.toHtml()),[]])
            fT[x][2] = tl
     

app = QtGui.QApplication(sys.argv)
s = Sp()
s.Main()
app.exec_()
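
Side note: the two nested loops in Load() only descend two levels of frames. Here is a rough sketch of a recursive version that builds the same [url, html, children] entries to any depth. It assumes each frame object exposes url().toString(), toHtml(), and childFrames() the way the frames in my code do; collect_frames is just a name I made up for illustration.

```python
def collect_frames(frame):
    """Return [url, html, children] for `frame`, recursing into child frames.

    Assumption: `frame` behaves like the frames used above, i.e. it has
    url().toString(), toHtml() and childFrames().
    """
    text = type(u"")  # unicode on Python 2, str on Python 3
    return [
        text(frame.url().toString()),
        text(frame.toHtml()),
        # Each child becomes its own [url, html, children] entry.
        [collect_frames(child) for child in frame.childFrames()],
    ]
```

In Load() this could be used as fT = [collect_frames(f) for f in frame.childFrames()]. It only tidies up the traversal; it does not change which frames childFrames() reports, so the original problem is still there.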