
QtWebKit: Can’t load full html code from page



soloma_lviv
15th April 2012, 12:48
If we look at the plain source code of a website, we see that all or most of the ads (Flash, Google, others) are inserted by JavaScript code. But if you look at the same page in, for example, Firebug in Firefox, you will see that the JavaScript has been replaced with the HTML code of the ad.
I want to load and parse this “full” HTML, and I believed that Qt WebKit could do this.

I tried to do it this way:


#include <QFile>
#include <QTextStream>
#include <QWebFrame>
#include <QWebPage>
#include <QWebSettings>

PageLoader::PageLoader(const QUrl &url)
{
    mWebPage = new QWebPage();

    // Run scripts, but skip plugins, images and popup windows.
    mWebPage->settings()->setAttribute(QWebSettings::JavascriptEnabled, true);
    mWebPage->settings()->setAttribute(QWebSettings::PluginsEnabled, false);
    mWebPage->settings()->setAttribute(QWebSettings::AutoLoadImages, false);
    mWebPage->settings()->setAttribute(QWebSettings::JavascriptCanOpenWindows, false);

    // Dump the page as soon as the main frame reports that loading has finished.
    connect(mWebPage->mainFrame(), SIGNAL(loadFinished(bool)), this, SLOT(processPage()));
    mWebPage->currentFrame()->load(url);
}

void PageLoader::processPage()
{
    // Serialize the frame's current content to HTML.
    QWebFrame* frame = mWebPage->currentFrame();
    QString webHtml = frame->toHtml();

    // Write the HTML to disk; skip the write if the file cannot be opened.
    QFile file("/home/ostap/output.txt");
    if (file.open(QIODevice::WriteOnly | QIODevice::Text)) {
        QTextStream out(&file);
        out << webHtml;
    }
    emit finished();
}


But the output file contains only the plain HTML, with script tags that link to the *.js files.

What am I doing wrong?