
Inspecting HTML elements in QWebEngine vs Chrome, Firefox, etc



Thomas1337
26th January 2017, 16:10
Hello, when accessing the HTML of an element on a webpage, QWebEngine gives different results from the "Inspect Element" dev tool in Chrome or Firefox. The Qt functions http://doc.qt.io/qt-5/qwebenginepage.html#runJavaScript-2 and http://doc.qt.io/qt-5/qwebenginepage.html#toHtml both seem to return the HTML before JavaScript has run on the page, whereas Chrome & Firefox return it after.

Example:

Down the side of a YouTube page, e.g. https://www.youtube.com/watch?v=1nydxbGhgv8, is a list of similar videos with thumbnails. The brains at Google store the image URL for these thumbnails in an HTML5 attribute, data-thumb, and swap this URL into the <img src= using JavaScript in the user's browser.

When you inspect these thumbnails in Chrome & Firefox, you see a nice clean <img> tag, pointing to the correct source. You get the same thing when accessing the HTML of these thumbnails with JavaScript: var elems = $( "span.yt-uix-simple-thumb-wrap.yt-uix-simple-thumb-related:first" ); elems.html();.


In Qt with a QWebEngineView, when you do page()->toHtml(), you get the <img> tag before the source has been swapped using JavaScript. Similarly, when you load the jQuery library and then run:


QString code = "var elems = qt.jQuery( 'span.yt-uix-simple-thumb-wrap.yt-uix-simple-thumb-related:first' ); elems.html();";
webView->page()->runJavaScript(code, [&](const QVariant &v){ showResults(v.toString()); });

... the variable v also contains the <img> tag before the source has been set.
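Since the pre-swap markup still carries the thumbnail URL in data-thumb, one possible stopgap is to read that attribute straight out of the HTML string that toHtml() hands back. A minimal sketch in plain C++, assuming the attribute is written with double quotes; the function name extractDataThumbs is made up for illustration, and a regex is not a real HTML parser:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects the value of every data-thumb="..." attribute
// found in an HTML string, in document order. Only double-quoted attribute
// values are handled; anything fancier needs a proper HTML parser.
std::vector<std::string> extractDataThumbs(const std::string &html)
{
    static const std::regex attr(R"rx(data-thumb\s*=\s*"([^"]*)")rx");
    std::vector<std::string> urls;
    for (std::sregex_iterator it(html.begin(), html.end(), attr), end; it != end; ++it)
        urls.push_back((*it)[1].str());
    return urls;
}
```

In the toHtml() callback this could be fed with html.toStdString(). It only recovers the URLs, it does not make the page itself render the images.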

The question

How does one view the final source code of such webpages, after all the JavaScript has been run, in the same way as the Chrome/Firefox dev tools?

anda_skoa
27th January 2017, 09:51
Maybe you are evaluating the toHtml() or runJavaScript() code earlier than the point at which you manually invoke the tool in the browsers.

I.e. it could be more an issue of timing rather than web engine behavior.

Cheers,
_

Thomas1337
27th January 2017, 13:37
Hello, thank you for that suggestion, I've done some experiments and it turns out that you are correct.

I am obviously waiting for the QWebEngineView's loadFinished signal before running any JavaScript. But my Qt browser widget is very small, so when the page loads these elements aren't visible to the user, because there isn't enough space to see them. When you scroll down and look at them, their <img src attribute gets set, and then inspecting them with JavaScript gives the same answer as in Chrome & Firefox.

So now the question is: does one of the Qt classes have a method to force all elements on the page to be rendered, as if the user has scrolled down the page to look at them?

I will have a look in the docs but if anyone knows the answer it would be much appreciated.

anda_skoa
28th January 2017, 08:09
Maybe try "printing" it, e.g. into a temporary PDF file.

Cheers,
_

Thomas1337
31st January 2017, 15:38
Here is what I've learned:
- Off-screen rendering is supported by the underlying Chromium engine, but this has not been explicitly exposed in Qt
- A number of people have requested that this functionality be added to QtWebEngine, and coincidentally it looks like work on it started about two weeks ago for Qt 5.9: https://bugreports.qt.io/browse/QTBUG-44986
- Various hacky solutions exist, such as scrolling down the page, which forces it to be rendered, and resizing the QWebEngine widget so that it shows the whole page. I had no luck in getting scrolling to trigger the rendering, and wasn't able to test PDF printing because it would require an update of my Qt version. So I've ended up resizing the widget until some future release comes along which solves this officially:



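// Ask the page for its full size, then grow the view to match so every element is shown.
webView->page()->runJavaScript("document.documentElement.scrollWidth + '|' + document.documentElement.scrollHeight;", [=](const QVariant &result){
    const std::string widthAndHeight = result.toString().toStdString();
    const std::size_t idx = widthAndHeight.find("|");
    if (idx == std::string::npos)
        return; // no "width|height" result came back
    const int newWidth = std::stoi(widthAndHeight.substr(0, idx));
    const int newHeight = std::stoi(widthAndHeight.substr(idx + 1));
    webView->resize(newWidth, newHeight);
});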
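
One caveat with the snippet above: std::stoi throws std::invalid_argument when it is handed something that is not a number, which can happen if the script result comes back empty. A possible way around that is to parse the "width|height" string defensively before resizing. This is only a sketch in plain C++; the helper names parsePositiveInt and parseSize are invented for illustration and are not part of Qt:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: parses a non-empty run of decimal digits into out.
// Rejects empty input, non-digits, zero, and anything longer than 9 digits
// (9 digits always fit in a 32-bit int). out is only written on success.
static bool parsePositiveInt(const std::string &digits, int &out)
{
    if (digits.empty() || digits.size() > 9)
        return false;
    int value = 0;
    for (char c : digits) {
        if (!std::isdigit(static_cast<unsigned char>(c)))
            return false;
        value = value * 10 + (c - '0');
    }
    if (value <= 0)
        return false;
    out = value;
    return true;
}

// Hypothetical helper: splits "width|height" and fills both outputs.
// Returns false, leaving width and height untouched, if the text is malformed.
bool parseSize(const std::string &text, int &width, int &height)
{
    const std::string::size_type sep = text.find('|');
    if (sep == std::string::npos)
        return false;
    int w = 0;
    int h = 0;
    if (!parsePositiveInt(text.substr(0, sep), w) || !parsePositiveInt(text.substr(sep + 1), h))
        return false;
    width = w;
    height = h;
    return true;
}
```

Inside the runJavaScript callback, the resize would then only happen when parseSize(result.toString().toStdString(), newWidth, newHeight) returns true.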