Rendered SourceCode from website



Ini
23rd January 2016, 12:28
Hello,

With QNetworkReply and readAll() I can get the original source code of a URL.

But I need the rendered source code, which you can view with e.g. Firebug. Is there any way to do that with Qt?

Thanks

anda_skoa
23rd January 2016, 13:17
Can you rephrase that?

What is a "rendered source code"?

Source code that is rendered syntax highlighted?
Or do you mean the output of some rendering engine given a certain source code?

Cheers,
_

Ini
23rd January 2016, 13:20
Modern websites are mostly scripted with JavaScript, which adds content dynamically. When you right-click such a website in your browser and choose "view source code", it shows you the original source code. When you view the website in Firebug, you see the "rendered" source code with the added divs and so on.

I want the rendered source code, but I didn't find anything in Qt or C++ to achieve this. Can you help me there?

anda_skoa
23rd January 2016, 13:50
You could try loading the web content into a QWebView and then using QWebFrame::toHtml() to get the string representation of the frame's DOM tree.

Cheers,
_

Ini
23rd January 2016, 14:12
That's what I am trying at the moment, but in my Qt the QWebpage include is not available.

I added QT += webkitwidgets to the .pro file. So I cannot do that. Do you know how to fix this?

'QWebpage' was not declared in this scope

Generally that sounds good, but I hope you can help me make QWebpage work. I really don't know what to do now. I did what I need to do to use it and I get an error...

My version is 3.6.0

Added after 8 minutes:

I did it now without the QWebpage variable, but the source string contains nothing except the basic HTML tree. The website is shown in the widget. What's the problem there? I have never used this before.


Debug output:

qt.network.ssl: QSslSocket: cannot resolve TLSv1_1_client_method
qt.network.ssl: QSslSocket: cannot resolve TLSv1_2_client_method
qt.network.ssl: QSslSocket: cannot resolve TLSv1_1_server_method
qt.network.ssl: QSslSocket: cannot resolve TLSv1_2_server_method
qt.network.ssl: QSslSocket: cannot resolve SSL_select_next_proto
qt.network.ssl: QSslSocket: cannot resolve SSL_CTX_set_next_proto_select_cb
qt.network.ssl: QSslSocket: cannot resolve SSL_get0_next_proto_negotiated
"<html><head></head><body></body></html>"
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
DirectShowPlayerService::doRender: Unresolved error code 80040218
DirectShowPlayerService::doRender: Unresolved error code 80040218
DirectShowPlayerService::doRender: Unresolved error code 80040218
DirectShowPlayerService::doRender: Unresolved error code 80040218
DirectShowPlayerService::doRender: Unresolved error code 80040218
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
DirectShowPlayerService::doRender: Unresolved error code 80040218
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
ERROR: Font maps SPACE and ZERO WIDTH SPACE to the same glyph. Glyph width will not be overridden.
platform\graphics\SimpleFontData.cpp(146) : void WebCore::SimpleFontData::platformGlyphInit()
ERROR: Font maps SPACE and ZERO WIDTH SPACE to the same glyph. Glyph width will not be overridden.
platform\graphics\SimpleFontData.cpp(146) : void WebCore::SimpleFontData::platformGlyphInit()
LEAK: 6 JSLazyEventListener
LEAK: 599 RenderObject
LEAK: 2 Page
LEAK: 7 Frame
LEAK: 91 CachedResource
LEAK: 1709 WebCoreNode





sourcecode:

#include "mainwindow.h"
#include "ui_mainwindow.h"
#include <QtWebKitWidgets>
#include <QWebPage>
#include <QWebFrame>
#include <QDebug>

MainWindow::MainWindow(QWidget *parent) :
    QMainWindow(parent),
    ui(new Ui::MainWindow)
{
    ui->setupUi(this);

    QWebView *view = new QWebView(parent);
    view->load(QUrl("http://www.qt.io/"));
    view->show();
    //QWebpage *page = view->page();
    QWebFrame *frame = view->page()->mainFrame();
    QString string = frame->toHtml();
    qDebug() << string;

}

MainWindow::~MainWindow()
{
    delete ui;
}

anda_skoa
23rd January 2016, 14:47
You are trying to access the content before it is loaded.

See QWebView::loadFinished().
And if there are scripts running after load you might need to delay even further until they are done.

Cheers,
_

Ini
23rd January 2016, 15:31
And how do I delay even further?

anda_skoa
23rd January 2016, 16:01
For example using a timer.
Or maybe the web content has some way of knowing when it is done.

Cheers,
_
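
The timer idea can be combined with a simple heuristic: poll the frame's markup at intervals and treat the page as settled once the markup stops changing. This is only a sketch and is not something suggested in the thread or prescribed by the Qt documentation; the class name StableContentDetector and its requiredRepeats parameter are hypothetical. It is written in plain standard C++ so it can be tried without Qt. In a Qt application, a slot driven by a QTimer would presumably pass frame->toHtml().toStdString() to feed().

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: reports when polled content has stopped changing.
class StableContentDetector
{
public:
    // requiredRepeats: how many identical snapshots in a row count as "settled".
    explicit StableContentDetector(std::size_t requiredRepeats)
        : required_(requiredRepeats) {}

    // Feed one snapshot of the markup. Returns true once the same text
    // has been seen requiredRepeats times in a row.
    bool feed(const std::string &snapshot)
    {
        if (hasLast_ && snapshot == last_) {
            ++repeats_;
        } else {
            // Content changed (or first snapshot): restart the count.
            last_ = snapshot;
            hasLast_ = true;
            repeats_ = 1;
        }
        return repeats_ >= required_;
    }

private:
    std::size_t required_;
    std::size_t repeats_ = 0;
    bool hasLast_ = false;
    std::string last_;
};
```

A page that keeps mutating its DOM (tickers, rotating ads) never settles under this check, so a real application would also want an upper time limit.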

Ini
24th January 2016, 14:53
I don't know, it does not work even with the connect. Can you help me?
qDebug() << string; in the code below outputs nothing at all, not even "". How can that be?

mainwindow.cpp


#include "mainwindow.h"
#include "ui_mainwindow.h"
#include <QWebPage>
#include <QWebFrame>
#include <QDebug>

MainWindow::MainWindow(QWidget *parent) :
    QMainWindow(parent),
    ui(new Ui::MainWindow)
{
    ui->setupUi(this);

    view = new QWebView(parent);
    connect(view, SIGNAL(loadFinished(bool)), this, SLOT(contentAvailable()));
    view->load(QUrl("http://www.qt.io/"));
    //view->show();

    //QWebpage *page = view->page();
}

MainWindow::~MainWindow()
{
    delete ui;
}

void MainWindow::contentAvailable()
{
    qDebug() << "avail";
    QWebFrame *frame = view->page()->mainFrame();
    QString string = frame->toHtml();
    qDebug() << string;
}




mainwindow.h


#ifndef MAINWINDOW_H
#define MAINWINDOW_H

#include <QMainWindow>
#include <QtWebKitWidgets>

namespace Ui {
class MainWindow;
}

class MainWindow : public QMainWindow
{
    Q_OBJECT

public:
    explicit MainWindow(QWidget *parent = 0);
    ~MainWindow();

    QWebView *view = NULL;

private:
    Ui::MainWindow *ui;

private slots:
    void contentAvailable();
};

#endif // MAINWINDOW_H






output/debug:

qt.network.ssl: QSslSocket: cannot resolve TLSv1_1_client_method
qt.network.ssl: QSslSocket: cannot resolve TLSv1_2_client_method
qt.network.ssl: QSslSocket: cannot resolve TLSv1_1_server_method
qt.network.ssl: QSslSocket: cannot resolve TLSv1_2_server_method
qt.network.ssl: QSslSocket: cannot resolve SSL_select_next_proto
qt.network.ssl: QSslSocket: cannot resolve SSL_CTX_set_next_proto_select_cb
qt.network.ssl: QSslSocket: cannot resolve SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
DirectShowPlayerService::doRender: Unresolved error code 80040218
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
DirectShowPlayerService::doRender: Unresolved error code 80040218
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
DirectShowPlayerService::doRender: Unresolved error code 80040218
DirectShowPlayerService::doRender: Unresolved error code 80040218
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
DirectShowPlayerService::doRender: Unresolved error code 80040218
avail
DirectShowPlayerService::doRender: Unresolved error code 80040218
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
avail
avail
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
avail
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
qt.network.ssl: QSslSocket: cannot call unresolved function SSL_get0_next_proto_negotiated
avail
ERROR: Font maps SPACE and ZERO WIDTH SPACE to the same glyph. Glyph width will not be overridden.
platform\graphics\SimpleFontData.cpp(146) : void WebCore::SimpleFontData::platformGlyphInit()
ERROR: Font maps SPACE and ZERO WIDTH SPACE to the same glyph. Glyph width will not be overridden.
platform\graphics\SimpleFontData.cpp(146) : void WebCore::SimpleFontData::platformGlyphInit()
LEAK: 6 JSLazyEventListener
LEAK: 504 RenderObject
LEAK: 2 Page
LEAK: 7 Frame
LEAK: 90 CachedResource
LEAK: 1709 WebCoreNode

anda_skoa
24th January 2016, 16:45
The output also doesn't show "avail".

Do you get output from other qDebug() statements?

Cheers,
_

P.S.: use [CODE] tags for posting code

Ini
25th January 2016, 00:52
What do you mean? "avail" is shown in the output.

I will use code tags next time. Thanks for the advice.

anda_skoa
25th January 2016, 09:06
What do you mean? "avail" is shown in the output.

I wrote that it is not shown.
Meaning that either qDebug() output is not shown in your output at all or that the slot was not called.

But I see that I have overlooked it; it is well hidden in all kinds of unrelated stuff.

What if you output some text together with the string, e.g. something like


qDebug() << "html code:" << string;

Btw, "string" is a very bad variable name, the C++ Standard class for strings is called "string".

In any case it looks like you need to do some debugging.

Cheers,
_

Ini
25th January 2016, 14:43
qDebug() << frame->toHtml();

also outputs nothing. I don't know what to do. It seems like the libraries are not working correctly. Or is something wrong with my code?

anda_skoa
25th January 2016, 15:12
That debug statement is missing the fixed text to see if it really happens.

Also you could get the QString from toHtml() into a variable and then output the string's isNull() and isEmpty() values.

Also, trigger the slot from a button instead of the loading signal to see if you get output when you can visually confirm loading is done.

Cheers,
_

Ini
6th February 2016, 00:45
Tried all that. The same problem is still there.
The problem with toHtml() was that qDebug is buggy there: you can put the toHtml() result into a plain text edit but not into the debug output. It fails only sometimes, not always; it must depend on some characters or so.

Is there no way to get the rendered source code?

Both currentFrame() and mainFrame() return the original source code; you can try it yourself on a website that uses JS. Is there no way?

Added after 1 21 minutes:

OK, to make it 100% clear what I mean:
There is a project called "Tab Browser" among the preinstalled example projects for Qt.
In it there is a button view->pagesource: that's not the code I need.
I need the code from Tools->Enable Web Inspector, then right-click on an element on the site and inspect. That's the code I need.

So there is a way. Can somebody explain to me how it works?

anda_skoa
6th February 2016, 09:12
Hmm.
Have you tried taking the QWebFrame::documentElement() and calling either toInnerXml() or toOuterXml() or traversing the tree?

Cheers,
_

Ini
6th February 2016, 13:30
As I said, the web frame has the wrong code in it. I need the code as it appears in the inspector. How can I get that code?

anda_skoa
6th February 2016, 17:16
Which is why I suggested trying the QWebElement tree.

Cheers,
_

Ini
6th February 2016, 21:15
I tried that before, with exactly the same result as I got now: the original source code.

Do you know how the inspector gets to that code?

Added after 1 43 minutes:

OK, the problem was that currentFrame() and mainFrame() do not return the source code of child iframes. How is anybody supposed to know that? It should be added to the documentation... Is there something to turn that on? An option to load iframes?

I need that because one iframe sends something in onLoad. And if it never loads, it cannot send it.
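
Regarding the markup of child iframes: QWebFrame offers childFrames(), so one possible approach is to walk the frame tree and call toHtml() on each frame in turn. The following is a minimal sketch under assumptions, not a confirmed solution from this thread. collectFrameHtml and FakeFrame are hypothetical names; FakeFrame is only a stand-in so the traversal can be tried without Qt installed. The function is a template that needs nothing more than toHtml() and childFrames(), so it should also accept a QWebFrame together with a QStringList as the sink.

```cpp
#include <string>
#include <vector>

// Depth-first walk: append the markup of a frame and of every nested frame.
// Frame is any type with toHtml() and childFrames(), as QWebFrame has.
// Sink is any container with push_back(), e.g. QStringList or std::vector.
template <typename Frame, typename Sink>
void collectFrameHtml(const Frame *frame, Sink &sink)
{
    if (!frame)
        return;
    sink.push_back(frame->toHtml());
    for (const Frame *child : frame->childFrames())
        collectFrameHtml(child, sink);
}

// Hypothetical stand-in for QWebFrame, used only to try the walk without Qt.
struct FakeFrame
{
    std::string html;
    std::vector<const FakeFrame *> children;

    std::string toHtml() const { return html; }
    std::vector<const FakeFrame *> childFrames() const { return children; }
};
```

With Qt WebKit this would presumably be called from the loadFinished() slot as collectFrameHtml(view->page()->mainFrame(), list), where list is a QStringList. Whether an iframe's content is complete at that moment still depends on when its own scripts finish.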