Get request's source with QNetworkAccessManager



romka
23rd February 2015, 12:53
Hello.

I have an HTML page with the following source:


<html>
<head lang="en">
<script src="http://example1.com/script1.js"></script>
<script src="http://example2.com/script2.js"></script>
</head>
<body>
...
</body>
</html>

Each of the included JavaScript files contains code like this:


(function(){
    var s = document.createElement('script');
    var x = document.getElementsByTagName('script')[0];
    s.type = 'text/javascript';
    s.async = true;
    s.src = ('https:'==document.location.protocol?'https://':'http://')
        + '<exampleA.com>/<scriptA>-1.js';
    x.parentNode.insertBefore( s, x );
})();

This means that each of these JS files loads one more external JS file (here <exampleA.com> stands for example1.com or example2.com, and <scriptA> is replaced in the same way).

As a result, the following scripts are loaded on my page:
* http://example1.com/script1.js,
* http://example1.com/script1-1.js,
* http://example2.com/script2.js,
* http://example2.com/script2-1.js.

Now I need to load my page with QNetworkAccessManager::createRequest(), but I also need to know what requested each resource. In my example, http://example1.com/script1.js and http://example2.com/script2.js were requested by the source page, http://example1.com/script1-1.js was requested by the script on example1.com, and http://example2.com/script2-1.js by the script on example2.com. As a result, I need to build a tree like this:

source

    example1.com
        script1.js
        script1-1.js
        script1-2.js

    example2.com
        script2.js
        script2-1.js
        script2-2.js
        <...>
        script2-N.js

    example3.com
        script3.js
        script3-1.js
        script3-2.js
        <...>
        script3-N.js


Do you have any ideas how to do it?

ChrisW67
23rd February 2015, 20:32
I do not know what QNetworkAccessManager::createRequest() has to do with fetching this. QNetworkAccessManager::get() is the tool for fetching the HTML.

You have four problems:

1. Parsing the returned HTML to extract the target of script links (or inline scripts).
2. A way to tell the scripts you are interested in from other scripts that will be in the document.
3. Fetching and executing the identified JavaScript to generate the output that would be inserted in its place, in order to run it back through step 1.
4. A data structure to hold the results.
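
As a rough illustration of the first and last problems only, here is a minimal sketch in plain standard C++ (no Qt, so it can be tried in isolation). The names ResourceNode, extractScriptSources and buildPageNode are made up for this example. A regular expression is not a real HTML parser and will miss inline scripts and unusual markup, and nothing here fetches or executes JavaScript, so scripts inserted at run time still have to be added to the tree by whatever code discovers them.

```cpp
#include <cassert>
#include <memory>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node: one per resource; children are the resources
// that this resource caused to be requested.
struct ResourceNode {
    std::string url;
    std::vector<std::unique_ptr<ResourceNode>> children;

    explicit ResourceNode(std::string u) : url(std::move(u)) {}

    // Records that this resource requested childUrl and returns the new node.
    ResourceNode* addChild(const std::string& childUrl) {
        assert(!childUrl.empty());
        children.push_back(
            std::unique_ptr<ResourceNode>(new ResourceNode(childUrl)));
        return children.back().get();
    }
};

// Naive extraction of the src attribute of <script ... src="..."> tags.
// Assumption: attribute values are quoted and the markup is well behaved.
std::vector<std::string> extractScriptSources(const std::string& html) {
    static const std::regex scriptSrc(
        R"re(<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["'])re",
        std::regex::icase);
    std::vector<std::string> urls;
    for (std::sregex_iterator it(html.begin(), html.end(), scriptSrc), end;
         it != end; ++it) {
        urls.push_back((*it)[1].str());
    }
    return urls;
}

// Builds the top level of the tree: the page and the scripts it links directly.
std::unique_ptr<ResourceNode> buildPageNode(const std::string& pageUrl,
                                            const std::string& html) {
    std::unique_ptr<ResourceNode> root(new ResourceNode(pageUrl));
    for (const std::string& src : extractScriptSources(html)) {
        root->addChild(src);
    }
    return root;
}
```

With the page from the first post, buildPageNode() would give a root with two children (script1.js and script2.js). A script that is later found to have been loaded by script1.js can be attached with addChild() on that child's node, which keeps the "who requested what" relationship in the tree itself.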