How do I work with HTML that is not valid XML?

19th February 2014, 02:18
Hi all,

Sorry if this has already been answered; I wasn't able to find a good solution after a few days of research.

How do I find specific elements in HTML loaded as a QString, given that the HTML is not valid XML?
For example, some of its tags do not have corresponding closing tags:

<style>
pre.debug {
    white-space: pre-wrap;
    width: 90%;
    overflow: hidden;
}
</style>
<link href="//fonts.googleapis.com/css?family=Open+Sans:300,400&lang=en" rel="stylesheet" type="text/css">
<style>
.banner {
    text-align: center;
}
</style>

In the example above, QDomDocument::elementsByTagName() fails to return the <style> element that follows the <link> element.
I assume this is because <link> is never closed, so the document is not well-formed XML.

How can I handle this with the least effort?

Thanks a lot in advance!

19th February 2014, 09:17
You could try loading it into a QWebPage (QtWebKit), which parses HTML leniently, and using its API to access the internal DOM structure.