
Thread: Is there a clear way to parse HTML in Qt 5.7

  1. #1
    Join Date
    Dec 2016
    Posts
    3
    Qt products
    Qt5
    Platforms
    Unix/X11 Windows

    Is there a clear way to parse HTML in Qt 5.7

    Hi,

    I would like a powerful HTML parser that works with Qt C++ (I'm currently on Qt 5.7). I'm really tired of reading a lot of articles without finding a clear, recent parser.
    I found libxml2 v2.9.4, but clear examples are rare. I also read about QtWebKit, but as I understand it, it is not supported in Qt 5.7.

    I'm an amateur programmer coming from VB.NET, where I can use the good "HTML Agility Pack".

    What I want is a parser that:
    • works on Windows and Linux.
    • supports at least HTML4 (HTML5 would be perfect).
    • doesn't need a control or a viewer to work.
    • has simple tutorials or examples.


    I also found QXmlQuery, and I would like to know whether it is a good HTML parser.

    Really, I'm tired of searching.
    Thank you.

  2. #2
    Join Date
    Jan 2006
    Location
    Graz, Austria
    Posts
    8,419
    Thanks
    37
    Thanked 1,546 Times in 1,496 Posts
    Qt products
    Qt3 Qt4 Qt5
    Platforms
    Unix/X11 Windows

    Re: Is there a clear way to parse HTML in Qt 5.7

    XML parsers are usually not viable because most Web content is unfortunately malformed and not valid XML.
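
    To illustrate the point about malformed markup, here is a minimal sketch using Python's standard-library XML parser (not Qt, and not any parser named in this thread). The helper name `is_well_formed_xml` and both snippets are made up for the illustration. It only shows that a strict XML parser rejects markup as ordinary as an unclosed `<br>`.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for unclosed tags, mismatched tags, undefined entities, etc.
        return False


# Hypothetical snippets, invented for this illustration.
xhtml_style = "<p>first line<br/>second line</p>"   # self-closed <br/>: valid XML
typical_html = "<p>first line<br>second line</p>"   # bare <br>: fine as HTML, not XML

print(is_well_formed_xml(xhtml_style))   # True
print(is_well_formed_xml(typical_html))  # False
```

    A bare `<br>` is legal HTML, so even well-written pages can fail this check, before any genuinely broken content is considered.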

    Basically, the only way to parse real-life web content is with a browser engine, because browser engines have all sorts of workarounds for broken content.

    In the case of Qt, that would currently be QtWebEngine.

    User ayanda83 does quite a lot with it; check out his threads: http://www.qtcentre.org/search.php?searchid=6994657

    Cheers,
    _

  3. #3
    Join Date
    Dec 2016
    Posts
    3
    Qt products
    Qt5
    Platforms
    Unix/X11 Windows

    Re: Is there a clear way to parse HTML in Qt 5.7

    Thank you Mr anda_skoa.

    Almost all the time, I work with local HTML documents, from which I retrieve or remove tags or other content.
    So rendering the HTML documents before processing them is not something I need.

  4. #4
    Join Date
    Mar 2009
    Location
    Brisbane, Australia
    Posts
    7,677
    Thanks
    13
    Thanked 1,596 Times in 1,524 Posts
    Qt products
    Qt4 Qt5
    Platforms
    Unix/X11 Windows
    Wiki edits
    17

    Re: Is there a clear way to parse HTML in Qt 5.7

    Have you considered a scripting language such as Python with Beautiful Soup? It might be a better choice for mass file manipulation.
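
    As a rough idea of what the scripting route looks like, here is a minimal sketch that pulls links and text out of sloppy HTML without rendering anything. Note the swap: it uses Python's built-in `html.parser` rather than Beautiful Soup, so that it runs without installing anything. The class name `LinkAndTextCollector`, the helper `collect`, and the sample markup are all invented for this example.

```python
from html.parser import HTMLParser


class LinkAndTextCollector(HTMLParser):
    """Collects href values of <a> tags and all non-blank text chunks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def collect(html_text):
    """Hypothetical helper: return (links, text_parts) for an HTML string."""
    parser = LinkAndTextCollector()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered at the end of input
    return parser.links, parser.text_parts


# Deliberately sloppy sample: unclosed <p> and <a>, unquoted attribute value.
sample = "<p>Unclosed paragraph<br><a href=page.html>a link<p>Second"
links, text = collect(sample)
print(links)
print(text)
```

    This parser is event-based and does not build a tree, so removing tags and writing the document back out takes more work than it would with Beautiful Soup. For processing many local files, the same `collect` call can be applied to each file's contents in a loop.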
