Thread: Need a QT class to handle HTML documents...

  1. #1
    Join Date
    Jul 2008
    Posts
    12
    Qt products
    Qt4
    Platforms
    Windows

    Default Need a QT class to handle HTML documents...

    I thought QDomDocument could do the job, but it just can't handle HTML... at least I cannot make it work.

    Any ideas?

    Thanks

  2. #2
    Join Date
    Apr 2010
    Posts
    769
    Thanks
    1
    Thanked 94 Times in 86 Posts
    Qt products
    Qt3 Qt4
    Platforms
    Unix/X11

    Default Re: Need a QT class to handle HTML documents...

    What is the job? QDomDocument is built to work with XML. HTML, in general, is not well-formed XML, and XML parsers will generally choke on it.
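
To illustrate why an XML parser rejects typical HTML, here is a rough sketch in standard C++ (not Qt) that checks only whether tags are closed in order. The function name `tagsAreBalanced` is made up for this example, and the check is far weaker than what a real XML parser enforces.

```cpp
#include <regex>
#include <stack>
#include <string>

// Rough well-formedness check: every opening tag must be closed in order,
// unless it is written in self-closing form such as <br/>.
// This is NOT a real XML parser: comments, CDATA sections, and '>' inside
// attribute values are not handled.
bool tagsAreBalanced(const std::string& markup) {
    // Group 1: "/" for a closing tag, group 2: tag name, group 3: the rest.
    static const std::regex tagPattern(R"(<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>)");
    std::stack<std::string> open;
    for (std::sregex_iterator it(markup.begin(), markup.end(), tagPattern), end;
         it != end; ++it) {
        const std::string name = (*it)[2].str();
        const std::string rest = (*it)[3].str();
        const bool isClosing = (*it)[1].length() > 0;
        const bool isSelfClosing = !rest.empty() && rest.back() == '/';
        if (isClosing) {
            // A closing tag must match the most recently opened tag.
            if (open.empty() || open.top() != name) return false;
            open.pop();
        } else if (!isSelfClosing) {
            open.push(name);
        }
    }
    return open.empty();
}
```

With this sketch, `tagsAreBalanced("<p>a<br>b</p>")` fails because `<br>` is never closed, which is perfectly legal HTML but not well-formed XML, while the XHTML-style `<p>a<br/>b</p>` passes.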

  3. #3
    Join Date
    Jul 2008
    Posts
    12
    Qt products
    Qt4
    Platforms
    Windows

    Default Re: Need a QT class to handle HTML documents...

    Our goal is to load a web page containing a table whose data we must import into a MySQL database.

    The web page and the table will never change in their structure.

  4. #4
    Join Date
    Oct 2006
    Location
    New Delhi, India
    Posts
    2,467
    Thanks
    8
    Thanked 334 Times in 317 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Default Re: Need a QT class to handle HTML documents...

    Will QTextEdit::setHtml or QWebView::setHtml be of some use to you?

  5. #5
    Join Date
    Jul 2008
    Posts
    12
    Qt products
    Qt4
    Platforms
    Windows

    Default Re: Need a QT class to handle HTML documents...

    I should have pointed out that this extraction process has to be automated... The application will be a daemon that loads the web page every morning at 1am.

    I am currently looking at QWebPage but am having some difficulty using the QtWebKit module... it looks like I always have to go through QWebPage, then QWebFrame, then QWebElement...

    Not sure I am on the right track.

  6. #6
    Join Date
    Sep 2008
    Location
    Munich
    Posts
    32
    Thanked 8 Times in 6 Posts
    Qt products
    Qt3 Qt4 Qt/Embedded
    Platforms
    MacOS X Unix/X11 Windows

    Default Re: Need a QT class to handle HTML documents...

    Maybe you can use XQuery; see http://doc.qt.nokia.com/4.7-snapshot/qxmlquery.html. If your document is not valid XML, it might be even better to parse it with a regular expression and extract the required information.

    XQuery adds some complexity to your project, as you need to understand it first ;-) Here is a tutorial: http://www.w3schools.com/xquery/default.asp.

    Good luck!

  7. #7
    Join Date
    Apr 2010
    Posts
    769
    Thanks
    1
    Thanked 94 Times in 86 Posts
    Qt products
    Qt3 Qt4
    Platforms
    Unix/X11

    Default Re: Need a QT class to handle HTML documents...

    If the web page will never change, then just slurp the HTML into a string and parse it yourself, perhaps using regular expressions. Or, if the page section containing whatever you're interested in conforms to XML specifications, extract that and hand it off to the XML parser for final processing.
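
A minimal sketch of the "slurp it into a string and use regular expressions" approach, written in standard C++ with `std::regex` rather than Qt classes. It assumes the page layout is fixed, the table is not nested, and cells contain plain text with no entities to decode. The name `extractTableRows` and the sample markup in the usage note are invented for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Returns the raw text of every <td>/<th> cell, grouped by <tr> row.
// Assumes a flat (non-nested) table with a stable structure.
std::vector<Row> extractTableRows(const std::string& html) {
    const auto flags = std::regex::ECMAScript | std::regex::icase;
    // [\s\S]*? matches lazily across line breaks ('.' stops at newlines).
    static const std::regex rowPattern(R"(<tr\b[^>]*>([\s\S]*?)</tr>)", flags);
    static const std::regex cellPattern(R"(<t[dh]\b[^>]*>([\s\S]*?)</t[dh]>)", flags);

    std::vector<Row> rows;
    const std::sregex_iterator end;
    for (std::sregex_iterator rowIt(html.begin(), html.end(), rowPattern);
         rowIt != end; ++rowIt) {
        // Copy the row body so the inner iterator has stable storage.
        const std::string rowBody = (*rowIt)[1].str();
        Row cells;
        for (std::sregex_iterator cellIt(rowBody.begin(), rowBody.end(), cellPattern);
             cellIt != end; ++cellIt) {
            cells.push_back((*cellIt)[1].str());
        }
        rows.push_back(cells);
    }
    return rows;
}
```

For example, calling it on `"<tr><th>Name</th><th>Qty</th></tr><tr><td>apple</td><td>3</td></tr>"` yields two rows of two cells each, which could then be turned into database inserts. If the page structure ever does change, this kind of matching breaks silently, so a sanity check on the number of rows and columns is probably worth adding.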

  8. #8
    Join Date
    Jul 2008
    Posts
    12
    Qt products
    Qt4
    Platforms
    Windows

    Default Re: Need a QT class to handle HTML documents...

    One thing is for sure: we cannot use QRegExp, as it is not compliant with standard regular expressions...
    Again, assuming we want to extract "this is a test" from "<TH>this is a test</TD>": QRegExp handles lookahead but not lookbehind, so it is possible to say "return the string that is immediately followed by </TD>" but not "return the string that immediately follows <TH>".
    So at best I could extract the following string:
    "<TH>this is a test"

    Does anyone know how to do this?
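
One way around the missing lookbehind is to match the surrounding tags as well and read only a capture group, so the tags are part of the match but not part of the result. The sketch below uses standard C++ `std::regex` rather than QRegExp, and the function name `extractBetweenTags` is made up for this example. QRegExp also has capturing parentheses (`cap(1)` after `indexIn()`) and `setMinimal(true)` in place of the `*?` quantifier, so the same idea should carry over, though that variant is not shown or tested here.

```cpp
#include <regex>
#include <string>

// Returns the text between "<TH>" and the first following "</TD>",
// or an empty string if the pattern does not occur.
// The tag pair mirrors the example in the post above.
std::string extractBetweenTags(const std::string& input) {
    // The parentheses capture only the inner text; *? keeps the match
    // as short as possible so it stops at the first closing tag.
    static const std::regex pattern(R"(<TH>(.*?)</TD>)");
    std::smatch match;
    if (std::regex_search(input, match, pattern)) {
        return match[1].str();  // group 1 excludes the tags themselves
    }
    return std::string();
}
```

Called with `"<TH>this is a test</TD>"`, this returns `this is a test` without the leading `<TH>`.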
