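The usual approach is to use the QWebElement API to parse the page content: get the root element with QWebFrame::documentElement(), then query it with findAll() or findFirst() using CSS selectors.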