Thread: convert ampersand encoded HTML into something readable

  1. #1
    Join Date
    Sep 2010
    Posts
    4
    Thanks
    2
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    I'm extracting strings from a web page, but they are full of national characters encoded as HTML numeric character references (ampersand, hash, decimal character code, semicolon), looking something like this:

    "sekret&#0230;r i k&#0248;"

    This is awful, and I can't find any information on how to decode these special characters, to the point where I'm close to writing a parser just for the purpose. The web page is encoded in UTF-8, and the browser has no problem displaying it, but in Qt all I get is mangled text strings...

    Does anyone know how to decode HTML-encoded characters correctly?
    Last edited by Lykurg; 17th October 2010 at 11:29.

  2. #2
    Join Date
    Jan 2006
    Location
    Germany
    Posts
    4,380
    Thanks
    19
    Thanked 1,005 Times in 913 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows Symbian S60
    Wiki edits
    5

    Do not double post! I'll close this one.

    EDIT: Oh, come on, decide where you want to post before you post! And once you have posted, don't change the first post and make it unreadable. Ask a moderator to move your post if it is really needed.

    For educational purposes, I'll leave this one closed. Edit your first one and you will get an answer.
    Last edited by Lykurg; 17th October 2010 at 00:23.

  3. #3
    Join Date
    Mar 2009
    Location
    Brisbane, Australia
    Posts
    7,729
    Thanks
    13
    Thanked 1,610 Times in 1,537 Posts
    Qt products
    Qt4 Qt5
    Platforms
    Unix/X11 Windows
    Wiki edits
    17

    0xFFFF ? A Unicode non-character perhaps? What was the question?

  4. #4

    OK, a little mess here, but now both threads are merged and open again. ChrisW67's answer was referring to the now deleted post, which didn't contain a question...

  5. #5

    OK, and now to prove that we are gentle here:

    Have a look at QTextDocument: set the HTML with setHtml() and get the plain text back with toPlainText(). A more lightweight solution would be to search for such notations and replace them by hand.
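
    A rough sketch of the "replace them by hand" route, written in plain C++ without Qt so that it stands on its own. The helper names (decodeNumericRefs, appendUtf8, digitValue) are made up for illustration and are not part of any library. It assumes UTF-8 output is wanted, handles only numeric references (decimal such as &#230; and hexadecimal such as &#xE6;), and leaves named entities like &amp; and anything malformed untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to `out`, encoded as UTF-8.
static void appendUtf8(std::string &out, std::uint32_t cp) {
    assert(cp <= 0x10FFFF);  // callers must pass a valid code point
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Value of digit `c` in the given base (10 or 16), or -1 if it is not a digit.
static int digitValue(char c, int base) {
    if (c >= '0' && c <= '9') return c - '0';
    if (base == 16) {
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    }
    return -1;
}

// Hypothetical helper: replace "&#NNN;" and "&#xHH;" with the UTF-8 bytes
// of the referenced character. Everything else is copied through unchanged.
std::string decodeNumericRefs(const std::string &in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == '&' && i + 2 < in.size() && in[i + 1] == '#') {
            std::size_t j = i + 2;
            int base = 10;
            if (in[j] == 'x' || in[j] == 'X') {
                base = 16;
                ++j;
            }
            std::uint32_t cp = 0;
            std::size_t digits = 0;
            // At most 8 digits, so `cp` cannot overflow 32 bits.
            while (j < in.size() && digits < 8) {
                int d = digitValue(in[j], base);
                if (d < 0) break;
                cp = cp * static_cast<std::uint32_t>(base) + static_cast<std::uint32_t>(d);
                ++j;
                ++digits;
            }
            bool terminated = digits > 0 && j < in.size() && in[j] == ';';
            bool validCp = cp != 0 && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
            if (terminated && validCp) {
                appendUtf8(out, cp);
                i = j + 1;  // skip past the ';'
                continue;
            }
        }
        out += in[i++];  // not a well-formed reference: copy as-is
    }
    return out;
}
```

    Calling decodeNumericRefs("sekret&#230;r i k&#248;") gives "sekretær i kø" as UTF-8 bytes. In a Qt program the result could then be handed to QString::fromUtf8(), but if Qt is available anyway, the QTextDocument route above is probably less work and also covers named entities.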

  6. The following user says thank you to Lykurg for this useful post:

    tetsuoii (24th October 2010)

  7. #6

    Sorry 'bout the double posting, I'll try not to heat your helmet next time =)

    Anyway, calling label->setTextFormat(Qt::RichText); on all labels fixed all my problems: both the one described above and the one where I couldn't use Norwegian characters directly and had to substitute them with &#0230; etc. They don't display like "h<?>lvetes j<?>vla kr<?>ket<?>r" anymore!

    It also improved my mood, which was on a downward slope. So to all Scandinavian, French, German, Polish and other special-character users: the setTextFormat(Qt::RichText); function is highly recommended!

    And thanks a lot to you, Helmet-Man, for your valuable advice, which may have saved me days of work!
