Thread: Convert from iso-8859-1 to... Something else :-)

  1. #1

    Convert from iso-8859-1 to... Something else :-)

    Hello,

    I'm having trouble converting an iso-8859-1 QString to something better, like a utf-8 QString.

    A QByteArray (data) contains some mail headers (From, To, Subject, ...) and is downloaded from a server with a QTcpSocket. Here is the code:
    Qt Code:
    // data is a QByteArray that contains the mail headers
    liste_from = QString::fromLatin1(data.constData()).trimmed().split(' ');

    Here I select the mail author, which is the QString I need to work with:
    Qt Code:
    QString mail_autor = liste_from.at(1);
    std::cout << "AUTOR : \t\t" << mail_autor.toStdString() << std::endl;

    Because the mail header is iso-8859-1 text encoded according to RFC 2047, I get something like this:
    Qt Code:
    =?iso-8859-1?Q?R=E9gis?=

    But I need to convert it, perhaps to utf-8, to get something like this:
    Qt Code:
    Régis

    So I tried to use some Qt functionality, like QString::toUtf8(), but nothing happens...
    Qt Code:
    QString mail_autor = liste_from.at(1);
    QByteArray mail_autor_converted = mail_autor.toUtf8();
    std::cout << "AUTOR : \t\t" << mail_autor_converted.constData() << std::endl;
    // Displays: =?iso-8859-1?Q?R=E9gis?=

    I don't understand why the conversion isn't done.
    Could you help me, please?

  2. #2

    Re: Convert from iso-8859-1 to... Something else :-)

    The problem is that your string is not encoded in latin1 (aka iso-8859-1). It is the RFC 2047 "encoded-word" representation of a latin1 string.

    You will have to write a decoder that recreates a valid latin1 character array from this representation, which you can then feed to QString::fromLatin1.

    For inspiration on how to do that, here is the result of a search via Google Code Search: ;-)
    rfc2047.c from mutt seems to contain a decoding function.

    (If you let yourself be inspired by other code: mind the license... mmmmKay?)
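
    As a rough illustration of the decoder described above, here is a minimal sketch in plain standard C++ (no Qt, so it can be compiled on its own). All function names are hypothetical. It assumes a single encoded-word, the "Q" encoding, and the exact charset spelling "iso-8859-1"; real headers can mix several encoded-words, use "B" (base64) or spell the charset differently, and none of that is handled here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Decodes the text part of a "Q" encoded-word into raw bytes:
// '_' becomes a space and "=XX" becomes the byte with hex value XX.
std::string decodeQText(const std::string &text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '_') {
            out += ' ';
        } else if (c == '=' && i + 2 < text.size()
                   && hexValue(text[i + 1]) >= 0 && hexValue(text[i + 2]) >= 0) {
            out += static_cast<char>(hexValue(text[i + 1]) * 16 + hexValue(text[i + 2]));
            i += 2; // skip the two hex digits that were just consumed
        } else {
            out += c; // plain character, or a '=' not followed by two hex digits
        }
    }
    return out;
}

// Converts ISO 8859-1 bytes to UTF-8. Every Latin-1 byte value is also its
// Unicode code point, so bytes >= 0x80 become a two-byte UTF-8 sequence.
std::string latin1ToUtf8(const std::string &latin1)
{
    std::string out;
    for (unsigned char byte : latin1) {
        if (byte < 0x80) {
            out += static_cast<char>(byte);
        } else {
            out += static_cast<char>(0xC0 | (byte >> 6));
            out += static_cast<char>(0x80 | (byte & 0x3F));
        }
    }
    return out;
}

// Decodes one "=?iso-8859-1?Q?text?=" word to UTF-8.
// Assumption: anything that does not match this exact shape is returned unchanged.
std::string decodeEncodedWord(const std::string &word)
{
    const std::string prefix = "=?iso-8859-1?Q?";
    const std::string suffix = "?=";
    if (word.size() < prefix.size() + suffix.size()) return word;
    if (word.compare(0, prefix.size(), prefix) != 0) return word;
    if (word.compare(word.size() - suffix.size(), suffix.size(), suffix) != 0) return word;
    const std::string text =
        word.substr(prefix.size(), word.size() - prefix.size() - suffix.size());
    return latin1ToUtf8(decodeQText(text));
}
```

    In a Qt program the bytes returned by decodeQText could be handed to QString::fromLatin1 instead of going through latin1ToUtf8, since QString stores Unicode internally.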

  3. The following user says thank you to camel for this useful post:

    Nyphel (6th March 2007)

  4. #3

    Re: Convert from iso-8859-1 to... Something else :-)

    Thanks, I understand the problem a little better now.
    I'll look into it later, I can't right now, but thanks a lot for the tip.

  5. #4

    Re: Convert from iso-8859-1 to... Something else :-)

    Decoding RFC 2047 properly is too much work for me.
    There are multiple cases, many rules to match, etc.

    I made a simple function that replaces each escaped character with its ISO 8859-1 value.
    Here is the function, in case someone needs it one day.

    Qt Code:
    void MailChecker::decodage(QString & chaine)
    {
        // SEE: http://fr.wikipedia.org/wiki/ISO_8859-1

        // Keep only the encoded text of "=?charset?Q?text?="
        chaine = chaine.section('?', 3, 3);

        // Decode in a single pass, so that an already decoded '=' (from "=3D")
        // is never mistaken for the start of another escape
        QString resultat;
        for (int i = 0; i < chaine.size(); ++i) {
            const QChar c = chaine.at(i);
            if (c == QLatin1Char('_')) {
                // '_' stands for a space
                resultat += QLatin1Char(' ');
            } else if (c == QLatin1Char('=') && i + 2 < chaine.size()) {
                bool ok = false;
                const uint code = chaine.mid(i + 1, 2).toUInt(&ok, 16);
                if (ok) {
                    // ISO 8859-1 values are identical to the first 256 Unicode code points
                    resultat += QChar(code);
                    i += 2;
                } else {
                    resultat += c;
                }
            } else {
                resultat += c;
            }
        }
        chaine = resultat;
    }

  6. #5

    Re: Convert from iso-8859-1 to... Something else :-)

    Originally posted by Nyphel:
        Decoding RFC 2047 properly is too much work for me.
        There are multiple cases, many rules to match, etc.

        I made a simple function that replaces each escaped character with its ISO 8859-1 value.
        Here is the function, in case someone needs it one day.

    I suppose that if you use a regex to replace "=" with "%", you get the same thing as urldecode... or am I mistaken?

    Here is a small piece of code that reads a cookie from a server.
    After writing it I saw that QUrl can decode too, but for now I still read the cookie with Url_Decode.


    Qt Code:
    /* encode to url strings */
    QString EncodeUrlPart( QString xml )
    {
        // Let QUrl percent-encode the text as the path of a dummy URL
        QUrl urlmod(QString("http://localhost/%1").arg(xml));
        QByteArray capsed(urlmod.toEncoded());
        QString res = QString("%1").arg(capsed.data());
        // Turn encoded spaces into '_' and drop the remaining '%' signs
        res = res.replace("%20","_");
        res = res.replace("%","");
        // Keep only the path, without any '/'
        QUrl urlmod2(res);
        res = urlmod2.path();
        res = res.replace("/","");
        return res;
    }


    /* decode url from cookie or other */
    QString Url_Decode( QString indata )
    {
        /*
        http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
        Dollar ("$") 24
        Ampersand ("&") 26
        Plus ("+") 2B
        Comma (",") 2C
        Forward slash/Virgule ("/") 2F
        Colon (":") 3A
        Semi-colon (";") 3B
        Equals ("=") 3D
        Question mark ("?") 3F
        'At' symbol ("@") 40
        Left Curly Brace ("{") 7B
        Right Curly Brace ("}") 7D
        Vertical Bar/Pipe ("|") 7C
        Backslash ("\") 5C
        Caret ("^") 5E
        Tilde ("~") 7E
        Left Square Bracket ("[") 5B
        Right Square Bracket ("]") 5D
        Grave Accent ("`") 60
        */
        QString blnull = "";
        QString notaccept = "%60|%5D|%5B|%7E|%5E|%5C|%7C|%7D|%7B";
        QStringList notallow = notaccept.split("|");

        // Reject the input if it contains any escape that is not accepted
        for (int i = 0; i < notallow.size(); ++i) {
            if ( indata.contains(notallow.at(i)) ) {
                return blnull;
            }
        }

        // '+' means a space; handle it before "%2B" is decoded to a literal '+'
        QString spaceout = indata.replace("+"," ");
        spaceout = spaceout.replace("%20"," ");
        spaceout = spaceout.replace("%3A",":");
        spaceout = spaceout.replace("%3B",";");
        spaceout = spaceout.replace("%3D","=");
        spaceout = spaceout.replace("%2F","/");
        spaceout = spaceout.replace("%3F","?");
        spaceout = spaceout.replace("%40","@");
        spaceout = spaceout.replace("%24","$");
        spaceout = spaceout.replace("%2B","+");
        // Keep only the part before the first ';' (left(-1) returns the whole string)
        int zool = spaceout.indexOf(";",0);
        return spaceout.left(zool);
    }
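
    The similarity mentioned above between "=XX" in mail headers and "%XX" in URLs can be sketched with one routine that takes the escape character as a parameter. This is only an illustration in plain standard C++ with hypothetical names, not code from this thread. It handles the hex escapes only: the extra rules ('_' as space in mail headers, '+' as space in URLs) are assumed to be applied separately.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int hexDigit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Replaces every "<marker>XX" (XX = two hex digits) with the byte 0xXX.
// Everything else, including a marker without two hex digits, is copied as-is.
std::string decodeHexEscapes(const std::string &in, char marker)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == marker && i + 2 < in.size()
            && hexDigit(in[i + 1]) >= 0 && hexDigit(in[i + 2]) >= 0) {
            out += static_cast<char>(hexDigit(in[i + 1]) * 16 + hexDigit(in[i + 2]));
            i += 2; // skip the two hex digits that were just consumed
        } else {
            out += in[i];
        }
    }
    return out;
}
```

    Called with '=' it decodes the escapes of a mail header word, called with '%' it decodes the escapes of a URL or cookie value. Because the input is scanned once, a decoded '=' or '%' is never decoded a second time.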
