HTML Unicode ampersand-encoding



Neptilo
18th December 2012, 08:58
I would like to convert a QString containing Unicode characters to plain HTML text using numeric character references. For instance, "私" would become "&#31169;".
Does Qt provide such a function? I found QString Qt::escape ( const QString & plain ), but it only converts the HTML metacharacters <, >, &, and ".

Until I find a better way, I tried to write my own encoding function:


QString ampersand_encode(const QString& str){

    QString chr;
    QStringList list = QStringList();

    for (int i = 0; i < str.size(); ++i) {
        chr = QString(str[i]);
        list << "&#x" + QString(chr.toUtf8().toHex()) + ";";
    }

    return list.join("");
}

It almost works: ASCII characters come out fine, but when I try it with other Unicode characters I only get Korean characters. Why? I feel there's not much to change, but I don't know what. Any improvements to my code would also be appreciated. :)

wysota
18th December 2012, 10:19
How about this?


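QString ampersand_encode(const QString &string) {
    QString encoded;
    for(int i=0;i<string.size();++i) {
        QChar ch = string.at(i);
        // Escape everything above Latin-1 as a decimal numeric character reference.
        // Note: QChar is a single UTF-16 code unit, so a character outside the BMP
        // is seen here as two separate surrogates.
        if(ch.unicode() > 255)
            encoded += QString("&#%1;").arg((int)ch.unicode());
        else
            encoded += ch;
    }
    return encoded;
}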
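
For characters outside the Basic Multilingual Plane, a per-code-unit loop would emit each surrogate as its own reference, which HTML does not accept. A minimal sketch of a surrogate-aware variant follows, written against std::u16string rather than QString so that it compiles without Qt. The name ampersandEncode is hypothetical, and the sketch escapes everything above ASCII (not just above 255), which is an assumption about the desired output rather than part of the answer above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Qt: escapes every non-ASCII code point of a
// UTF-16 string as a decimal numeric character reference (&#NNNN;).
// It does not escape <, >, & or "; that remains a separate step.
std::string ampersandEncode(const std::u16string &text) {
    std::string encoded;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char32_t codePoint = text[i];
        // Combine a high surrogate with the low surrogate that follows it.
        // An unpaired surrogate is passed through as its own (invalid) value.
        const bool isHigh = codePoint >= 0xD800 && codePoint <= 0xDBFF;
        if (isHigh && i + 1 < text.size()
            && text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF) {
            codePoint = 0x10000 + ((codePoint - 0xD800) << 10)
                        + (text[i + 1] - 0xDC00);
            ++i; // the low surrogate has been consumed
        }
        if (codePoint < 0x80)
            encoded += static_cast<char>(codePoint);
        else
            encoded += "&#" + std::to_string(static_cast<unsigned long>(codePoint)) + ";";
    }
    return encoded;
}
```

With a QString, the same loop could presumably read the code units through QString::utf16() or QChar::unicode(); the surrogate arithmetic stays the same.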

Neptilo
19th December 2012, 22:01
Thank you! It works!

Actually, I realized my tests were wrong: I had typed the text directly into my source code, so each Unicode character was split into 8-bit characters. Now that I enter the text in a QLineEdit, it works much better.
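
The "Korean characters" can be reproduced without Qt. The sketch below assumes the string literal was decoded one byte at a time (Latin-1 style) and that each resulting character then went through the UTF-8-bytes-to-hex step of the first function. Both helper names, decodeLatin1 and utf8Hex, are hypothetical stand-ins and not Qt APIs.

```cpp
#include <string>

// Hypothetical helper: decodes bytes as Latin-1, one UTF-16 unit per byte.
std::u16string decodeLatin1(const std::string &bytes) {
    std::u16string out;
    for (unsigned char byte : bytes)
        out += static_cast<char16_t>(byte);
    return out;
}

// Hypothetical helper: hex dump of the UTF-8 encoding of one BMP code unit
// (surrogates are not handled), standing in for QString(ch).toUtf8().toHex().
std::string utf8Hex(char16_t unit) {
    std::string bytes;
    if (unit < 0x80) {
        bytes += static_cast<char>(unit);
    } else if (unit < 0x800) {
        bytes += static_cast<char>(0xC0 | (unit >> 6));
        bytes += static_cast<char>(0x80 | (unit & 0x3F));
    } else {
        bytes += static_cast<char>(0xE0 | (unit >> 12));
        bytes += static_cast<char>(0x80 | ((unit >> 6) & 0x3F));
        bytes += static_cast<char>(0x80 | (unit & 0x3F));
    }
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    for (unsigned char byte : bytes) {
        hex += digits[byte >> 4];
        hex += digits[byte & 0x0F];
    }
    return hex;
}
```

Under those assumptions, the UTF-8 bytes of "私" (E7 A7 81) become three separate characters, and utf8Hex turns them into c3a7, c2a7 and c281. Written as &#xc3a7; and so on, these are read as code points U+C3A7, U+C2A7 and U+C281, all inside the Hangul Syllables block (U+AC00 to U+D7A3). Even with correctly decoded input, utf8Hex(0x79C1) gives e7a781, which is the UTF-8 byte sequence rather than the code point 79C1, so a numeric reference needs the code point value itself.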