QT4.6 Converting utf8 to '/uwxyz' and back.
I need to convert UTF-8 (currently read in by a QTextStream readLine()) to ASCII text:
1) if ascii (<=7F), then just use the ascii value so 'a' to 'a'
2) if not ascii then convert utf8 to '/u' followed by the hex value wxyz.
For example, the Euro symbol would be the 6 ASCII characters '/u20AC'
I also have to go the other way where I have the string '/u20AC' and want to output as utf8 for the Euro.
I am having trouble determining whether Qt string functions can help me with this or not.
It looks like if I use toUtf8() I will get a byte array with the Euro as bytes E2 82 AC and could parse it manually, but that is a bunch of work.
Is there a way I can get the unicode hex values from the utf8 QString?
Re: QT4.6 Converting utf8 to '/uwxyz' and back.
I tried it, but I couldn't get it to work either.
Re: QT4.6 Converting utf8 to '/uwxyz' and back.
There is no such beast as a "utf8 QString". QString is a collection of QChar, essentially 16-bit Unicode basic multilingual plane code points that are trivially accessible using QString::at() or other methods. The file or stream you are reading from may be UTF-8 encoded and decoded by QTextStream.
Code:
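#include <QCoreApplication>
#include <QFile>
#include <QTextStream>
#include <QDebug>

int main(int argc, char **argv)
{
    QCoreApplication app(argc, argv);

    // "test.txt" is a placeholder: substitute your own UTF-8 encoded file.
    QFile file("test.txt");
    if (file.open(QIODevice::ReadOnly)) {
        QTextStream in(&file);
        in.setCodec("UTF-8");   // decode the file's bytes as UTF-8
        while (!in.atEnd()) {
            const QString line = in.readLine();
            qDebug() << line;
            QString result;
            for(int i = 0; i < line.size(); ++i) {
                const ushort code = line.at(i).unicode();
                if (code < 0x0080)
                    result += line.at(i);   // plain ASCII: copy unchanged
                else                        // otherwise: \u plus 4 hex digits
                    result += QString("\\u%1").arg(code, 4, 16, QChar('0'));
            }
            qDebug() << result;
        }
    }
    return 0;
}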
Output:
Code:
"test €1234"
"test \u20ac1234"
Re: QT4.6 Converting utf8 to '/uwxyz' and back.
... as long as the QString line contains only Unicode characters below 0xFFFF (in fact, below 0xD800). Unicode in Qt is in fact UTF-16, and characters above 0xFFFF are encoded as two ushorts (a surrogate pair). If characters above 0xFFFF are a possibility, use a small improvement:
Code:
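#include <QVector>
...
qDebug() << line;
QVector<uint> ucs4 = line.toUcs4();   // one entry per code point, surrogate pairs combined
for( int i = 0; i < ucs4.size(); ++i )
{
    const uint code = ucs4.at(i);
    // Build the character from the code point itself: indexing line with i
    // goes wrong once a surrogate pair has appeared earlier in the line.
    if( code < 0x0080 ) result += QChar(code);
    else result += QString("\\u%1").arg(code, 4, 16, QChar('0'));
}
qDebug() << result;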
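To make the "two ushorts" point concrete, here is a minimal Qt-free sketch in standard C++ of how UTF-16 splits a code point above 0xFFFF into a surrogate pair and joins it back. The function names toSurrogatePair and fromSurrogatePair are made up for this illustration and are not Qt API.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Split a code point above 0xFFFF into its UTF-16 surrogate pair
// (high unit first). Hypothetical helper, not part of Qt.
std::pair<std::uint16_t, std::uint16_t> toSurrogatePair(std::uint32_t cp)
{
    assert(cp > 0xFFFF && cp <= 0x10FFFF);
    const std::uint32_t v = cp - 0x10000;  // leaves a 20-bit value
    const std::uint16_t high = static_cast<std::uint16_t>(0xD800 + (v >> 10));   // top 10 bits
    const std::uint16_t low  = static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)); // bottom 10 bits
    return std::make_pair(high, low);
}

// Recombine a surrogate pair into the original code point.
std::uint32_t fromSurrogatePair(std::uint16_t high, std::uint16_t low)
{
    return 0x10000
         + ((static_cast<std::uint32_t>(high) - 0xD800) << 10)
         + (static_cast<std::uint32_t>(low) - 0xDC00);
}
```

For example, toSurrogatePair(0x1F600) gives the pair 0xD83D, 0xDE00. A loop over line.at(i) sees those as two separate values, which is why the loop above works on the toUcs4() result instead.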
Re: QT4.6 Converting utf8 to '/uwxyz' and back.
I have gotten it to work using the ideas here:
Setting the QTextStream to UTF-8 and then working character by character.
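The thread never shows the reverse direction from the original question (turning the escaped text back into UTF-8). Here is a minimal Qt-free sketch in standard C++, under some assumptions: the escape is a backslash, 'u', and exactly four hex digits, as in the replies above. The names appendUtf8 and unescapeToUtf8 are invented for this example. It does not recombine escaped surrogate halves, so it only covers code points up to 0xFFFF. In Qt the same idea could presumably be built from QString::mid(), toUShort(&ok, 16) and QChar, followed by toUtf8() on output.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Append one Unicode code point to out as UTF-8 bytes.
void appendUtf8(std::string &out, std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {                       // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes (the Euro sign lands here)
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Replace every backslash-u-plus-four-hex-digits escape in an ASCII string
// with the UTF-8 encoding of that code point. Everything else, including
// incomplete escapes, is copied through unchanged.
std::string unescapeToUtf8(const std::string &in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool isEscape = i + 6 <= in.size() && in[i] == '\\' && in[i + 1] == 'u';
        for (std::size_t k = 2; isEscape && k < 6; ++k)
            isEscape = std::isxdigit(static_cast<unsigned char>(in[i + k])) != 0;
        if (isEscape) {
            const unsigned long cp = std::stoul(in.substr(i + 2, 4), nullptr, 16);
            appendUtf8(out, static_cast<std::uint32_t>(cp));
            i += 6;
        } else {
            out += in[i++];
        }
    }
    return out;
}
```

Calling unescapeToUtf8 on the escaped line from the earlier output (test, a space, the 20ac escape, then 1234) yields the same text with the escape replaced by the three bytes E2 82 AC, which is the Euro sign in UTF-8.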