Unicode, Linux, bash shell, printf



sandor
18th July 2012, 20:44
This must be really simple... I have a Unicode QString and I would like to print it to the shell (Konsole, bash). The console can display the appropriate characters, and touch and mkdir can create Unicode file names, so printing Unicode characters should not be a problem in itself. What I have:


QString s = "áéőűÁÉŐŰ";


How do I print that string to the console so that those characters appear correctly?

wysota
18th July 2012, 21:40
What encoding does your console use?

ChrisW67
19th July 2012, 01:07
Your test string of eight "characters" is represented as the following sixteen bytes (hex):


á     é     ő     ű     Á     É     Ő     Ű
C3 A1 C3 A9 C5 91 C5 B1 C3 81 C3 89 C5 90 C5 B0

when encoded as UTF-8 (which is most likely what your editor has done). Constructing or assigning a QString from a char* uses fromAscii(), which (unless a codec for C strings has been set) treats each byte as a separate Latin-1 character and so misinterprets these bytes when converting them to QString's internal 16-bit character format. Try this:


QString s = QString::fromUtf8("áéőűÁÉŐŰ");
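
The byte table above can be checked without Qt. The following is a minimal sketch in plain C++, assuming the source file is saved as UTF-8; hexDump and utf8Length are hypothetical helper names, not part of Qt or the standard library.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Render every byte of `bytes` as two uppercase hex digits, space separated.
std::string hexDump(const std::string &bytes) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%02X",
                      static_cast<unsigned char>(bytes[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// Count code points in well-formed UTF-8 by skipping continuation
// bytes, which always have the bit pattern 10xxxxxx.
std::size_t utf8Length(const std::string &bytes) {
    std::size_t n = 0;
    for (unsigned char c : bytes)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}
```

Calling hexDump("áéőűÁÉŐŰ") from a UTF-8 source file should reproduce the sixteen bytes listed above, while utf8Length reports eight characters for the same string.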

sandor
19th July 2012, 19:40
Well, hmm, it does not quite work. Actually, the problem is not how to get the text into the QString; that is the easy part. The problem is how to actually print it.
Technically, qDebug() << str; works fine for Unicode; however, qDebug, qWarning, etc. are redirected in my application because a different message handler is installed.

The question is very simple:


QString s = "őű";
// ????: printf("%s\n", s);
// ????: printf("%s\n", s.toAscii());
// ????: etc.
// What to put there instead of printf?


Console used: Konsole

wysota
19th July 2012, 22:20
Please answer my question: what encoding does your console use? If it's UTF-8, then qDebug() << str.toUtf8() should work. If it's something else, then str.toLocal8Bit() will probably work.
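
One way to answer the encoding question from inside a program on Linux is to ask the C library for the codeset of the current locale. This is a minimal sketch using the POSIX nl_langinfo() call; localeCodeset is a hypothetical helper name. It reports the encoding of the locale, which a terminal emulator normally matches, but the terminal's own setting can differ.

```cpp
#include <clocale>
#include <langinfo.h>
#include <string>

// Return the character encoding name of the current locale, e.g. "UTF-8".
// Side effect: switches the process locale to the one named by the
// environment (LANG, LC_ALL, LC_CTYPE).
std::string localeCodeset() {
    std::setlocale(LC_ALL, "");
    const char *name = nl_langinfo(CODESET);
    return name ? std::string(name) : std::string();
}
```

If this returns "UTF-8", toUtf8() is the conversion to try first; otherwise toLocal8Bit() is the more likely candidate.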

sandor
20th July 2012, 00:11
I tried both of those, with the setup shown below. Any ideas would help.



static QTextCodec *pCodec = QTextCodec::codecForName("UTF-8");

// pCodec is available throughout the application; it does not go out of scope, not that it would matter

if (pCodec == 0) {
    qCritical("UTF-8 is not supported by the O/S or shell");
    return;
}

QTextCodec::setCodecForCStrings(pCodec);
QTextCodec::setCodecForLocale(pCodec);
QTextCodec::setCodecForTr(pCodec);

// Print here with anything other than qDebug, qWarning, qCritical and qFatal


Konsole's default encoding on CentOS is UTF-8, and that is what is in use.

Added after 28 minutes:

Actually, solved. I think the build was not cleaned properly.

ChrisW67
20th July 2012, 02:36
sandor wrote: "Well, hmm, it does not quite work. Actually the problem is not how to put into QString, it is the easy part."
I don't think you understand the problem.


sandor wrote: "The problem is how to actually print. The question is very simple:"


QString s = "őű";
// ????: printf("%s\n", s);
// ????: printf("%s\n", s.toAscii());
// ????: etc.
// What to put there instead of printf?



Well, printf() will work, and so will std::cout, QTextStream and qDebug, but you will only see the characters you expect if they get into the QString correctly in the first place.



#include <QtCore>
#include <QDebug>
#include <cstdio>
#include <iostream>

// Print the same string through four different output paths.
void output(const QString &s) {
    printf("%s\n", s.toUtf8().constData());
    std::cout << s.toUtf8().constData() << std::endl;
    QTextStream out(stdout);
    out << s << endl;
    qDebug() << s;
    qDebug() << "-----";
}


int main(int argc, char **argv)
{
    QCoreApplication app(argc, argv);

    QString s("áéőűÁÉŐŰ"); // mangled by fromAscii()
    output(s);
    s = "áéőűÁÉŐŰ"; // mangled by fromAscii() too
    output(s);
    s = QString::fromUtf8("áéőűÁÉŐŰ"); // not mangled
    output(s);

    return 0;
}



Just for reference, I am using Konsole with UTF-8 encoding and suitable fonts. This is the output:


áéÅűÃÃÅÅ °
áéÅűÃÃÅÅ °
áéÅűÃÃÅÅ °
"áéÅűÃÃÅÅ °"
-----
áéÅűÃÃÅÅ °
áéÅűÃÃÅÅ °
áéÅűÃÃÅÅ °
"áéÅűÃÃÅÅ °"
-----
áéőűÁÉŐŰ
áéőűÁÉŐŰ
áéőűÁÉŐŰ
"áéőűÁÉŐŰ"
-----
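
The mangled lines in the first two groups are consistent with each UTF-8 byte being taken as one Latin-1 character and then encoded to UTF-8 again on output. The following is a minimal sketch of that double encoding in plain C++, without Qt. latin1ToUtf8 is a hypothetical helper, and the sketch assumes that fromAscii() behaves like fromLatin1() because no codec for C strings has been set.

```cpp
#include <string>

// Treat each input byte as one Latin-1 character (U+0000..U+00FF)
// and re-encode it as UTF-8. Bytes below 0x80 are plain ASCII and
// pass through unchanged; every other byte becomes two bytes.
std::string latin1ToUtf8(const std::string &bytes) {
    std::string out;
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));    // lead byte
            out += static_cast<char>(0x80 | (c & 0x3F));  // continuation byte
        }
    }
    return out;
}
```

Under that assumption, the sixteen UTF-8 bytes of the test string grow to thirty-two bytes, and the console shows two unrelated characters for every intended one. QString::fromUtf8() avoids this because it decodes each two-byte sequence as a single character.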