How to write a Russian text in console?



8Observer8
28th September 2013, 11:12
Hi,

How do I write Russian text to the console?

This is my code:



#include <QCoreApplication>
#include <QDebug>
#include <QTextCodec>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    QTextCodec *russian = QTextCodec::codecForName("CP1251");
    QTextCodec::setCodecForTr(russian);
    qDebug() << QObject::tr("Привет Мир");

    return a.exec();
}


Output:


"╨Я╤А╨╕╨■╨╡╤В ╨Ь╨╕╤А"

Thank you!

anda_skoa
28th September 2013, 13:39
I think qDebug() doesn't use codecs. However, you could try one of these:
- call toLocal8Bit() on the string returned from tr()
- use the QTextCodec to convert the string to a QByteArray
- use a QTextStream with the codec set, on the stdout file handle

Cheers,
_

8Observer8
28th September 2013, 14:05
#include <QCoreApplication>
#include <QDebug>
#include <QTextCodec>
#include <QTextStream>

QTextStream cout(stdout);

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    QTextCodec *russian = QTextCodec::codecForName("CP1251");
    QTextCodec::setCodecForTr(russian);
    cout << QObject::tr("Привет Мир");
    cout.flush();

    return a.exec();
}


Output:


╨Я╤А╨╕╨■╨╡╤В ╨Ь╨╕╤А

anda_skoa
28th September 2013, 16:50
That is essentially the same thing.



cout << QObject::tr("Привет Мир");

Since there is no operator<<(std::ostream&, const QString&) this will cause an implicit conversion from QString to something that does have such an operator.
And that thing is const char*. Now the cast operator that creates const char* from QString uses the same encoding as QString::toLatin1(). Your Russian text is not latin1.

Cheers,
_

Radek
28th September 2013, 17:12
#include <QCoreApplication>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    qDebug() << QString::fromUtf8("Привет Мир");

    return a.exec();
}



At least, this works on Debian. QString is Unicode internally, so Cyrillic is no problem, but you will need a Unicode font. If you need to output a char * or a QByteArray, convert it to a QString with fromUtf8() first.

8Observer8
28th September 2013, 17:29
It doesn't work:



#include <QCoreApplication>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    qDebug() << QString::fromUtf8("Привет Мир");

    return a.exec();
}


Output:


"╧ЁштхЄ ╠шЁ"


I will read about it. Thank you.

Radek
28th September 2013, 18:18
Then it is the font: either it isn't a Unicode font or the console uses some other encoding. Note that the number of letters is now correct (it wasn't before; the earlier output was Unicode encoding bytes interpreted wrongly). It could also be missing or incomplete Unicode support on your machine. Check the contents of the string:



QString str = QString::fromUtf8("Привет Мир");
int len = str.size();
int chk;

for( int i = 0; i < len; i++ )
{
    chk = str.at(i).unicode();
}


Run it in the debugger and watch the values of chk. They should be the Unicode ordinals 0x41F 0x440 0x438 0x432 0x435 0x442 0x20 0x41C 0x438 0x440. If they aren't, Unicode support is broken somewhere (wrong encodings are generated as you type the text into the editor). If they are, then the output font isn't Unicode.
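The same check can be sketched without a debugger and without Qt. This is only an illustration in plain C++: decodeUtf8 is a hypothetical helper, not a Qt function, and it assumes well-formed UTF-8 input. Writing the text as hex escapes keeps the compiler and the editor from re-encoding it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode a UTF-8 byte string into Unicode code points.
// Simplified sketch: handles 1- to 4-byte sequences, assumes well-formed input.
std::vector<std::uint32_t> decodeUtf8(const std::string& bytes)
{
    std::vector<std::uint32_t> codePoints;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes that follow
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        assert(i + extra < bytes.size());  // input must not be truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            // Each continuation byte contributes its low six bits.
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        }
        codePoints.push_back(cp);
        i += extra + 1;
    }
    return codePoints;
}
```

Feeding it the UTF-8 bytes of "Привет Мир" should give ten values, one per character: 0x41F 0x440 0x438 0x432 0x435 0x442 0x20 0x41C 0x438 0x440.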

toufic.dbouk
28th September 2013, 18:30
Hey,
I experienced something similar.
First, try putting your Russian text in a combo box and see whether it displays correctly.
Check your system's encoding and get familiar with it.
Save your source file as UTF-8.

8Observer8
29th September 2013, 09:01
Yes! It works! I will keep experimenting.




#include <QApplication>
#include <QtGui>

int main(int argc, char *argv[]) {
    QApplication app(argc, argv);

    QLabel *lblText = new QLabel(QString::fromUtf8("Привет Мир!"));
    lblText->show();

    return app.exec();
}

Radek
29th September 2013, 09:35
This solves labels and other GUI strings and shows that the internal Unicode handling works on your machine. It also shows that the "Привет Мир!" literal is processed correctly by your editor (where you write your code). But it does not solve the console output (on my Debian Wheezy the console works). Try setting another font for the console (there should be one, because labels work) or try another console (from Qt Creator: Tools -> Options -> Environment -> Terminal).

anda_skoa
29th September 2013, 11:44
Just checking: you have tried, let's say, my first suggestion, right?
I.e. you used toLocal8Bit() and it did not work?



#include <QCoreApplication>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    qDebug() << QString::fromUtf8("Привет Мир").toLocal8Bit();

    return a.exec();
}


Cheers,
_

toufic.dbouk
29th September 2013, 12:36
Regarding "using toLocal8Bit() and it did not work?":

qDebug() << QString::fromUtf8("/*some Arabic word*/").toLocal8Bit();
In my case (trying to show Arabic letters) it just shows ??? instead of the actual letters.
Just wanted to point this out.

patrik08
29th September 2013, 16:39
Untested, but you can try this. Alternatively, set the codec on both the output and the input stream.

#include <QApplication>
#include <QTextCodec>
#include <QTextStream>
#include <QTimer>

int main(int argc, char *argv[]) {
    QApplication app(argc, argv);
    QTextCodec::setCodecForTr(QTextCodec::codecForName("CP1251"));

    QTextStream out(stdout);
    /// QTextCodec *russian = QTextCodec::codecForName("CP1251");
    /// or out.setCodec(russian);
    QString str("*");
    out << str.fill('*', 80) << "\n";
    out.flush();

    out << "Please enter word to search russian keyboard:\n";
    out.flush();
    QTextStream in(stdin);
    QString search_word = in.readLine(); /// here new word to insert on class..
    out << "Your word:" << search_word << "\n";
    out << str.fill('*', 80) << "\n";
    out.flush();

    /// return 1; or
    QTimer::singleShot(10000, &app, SLOT(quit()));
    return app.exec();
}

8Observer8
8th October 2013, 11:32
anda_skoa wrote: "Just checking: you have tried, let's say, my first suggestion, right? I.e. using toLocal8Bit() and it did not work?"

Thank you. It doesn't work.

Added after 19 minutes:

Why doesn't this work?


#include <QCoreApplication>
#include <QDebug>
#include <QTextStream>

QTextStream cout(stdout);

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    QString russian = QString::fromUtf8("Привет, Мир!");
    qDebug() << russian;
    cout << russian << endl;
    cout.flush();

    return a.exec();
}


Output:


"╧ЁштхЄ, ╠шЁ!"
╧ЁштхЄ, ╠шЁ!
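
One possible reading of this output, offered as a hedged aside: the garbling is what you would see if the program wrote Windows-1251 bytes (a common ANSI code page on Russian Windows) while the console decoded them as CP866 (the usual Russian OEM code page). The sketch below is plain C++ without Qt. The helper names cp1251FromUnicode and cp866ToUnicode are made up for illustration, and they model only the Cyrillic letter ranges of the two code pages.

```cpp
#include <cassert>
#include <cstdint>

// Windows-1251 byte for a Cyrillic letter in U+0410..U+044F (А..я).
// In that code page these letters occupy 0xC0..0xFF in alphabetical order.
unsigned char cp1251FromUnicode(std::uint32_t cp)
{
    assert(cp >= 0x0410 && cp <= 0x044F);
    return static_cast<unsigned char>(0xC0 + (cp - 0x0410));
}

// Code point that a CP866 console shows for a byte.
// Only the Cyrillic letter ranges are modeled; other bytes return 0.
std::uint32_t cp866ToUnicode(unsigned char byte)
{
    if (byte >= 0x80 && byte <= 0xAF) return 0x0410 + (byte - 0x80);  // А..п
    if (byte >= 0xE0 && byte <= 0xEF) return 0x0440 + (byte - 0xE0);  // р..я
    if (byte == 0xF0) return 0x0401;                                  // Ё
    if (byte == 0xF1) return 0x0451;                                  // ё
    return 0;  // box-drawing and other characters are not modeled here
}
```

With these mappings, и comes out as ш, в as т, е as х and р as Ё, which matches the letters in "╧ЁштхЄ". Box-drawing characters such as ╧ fall outside what the sketch models.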


Added after 4 minutes:

I don't understand this result:



Привет, Мир!
"Привет,"
Привет,



#include <QCoreApplication>
#include <QDebug>
#include <QTextStream>

QTextStream cin(stdin);
QTextStream cout(stdout);

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    QString input;
    cin >> input;
    QString russian = QString::fromUtf8(input.toUtf8());
    qDebug() << russian;
    cout << russian << endl;
    cout.flush();

    return a.exec();
}

toufic.dbouk
9th October 2013, 12:29
I don't know. Nothing has solved my problem, which is exactly the same as yours.
Too bad we can't find a solution for it.

8Observer8
27th November 2013, 07:07
I found the solution :)

main.cpp


#include <QCoreApplication>
#include <QTextCodec>
#include <QTextStream>

void sayhellow(const QString& s) {
    QTextStream out(stdout);
#if defined(Q_WS_WIN)
    out.setCodec("IBM866");
#endif
    out << s << endl;
}

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    QTextCodec * codec;
    codec = QTextCodec::codecForName("utf-8");
    QTextCodec::setCodecForCStrings(codec);
    QTextCodec::setCodecForLocale(codec);
    QTextCodec::setCodecForTr(codec);

    QString s = "Привет, Мир!";
    sayhellow(s);

    return a.exec();
}

toufic.dbouk
27th November 2013, 18:20
@ 8Observer8

Thanks for trying again!
I am using Qt 5.1.
What Qt version are you using? In Qt 5.1, setCodecForCStrings and setCodecForTr are not members of QTextCodec.
Unfortunately, this doesn't solve my problem.

#include <QCoreApplication>
#include <QTextCodec>
#include <QTextStream>

void sayhellow(const QString& s) {
    QTextStream out(stdout);
    out << s << endl;
}

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    QTextCodec * codec;
    codec = QTextCodec::codecForName("CP-1252");
    QTextCodec::setCodecForLocale(codec);
    QString s = "أدخل رمز";
    sayhellow(s);

    return a.exec();
}

It still prints ???? ???

If anyone knows how to solve this problem please share with us.

8Observer8
27th November 2013, 18:31
I use Qt 4.8.5 :)

Added after 4 minutes:

toufic.dbouk, install Qt 4.8.5 as an experiment :)

toufic.dbouk
27th November 2013, 18:56
That is a hard request :p but I will try to install Qt 4.8.5 on another laptop and test it.
But since you already have that version installed, try printing the string from my post above and see the result.

Good Luck.

8Observer8
28th November 2013, 05:42
toufic.dbouk wrote: "I am using Qt 5.1. What Qt version are you using because setCodecForCStrings and setCodecForTr is not a member of QTextCodec. Unfortunately this doesnt solve my problem. [...] still prints ???? ???"

Thank you very much. You are right! It doesn't work in Qt5! Please, somebody, help us!!!

ChrisW67
28th November 2013, 06:46
Back to first principles. What does this output:


#include <QtCore>
int main(int argc, char **argv)
{
    QCoreApplication app(argc, argv);
    QString test1 = QString::fromUtf8("\u0623\u062F\u062E\u0644 \u0631\u0645\u0632");
    QString test2 = QString::fromUtf8("\u041F\u0440\u0438\u0432\u0435\u0442 \u041C\u0438\u0440");
    qDebug() << test1;
    qDebug() << test2;
    return 0;
}

For both Qt 4.8.5 and Qt 5.1.1 on Linux this puts the correct characters on the screen with a suitable font and bi-directional text rendering.

I know the input encoding is correct, so "?" output indicates a missing font or a terminal that cannot cope with the encoded output.

toufic.dbouk
28th November 2013, 10:24
I tried this:

#include <QtCore>
int main()
{
    QString test1 = QString::fromUtf8("\u0623\u062F\u062E\u0644 \u0631\u0645\u0632");
    QString test2 = QString::fromUtf8("\u041F\u0440\u0438\u0432\u0435\u0442 \u041C\u0438\u0440");
    qDebug() << test1;
    qDebug() << test2;
    return 0;
}

It outputs "?" and the compiler gives some warnings:
C4566: character represented by universal-character-name '\u0623' cannot be represented in the current code page (1252)
So my system can't output such an encoding?

ChrisW67
28th November 2013, 12:10
Ignore the compiler warning for the time being.
Is the output from running the program all ? characters, or just the Arabic ones?
Is the result different if you set the console font to Lucida Console and issue a "chcp 65001" command before running the program?

toufic.dbouk
28th November 2013, 13:02
"Is the output from running the program all ? characters, or just the Arabic ones?"
Just the Arabic ones; English characters print normally (if that's what you mean).

"Is the result different if you set the console font to lucida and issue a chcp 65001 command before running the program?"
How do I do that? Should I compile and run the .pro or .cpp file from cmd? Can I open a Qt command prompt that has Qt's libraries added to its environment?
Thanks for your help.

ChrisW67
29th November 2013, 02:37
The first string is Arabic characters, the second is Cyrillic characters. This is what I see when I run it from a command prompt on Linux:


chrisw@newton /tmp/tt $ ./tt
"أدخل رمز"
"Привет Мир"


I just tried my example with VS 2010 on Win 7 and get a similar result to you: question marks all round. There seem to be several different problems:

The compiler is mangling the UTF8 input by trying to interpret it in the local 8-bit encoding. The warnings are issued in the process and the UTF-8 encoded text doesn't make it into the object file intact.
When I find a way to avoid that first mangling (by using hex directly) the result is still mangled on output.

I will try some more experiments over the weekend

8Observer8
29th November 2013, 05:51
ChrisW67, I want to understand this, but I don't. I have read about UTF-8. How do professionals solve this problem? Maybe you use "Internationalization with Qt":
- http://qt-project.org/doc/qt-5.0/qtdoc/internationalization.html
- http://qt-project.org/doc/qt-5.0/qtquick/qtquick-internationalization.html

How do you do it in professional applications?

ChrisW67
29th November 2013, 08:34
I will try to explain what is happening. Start with this line:


QString test = QString::fromUtf8("\u041F\u0440\u0438\u0432\u0435\u0442 \u041C\u0438\u0440");

This is what happens on Linux with GCC.

The compiler sees the \u041f and inserts the UTF8 encoded version of the U+041f character (П) into the string. That is two bytes 0xD0 and 0x9F. It does this for the whole string. The result is a C-style string of bytes (in hex) that is the UTF8 encoded string:


D0 9F D1 80 D0 B8 D0 B2 D0 B5 D1 82 20 D0 9C D0 B8 D1 80

We feed that into fromUtf8() and we get a valid QString with the correct characters. When the Linux program executes, qDebug() outputs the QString correctly encoded for my UTF8 terminal and I get the expected characters on screen. The same goes for QLabel.
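
As a hedged illustration of the encoding step described above, here is a minimal plain C++ sketch (no Qt) of how one code point becomes UTF-8 bytes. The function name encodeUtf8 is made up for this example, and it is not what GCC does internally; it only reproduces the arithmetic.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Encode a single Unicode code point as UTF-8 bytes.
std::string encodeUtf8(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);  // outside this range there are no code points
    std::string out;
    if (cp < 0x80) {
        // ASCII: one byte, unchanged.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx (covers Cyrillic and Arabic).
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For U+041F this gives the two bytes 0xD0 0x9F, the first pair in the hex dump above.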


On Windows with MS VC++ (2010):

The compiler sees the \u041f and tries to map the U+041f character (П) to the system's 8-bit Windows code page before putting it in the string. Unless your system code page is Windows-1251, there is unlikely to be an equivalent of П, and the compiler inserts ? as a placeholder for the character it could not convert. The compiler issues a warning:


warning C4566: character represented by universal-character-name '\u041F' cannot be represented in the current code page (1252)

It does this for the whole string. The result is a C-style string of bytes (in hex) that is not at all what you were expecting:


3F 3F 3F 3F 3F 3F 20 3F 3F 3F

We feed that into fromUtf8() and we get a valid QString but not the correct characters. qDebug() and QLabel cannot give the expected output now. This compiler behaviour seems to be the same regardless of what encoding the input file is or whether it has a UTF8 byte-order-mark or not.
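
The substitution can be mimicked with a toy model. This is not MSVC's actual algorithm, only a sketch of the effect: the hypothetical narrowWithFallback keeps ASCII and replaces everything else with '?'. Real code page 1252 also contains Western European letters, which this simplification ignores.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy model of narrowing to an 8-bit code page that has no Cyrillic:
// ASCII survives, every other character becomes '?' (0x3F).
std::string narrowWithFallback(const std::vector<std::uint32_t>& codePoints)
{
    std::string out;
    for (std::uint32_t cp : codePoints) {
        out += (cp < 0x80) ? static_cast<char>(cp) : '?';
    }
    // One byte per character, which is why the length still looks right.
    assert(out.size() == codePoints.size());
    return out;
}
```

Applied to the ten code points of "Привет Мир", only the space (0x20) survives, giving the 3F ... 20 ... 3F pattern shown above. Once the original letters are gone, no later codec setting can bring them back.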

If I change the line to:


QString test = QString::fromUtf8("\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82\x20\xD0\x9C\xD0\xB8\xD1\x80");

I have done the UTF8 encoding and avoid the compiler's attempt to map the characters to the Windows code page.
If I put that string on a QLabel I see the correct characters (font permitting): the data made it in.
The qDebug() output in the console is still wrong because the QString is being mapped (again) to the local 8-bit code page with the same "?" result.

I get Cyrillic output in a CMD console with:


#include <QApplication>
#include <QLabel>
#include <QDebug>
#include <QTextCodec>

int main(int argc, char **argv)
{
    QApplication app(argc, argv);
    QTextCodec::setCodecForLocale(QTextCodec::codecForName("utf8"));
    QString test = QString::fromUtf8("\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82\x20\xD0\x9C\xD0\xB8\xD1\x80");
    qDebug() << test.toLocal8Bit();
    QLabel l(test);
    l.show();
    return app.exec();
}

This works provided that:

I run the program from a Windows CMD shell,
the shell font is set to "Lucida Console", and
I execute "chcp 65001" before I run the program.




Manually doing UTF8 encoding is not a good solution, and I do not yet have a nice solution.
There is a hotfix for VC 2010 http://stackoverflow.com/questions/6072342/how-to-use-utf8-character-arrays-in-c but it seems that did not make it into 2012 and 2013.

toufic.dbouk
29th November 2013, 14:57
That was a good explanation. Thanks.
Just one note: I tried printing
std::cout << "أدخل رمز"; in VS 2012, but I got ?? as output.
Saving the file as code page 1252 (Western) also results in ??, obviously.
Saving the file as UTF-8 (code page 65001) results in some weird characters.
Saving the file as code page 1256 (Arabic) results in weird characters too.
All of this is under Visual Studio 2012.

8Observer8
17th February 2014, 08:39
I got help here: http://www.prog.org.ru/topic_26545_0.html

This is the solution:



#include <QCoreApplication>
#include <QDebug>
#include <QTextCodec>
#include <QTextStream>
#include <iostream>

QTextStream cin(stdin);
QTextStream cout(stdout);

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    QString string = "Привет, Мир!";
    QTextCodec *codec = QTextCodec::codecForName("CP866");

    // std::cout
    QByteArray encodedString = codec->fromUnicode(string);
    std::cout << "std::cout = " << encodedString.data() << std::endl;

    // TextStream
    cout.setCodec(codec);
    cout << "TextStream = " << string << "\n";
    cout.flush();

    // qDebug
    QTextCodec::setCodecForLocale(codec);
    qDebug() << "qDebug() = " << string;

    // Console r/w
    cout << "Enter a text: ";
    cout.flush();
    cin.setCodec(codec);
    QString inputStr;
    // Read data from a console
    cin >> inputStr;
    cout << "From a console = " << inputStr << endl;
    cout.flush();

    return a.exec();
}
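
To make the CP866 step less opaque, here is a rough sketch in plain C++ (no Qt) of the kind of mapping that a CP866 conversion performs for Russian text. The helpers cp866FromUnicode and encodeCp866 are made-up names for illustration, not Qt API, and they cover only ASCII and the Russian letters; anything else becomes '?'.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Map one Unicode code point to its CP866 byte ('?' if not modeled here).
char cp866FromUnicode(std::uint32_t cp)
{
    if (cp < 0x80) return static_cast<char>(cp);                     // ASCII is unchanged
    if (cp >= 0x0410 && cp <= 0x043F)
        return static_cast<char>(0x80 + (cp - 0x0410));              // А..п -> 0x80..0xAF
    if (cp >= 0x0440 && cp <= 0x044F)
        return static_cast<char>(0xE0 + (cp - 0x0440));              // р..я -> 0xE0..0xEF
    if (cp == 0x0401) return static_cast<char>(0xF0);                // Ё
    if (cp == 0x0451) return static_cast<char>(0xF1);                // ё
    return '?';
}

// Encode a sequence of code points as a CP866 byte string.
std::string encodeCp866(const std::vector<std::uint32_t>& codePoints)
{
    std::string out;
    for (std::uint32_t cp : codePoints) {
        out += cp866FromUnicode(cp);
    }
    assert(out.size() == codePoints.size());  // CP866 is a single-byte encoding
    return out;
}
```

For "Привет" this yields the bytes 8F E0 A8 A2 A5 E2, which a console using code page 866 displays as the original word. That is presumably why choosing "CP866" (also known as "IBM866") for the output stream works on a Russian Windows console where the ANSI code page CP1251 did not.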