QTextStream cannot read £ characters?



KjellKod
10th August 2011, 20:56
By mere chance I put £ characters in a QString while unit testing writing to and reading from a text file.
I noticed that using QTextStream to read the data back lost the £ characters.

Is it a locale thing? I've tried it on Ubuntu only, not on Windows. Below is an example where I write the text and then read it back again. file.readAll() works like a charm, but reading through QTextStream doesn't.



#include <iostream>
#include <QFile>
#include <QTextStream>
#include <QString>
#include <QIODevice>
namespace
{
    const QString funky_text = " Yalla three:£[£££] \\£ three:#[###] \\# three:@[@@@] \\@ three:$[$$$] \\$ "; // with whitespace

    void writeFile()
    {
        QFile file_out("/tmp/dummy.txt");
        const QIODevice::OpenMode write_mode = (QIODevice::WriteOnly | QIODevice::Truncate);
        if(!file_out.open(write_mode))
        {
            std::cerr << "Open (W) ERROR what's up:" << file_out.errorString().toStdString().c_str()<< std::endl << std::flush;
            return;
        }
        QTextStream stream(&file_out);
        stream << funky_text;
        file_out.close();
    }

    void readVerify_1()
    {
        QFile file_in("/tmp/dummy.txt");
        const QIODevice::OpenMode read_mode = (QIODevice::ReadOnly | QIODevice::Text);
        if(!file_in.open(read_mode))
        {
            std::cerr << "Open (R) ERROR what's up:" << file_in.errorString().toStdString().c_str() << std::endl << std::flush;
            return;
        }

        QString txt_in_1(file_in.readAll()); // conversion from QByteArray
        file_in.close();
        std::cout << "funky:" << funky_text.toStdString().c_str() << std::endl;
        std::cout << "QString(funky, txt1) = " << QString::compare(txt_in_1, funky_text, Qt::CaseSensitive) << std::endl;
        std::cout << "txt1: " << txt_in_1.toStdString().c_str() << std::endl;
    }

    void readVerify_2()
    {
        // read it again but by using QTextStream
        QFile file_in("/tmp/dummy.txt");
        const QIODevice::OpenMode read_mode = (QIODevice::ReadOnly | QIODevice::Text);
        if(!file_in.open(read_mode))
        {
            std::cerr << "Open (R) ERROR what's up:" << file_in.errorString().toStdString().c_str() << std::endl << std::flush;
            return;
        }
        QTextStream stream(&file_in);
        QString txt_in_2 = stream.readAll();
        file_in.close();

        std::cout << "QString(funky, txt2) = " << QString::compare(txt_in_2, funky_text, Qt::CaseSensitive) << std::endl;
        std::cout << "txt2:" << txt_in_2.toStdString().c_str() << std::endl;
    }

    void testQTextStream()
    {
        writeFile();
        readVerify_1();
        readVerify_2();
    }
} // namespace



This would give the following output


funky: Yalla three:£[£££] \£ three:#[###] \# three:@[@@@] \@ three:$[$$$] \$
QString(funky, txt1) = 0
txt1: Yalla three:£[£££] \£ three:#[###] \# three:@[@@@] \@ three:$[$$$] \$
QString(funky, txt2) = -103
txt2: Yalla three:[] \ three:#[###] \# three:@[@@@] \@ three:$[$$$] \$


txt2: Yalla three:[] \ three:#[###] \# three:@[@@@] \@ three:$[$$$] \$
Obviously the £ signs are gone!


Anyone with ideas?
-- KjellKod

wysota
11th August 2011, 08:37
QString and QTextStream perform encoding conversion. If your source code is in UTF-8 and you're not using QString::fromUtf8(), then you may have encoding issues. Try performing the conversion properly and see if it helps.
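
To make the encoding point concrete, here is a minimal sketch in plain C++ (no Qt), assuming the source file is saved as UTF-8 so that £ is stored as the two bytes 0xC2 0xA3. The helpers decodeLatin1 and decodeUtf8 are made-up names for this illustration, not Qt functions, and this is not a description of how QString converts internally. It only shows that the same two bytes become two characters when each byte is read as Latin-1, and one character (U+00A3) when read as UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: treat every byte as one Latin-1 character.
std::vector<unsigned int> decodeLatin1(const std::string &bytes)
{
    std::vector<unsigned int> out;
    for (std::size_t i = 0; i < bytes.size(); ++i)
        out.push_back(static_cast<unsigned char>(bytes[i]));
    return out;
}

// Hypothetical helper: decode UTF-8, handling only 1- and 2-byte sequences
// (enough for U+0000..U+07FF, which covers the pound sign). The continuation
// byte is not validated; this is an illustration, not a full decoder.
std::vector<unsigned int> decodeUtf8(const std::string &bytes)
{
    std::vector<unsigned int> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned int b0 = static_cast<unsigned char>(bytes[i]);
        if (b0 < 0x80) {
            out.push_back(b0);          // plain ASCII byte
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0 && i + 1 < bytes.size()) {
            const unsigned int b1 = static_cast<unsigned char>(bytes[i + 1]);
            out.push_back(((b0 & 0x1F) << 6) | (b1 & 0x3F)); // 2-byte sequence
            i += 2;
        } else {
            out.push_back(0xFFFD);      // replacement character for anything else
            i += 1;
        }
    }
    return out;
}
```

Calling decodeLatin1("\xC2\xA3") yields the two values 0xC2 and 0xA3, while decodeUtf8("\xC2\xA3") yields the single value 0xA3. Which interpretation a QString built from a plain string literal gets depends on the Qt version and its configured codecs, which is why the explicit QString::fromUtf8() is the safer choice.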

MarekR22
11th August 2011, 09:07
Just read: QTextStream::setCodec (http://doc.qt.nokia.com/latest/qtextstream.html#setCodec), QTextCodec (http://doc.qt.nokia.com/latest/qtextcodec.html) and choose the encoding you are planning to use for this file (for example UTF-8).

KjellKod
11th August 2011, 10:50
Thank you. Yes I'm aware of the QTextCodec and that you might have to set these. I was not aware that you HAD to set them every time!

I mean, if I don't set it when I write the file but have to set it when I read it back, then QTextStream seems very counterintuitive. To me this is a bug, probably not a Qt programming bug but a design-logic bug nonetheless.

Thanks for the input though. My solution is just not to use QTextStream but QFile::readAll().

Cheers

wysota
11th August 2011, 13:11
It's not a problem with the codec but rather with non-ascii characters in your source code.

squidge
11th August 2011, 13:26
I'm surprised your code even compiles. Since you are using non-ASCII characters such as £, some compilers complain.

KjellKod
11th August 2011, 22:15
It's not a problem with the codec but rather with non-ascii characters in your source code.

OK @wysota. I got it. I still think it's confusing that it works one way but not the other. To me it would make more sense if you had to set the codec both to write the file and to read it back, not as it is now. But I guess I just have to accept that this is how QTextStream works :confused:

Thanks.

squidge
11th August 2011, 22:46
KjellKod: No, that wasn't an attempt to flame, just pointing out that some compilers will simply throw an error on your test text as they only support ASCII (0x00 - 0x7F)

It's my understanding that QTextStream has to "guess" the encoding of the stream it's reading, so the conversion is based on this guessed format, which may or may not be correct. If you fix the format, there is no guessing. If you include the appropriate BOM (e.g. 0xEF, 0xBB, 0xBF for UTF-8), then there is also no need to guess (or state) the format.
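
As an illustration of the BOM idea only, here is a small sketch in plain C++ that checks whether a buffer starts with the three UTF-8 BOM bytes mentioned above. The helper names startsWithUtf8Bom and stripUtf8Bom are hypothetical, and this is not how QTextStream is implemented; Qt's own detection rules are not shown in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if the data begins with the UTF-8 byte order
// mark, i.e. the bytes 0xEF 0xBB 0xBF.
bool startsWithUtf8Bom(const std::string &data)
{
    return data.size() >= 3
        && static_cast<unsigned char>(data[0]) == 0xEF
        && static_cast<unsigned char>(data[1]) == 0xBB
        && static_cast<unsigned char>(data[2]) == 0xBF;
}

// Hypothetical helper: return the data without a leading UTF-8 BOM.
// Data that has no BOM is returned unchanged.
std::string stripUtf8Bom(const std::string &data)
{
    return startsWithUtf8Bom(data) ? data.substr(3) : data;
}
```

A reader that finds the BOM can decode the rest as UTF-8 without guessing; a reader that does not find it still has to be told the encoding or fall back to some default.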

wysota
12th August 2011, 12:05
OK @wysota. I got it. I still think it's confusing that it works one way but not the other. To me it would make more sense if you had to set the codec both to write the file and to read it back, not as it is now.
It's exactly like you say it should be. The "problem" is that converting both ways doesn't have to give you back the initial value, and the codec itself can't detect such a situation. It is your responsibility to provide input that's compliant with what the codec you use expects. By the way, read about QT_NO_CAST_FROM_ASCII in the QString docs.

KjellKod
15th August 2011, 09:34
Maybe time to enlighten me? Why use QTextStream instead of ofstream or ifstream?

Let me clarify:
Of course QTextStream has lots of nifty features, but in the simple case of reading human-readable text, potentially writing it out again as human-readable text (and why not reading it yet again), it seems to be lacking.

If you apply codecs to QTextStream then, if I understand it correctly, characters that are strictly speaking non-ASCII (but still very common), such as the British pound sign £ or the euro symbol, will no longer be human-readable.

If on the other hand you use std::fstream (ofstream and ifstream to be specific), then there's no such problem whatsoever.

See the example below:



#include <string>
#include <iostream>
#include <fstream>
#include <sstream> // std::ostringstream, used in readTextFromFile
#include <QString>

// Writes msg to the file, either truncating it or appending to it.
bool writeTextToFile(const QString &qfilename, const QString &msg, bool truncate_file)
{
    std::string filename(qfilename.toStdString());
    std::ofstream out;
    std::ios_base::openmode mode = std::ios_base::out; // a little overkill since it's an ofstream
    mode |= truncate_file ? std::ios_base::trunc : std::ios_base::app;
    out.open(filename.c_str(), mode);
    if (!out.is_open())
    {
        std::cerr << "Unable to open file for writing:[" << filename.c_str() << "], std::ios_base state: [" << out.rdstate() << "]" << std::endl << std::flush;
        return false;
    }
    out << msg.toStdString().c_str();
    return true;
}


// Reads the whole file and returns its content, or an empty string on failure.
QString readTextFromFile(const QString &qfilename)
{
    std::string filename(qfilename.toStdString());
    std::ifstream in;
    in.open(filename.c_str(), std::ios_base::in);
    if(!in.is_open())
    {
        std::cerr << "WARNING: Unable to open text file: " << filename.c_str() << " std::ios_base state: " << in.rdstate() << std::endl << std::flush;
        return QString("");
    }

    std::ostringstream oss;
    oss << in.rdbuf(); // copy the raw bytes of the file; no decoding happens here
    std::string content(oss.str());
    return QString(content.c_str());
}



Using the std::fstream libraries there's no issue. £ will be put into the file and still be human readable, and reading it back again and returning the text containing £ will also be OK.

wysota
15th August 2011, 14:58
Maybe time to enlighten me? Why use QTextStream instead of ofstream or ifstream?
Short and to the point: Unicode.
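
One way to see what "Unicode" buys you: std::string and the fstream classes only move bytes around, so they neither break nor understand a £. The sketch below is plain C++, assumes UTF-8 data, and uses a hypothetical helper name, countUtf8CodePoints. It shows that the byte length of a std::string and the number of characters it holds can differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: count UTF-8 code points by skipping continuation
// bytes (those of the form 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t countUtf8CodePoints(const std::string &utf8)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        if ((static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the UTF-8 text "£5", written as the bytes "\xC2\xA3" "5", size() reports 3 while countUtf8CodePoints reports 2. Copying those bytes to a file and back keeps them intact, but anything that works per character (length, indexing, case conversion, comparison) needs the bytes decoded into characters first, which is the job the codec does.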