Unicode/ASCII characters in QTextStream
yren
23rd November 2009, 17:31
When I read a file's contents into a QString like the following:
QFile myFile("test.u");
myFile.open(QIODevice::ReadOnly);
QTextStream ts(&myFile);
QString strContent = ts.readAll();
Is QTextStream smart enough to determine whether the file content is Unicode or ASCII? Since QString holds data internally in Unicode format, I would get a totally different string if it cannot tell the two apart.
Thanks!
squidge
23rd November 2009, 18:37
From the docs:
void QTextStream::setAutoDetectUnicode ( bool enabled )
If enabled is true, QTextStream will attempt to detect Unicode encoding by peeking into the stream data to see if it can find the UTF-16 or UTF-32 BOM (Byte Order Mark). If this mark is found, QTextStream will replace the current codec with the UTF codec.
yren
23rd November 2009, 19:09
Thanks! How does the auto-detect know if it is a Unicode file? Does a Unicode file have some kind of header?
squidge
23rd November 2009, 19:25
It has an optional header, the BOM (Byte Order Mark). If the BOM is present and the file is encoded as UTF-16, for example, the file will start with the bytes 0xFF,0xFE or 0xFE,0xFF depending on byte order. The same happens for UTF-32, although there are 4 bytes instead of 2 (0xFF,0xFE,0x00,0x00 or 0x00,0x00,0xFE,0xFF).
If the header is missing, I don't see how it could be recognised easily. If you know it's not a binary file, then any byte >127, or certain bytes below 32, would suggest the file is Unicode rather than plain ASCII.
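To make the byte patterns above concrete, here is a minimal sketch of a BOM check in plain C++. It is not Qt's implementation; the `Bom` enum and `detectBom` function are hypothetical names made up for illustration, and the input is assumed to be the first few raw bytes of the file.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper types, not part of Qt.
enum class Bom { None, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// Inspect the first bytes of a buffer and report which BOM, if any, it starts with.
Bom detectBom(const std::string &bytes)
{
    auto at = [&](std::size_t i) -> unsigned char {
        return static_cast<unsigned char>(bytes[i]);
    };
    const std::size_t n = bytes.size();

    // Check the 4-byte UTF-32 marks first, because the UTF-32 little-endian
    // mark begins with the same two bytes as the UTF-16 little-endian mark.
    if (n >= 4 && at(0) == 0xFF && at(1) == 0xFE && at(2) == 0x00 && at(3) == 0x00)
        return Bom::Utf32LE;
    if (n >= 4 && at(0) == 0x00 && at(1) == 0x00 && at(2) == 0xFE && at(3) == 0xFF)
        return Bom::Utf32BE;
    if (n >= 2 && at(0) == 0xFF && at(1) == 0xFE)
        return Bom::Utf16LE;
    if (n >= 2 && at(0) == 0xFE && at(1) == 0xFF)
        return Bom::Utf16BE;
    return Bom::None;
}
```

Note that a UTF-16 little-endian file whose first character after the BOM is U+0000 would be reported as UTF-32 here; that ambiguity is inherent in the byte patterns, not specific to this sketch.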
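The rough heuristic for files without a header could be sketched as follows. This is only an illustration under stated assumptions: `looksLikePlainAscii` is a hypothetical name, and treating tab, line feed and carriage return as the only acceptable bytes below 32 is my own choice, not something Qt does.

```cpp
#include <string>

// Hypothetical helper: returns true if every byte looks like 7-bit ASCII text.
// A false result only hints that the data may be Unicode (or binary);
// it does not say which encoding is in use.
bool looksLikePlainAscii(const std::string &bytes)
{
    for (unsigned char c : bytes) {
        if (c > 127)
            return false;  // outside the ASCII range
        // Assumption: only these control characters are normal in text files.
        const bool commonWhitespace = (c == '\t' || c == '\n' || c == '\r');
        if (c < 32 && !commonWhitespace)
            return false;  // e.g. the NUL bytes that UTF-16 uses for ASCII letters
    }
    return true;
}
```

For example, the UTF-16 bytes for "hi" contain NUL bytes and so fail the check, while an ordinary ASCII line with CR/LF passes.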