drhex
24th August 2008, 19:08
I'm writing an application that manages files in large file system hierarchies. Until now, I have stored filenames in QStrings and browsed directories using QDir and QFileInfo.
But consider this small example program which simply loops over the file system entries in the current directory and prints whether they are "files" or not:
#include <QApplication>
#include <QDir>
#include <QFileInfo>

int main(int argc, char *argv[])
{
    QApplication app(argc, argv);
    QDir d(".");
    foreach (QFileInfo inf, d.entryInfoList(QDir::AllEntries|QDir::NoDotAndDotDot|QDir::System))
    {
        qDebug("Entry %s: isFile: %s",
               qPrintable(inf.fileName()),
               inf.isFile() ? "yes":"no");
    }
    return 0;
}
I live in Sweden, where we have a few extra characters in the alphabet in addition to A-Z. Some people use ISO 8859-1, where those characters are encoded as single bytes in the 128-255 range. Others use UTF-8, where the "funny" characters are encoded as multi-byte sequences. The files in my "large hierarchies" come from many sources and I cannot guarantee that the filenames have a consistent encoding.
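To make the difference concrete: the letter "ö" is the single byte 0xF6 in ISO 8859-1 but the two bytes 0xC3 0xB6 in UTF-8, and a lone 0xF6 is not a valid UTF-8 sequence. The following is only a rough sketch of a structural check, not how Qt decodes filenames; the function name looksLikeUtf8 is made up for illustration, and the check deliberately ignores some corner cases (overlong three- and four-byte forms, surrogates).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if the bytes in `s` have the basic
// structure of UTF-8 (valid lead byte followed by the right number of
// continuation bytes). Simplified on purpose; see the note above.
bool looksLikeUtf8(const std::string &s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;                  // continuation bytes expected
        if (c < 0x80)                   extra = 0;   // plain ASCII
        else if (c >= 0xC2 && c <= 0xDF) extra = 1;   // 2-byte sequence
        else if (c >= 0xE0 && c <= 0xEF) extra = 2;   // 3-byte sequence
        else if (c >= 0xF0 && c <= 0xF4) extra = 3;   // 4-byte sequence
        else return false;                  // not a valid lead byte

        if (i + extra >= s.size())
            return false;                   // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80)
                return false;               // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

With this check, the name "smör" passes when encoded as UTF-8 and fails when encoded as ISO 8859-1, which is the kind of mismatch I suspect is behind the problem described below.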
If encoding inconsistencies only resulted in some filenames looking weird on the screen, that would be OK, but my experience is worse: non-conforming files become invisible in Qt!
I've tested the above program on a Linux system with LANG/LC_CTYPE set to "en_US.UTF-8" and with some files in the current directory copied from a Windows VFAT system (which used ISO 8859 to encode the non-ASCII characters in the filenames).
When I browse the filesystem with QDir::AllEntries, those copied files are not found at all. If I add QDir::System (as in the example above), they are found, but isFile() returns false and the offending characters are removed, so I can't use fileName() to, e.g., open() the files.
Is there any way out besides using the OS's native functions to browse directories and storing filenames in a QByteArray?