Thread: Character encoding issues

  1. #1

    Re: Character encoding issues

    No, I'd drop cin here.
    Try something like:
    Qt Code:
    QTextStream in(stdin);
    in.setCodec("ISO 8859-9"); // assumption: use the name of the encoding your terminal actually sends
    in >> searchedAuthor;

    Alternatively:
    Qt Code:
    cin >> author; // author is a std::string
    // fromAscii() uses the codec set with QTextCodec::setCodecForCStrings() (see docs),
    // so that codec has to be set first
    searchedAuthor = QString::fromAscii(author.c_str());

    HTH

  2. The following user says thank you to caduel for this useful post:

    yagabey (15th December 2008)

  3. #2

    Re: Character encoding issues

    First of all:
    Qt always uses Unicode internally, so Turkish, German, Chinese letters etc. can all be represented.

    The Microsoft Windows console, on the other hand, uses a country-specific encoding (code page),
    for example:
    - Codepage 850 in western Europe: http://de.wikipedia.org/wiki/Codepage_850
    - Codepage 857 for Turkish: http://de.wikipedia.org/wiki/Codepage_857

    => So you have to convert the input from your specific encoding to Unicode.

    Let's have a look at:
    http://doc.trolltech.com/4.4/qtextcodec.html
    There you can read:
    The supported encodings are:
    [...]
    # IBM 850
    # IBM 866
    # IBM 874
    [...]
    # ISO 8859-1 to 10

    So, unfortunately, the encoding you need, "IBM 857", is missing.
    I don't know if it works, but try ISO 8859-9:
    http://de.wikipedia.org/wiki/ISO_8859-9
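
For reference, ISO 8859-9 is identical to ISO 8859-1 except for six positions that hold the Turkish letters, so decoding it to Unicode amounts to a small lookup. A minimal sketch in plain C++ (no Qt), assuming the six code points below are copied correctly from the ISO 8859-9 table; latin5ToUnicode is a hypothetical helper name, not a Qt function:

```cpp
#include <cassert>

// Maps one ISO 8859-9 byte to its Unicode code point.
// Only six positions differ from ISO 8859-1; every other byte maps to itself.
unsigned int latin5ToUnicode(unsigned char byte)
{
    switch (byte) {
    case 0xD0: return 0x011E; // Ğ
    case 0xDD: return 0x0130; // İ
    case 0xDE: return 0x015E; // Ş
    case 0xF0: return 0x011F; // ğ
    case 0xFD: return 0x0131; // ı
    case 0xFE: return 0x015F; // ş
    default:   return byte;
    }
}
```

This is roughly what codec->toUnicode() does for you when the codec is available; it only helps if the input bytes really are ISO 8859-9.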

    A simple test program goes like this (look at my comments in the code!):

    Qt Code:
    #include <QtCore>
    #include <QtGui>

    #include <cstdio>
    #include <iomanip>
    #include <iostream>
    using namespace std;

    int main(int argc, char** argv) {
        QApplication app(argc, argv);

        char author[100];

        cout << "Author name?\n";
        cin >> setw(sizeof(author)) >> author; // setw() keeps cin from overflowing the buffer

        // Just to get an idea of how the character codes are seen internally.
        // Please enter some of your special chars. In German I always use "äöü".
        for (int i = 0; author[i] != '\0'; i++) { // stop at the terminator, not after 100 bytes
            int c = author[i];

            if (c < 0) c += 256; // char may be signed; show the byte as 0..255
            printf("%d ", c);
        }
        printf("\n");

        QByteArray encodedString = author;
        // QTextCodec *codec = QTextCodec::codecForName("IBM 850"); // western Europe
        // QTextCodec *codec = QTextCodec::codecForName("IBM 857"); // Turkish, but will not work :-(
        QTextCodec *codec = QTextCodec::codecForName("ISO 8859-9"); // try it

        if (!codec) {
            printf("Codec not supported.\n");
            return 1;
        }

        QString searchedAuthor = codec->toUnicode(encodedString);

        // This message box lets you check whether the encoding was interpreted correctly.
        // Output on the console does not tell you anything, because it does not use Unicode.
        QMessageBox::information(NULL, "Ausgabe", searchedAuthor); // "Ausgabe" = "Output"

        return 0;
    }
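
The byte dump from the loop above can also be pulled out into a small stand-alone helper, which makes it easy to compare what the console delivers with the code tables. A minimal sketch in plain C++, assuming the input is held in a std::string; dumpBytes is a hypothetical name:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns the unsigned decimal value of every byte in `input`, separated by spaces.
std::string dumpBytes(const std::string &input)
{
    std::ostringstream out;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        if (i != 0)
            out << ' ';
        // Go through unsigned char so bytes above 127 are not printed as negative numbers.
        out << static_cast<int>(static_cast<unsigned char>(input[i]));
    }
    return out.str();
}
```

If a single special character shows up as two values, the console is most likely sending a multi-byte encoding such as UTF-8 rather than a single-byte code page.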

    Have fun, Gérôme

  4. #3

    Re: Character encoding issues

    I couldn't get the c_str() function to work, although I added the <cstring> and <string> headers:

    Qt Code:
    searchedAuthor = QString::fromAscii(author.c_str());

    The codec check passed (the codec is supported):
    Qt Code:
    if (!codec) {
        printf("Codec not supported.\n");
        return 0;
    }

    I also tried:
    Qt Code:
    QTextCodec *q = QTextCodec::codecForName("ISO-8859-9");
    QTextCodec::setCodecForCStrings(q);

    and

    Qt Code:
    QTextCodec *codec = QTextCodec::codecForName("ISO 8859-9");

    In the message box the characters are still wrong (it shows "İ" instead of "İ"):
    Qt Code:
    QMessageBox::information(NULL, "Ausgabe", searchedAuthor);

    What else should I do?

  5. #4

    Re: Character encoding issues

    I assumed that author is a std::string; if it is not (maybe it's just a char[20] or so...), just drop the .c_str() call.
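
For the std::string variant, a minimal sketch of reading the name without a fixed-size buffer, in plain C++; readWord is a hypothetical helper name and the stream is passed in so that it works with cin as well as with a test stream:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads one whitespace-delimited word from `in`.
// Returns an empty string if nothing could be read.
std::string readWord(std::istream &in)
{
    std::string word;
    if (!(in >> word))
        word.clear();
    return word;
}
```

With std::string author = readWord(std::cin); the call author.c_str() compiles, and the string grows as needed instead of overflowing a char array.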

  6. #5

    Re: Character encoding issues

    Qt Code:
    searchedAuthor = QString::fromAscii(author);

    didn't fix the problem.

  7. #6

    Re: Character encoding issues

    Show us the (complete) code you are using.

  8. #7

    Re: Character encoding issues

    Originally posted by yagabey:

    Qt Code:
    QTextCodec *codec = QTextCodec::codecForName("ISO 8859-9");

    In the message box the characters are still wrong (it shows "İ" instead of "İ"):
    Qt Code:
    QMessageBox::information(NULL, "Ausgabe", searchedAuthor);

    What else should I do?

    So the right codec is definitely Codepage 857, which is not supported by Qt :-(

    You have to implement the conversion from CP 857 to ISO-8859-9 yourself.

    Just walk through the byte array and convert the chars > 127:

    Qt Code:
    // auth_len is the number of bytes stored in author
    for (int i = 0; i < auth_len; i++) {
        // go through unsigned char so that bytes above 127 do not show up as negative values
        unsigned char c = static_cast<unsigned char>(author[i]);

        switch (c) {
        // e.g. the g with a "bow" (breve) on it
        case 167: author[i] = static_cast<char>(240); break; // I've got no Turkish Windows to test it!
        // I think you don't have to do it for all 128 chars, only for the few
        // Turkish special chars you need
        }
    }
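
Extended to all the Turkish letters, the same idea could look like the sketch below. It is plain C++ and untested on a Turkish Windows console; the byte values are meant to follow the CP 857 and ISO 8859-9 tables linked earlier and are worth double-checking against them. Both function names are hypothetical:

```cpp
#include <cassert>
#include <string>

// Maps one CP 857 byte to its ISO 8859-9 equivalent for the Turkish letters only;
// every other byte is returned unchanged.
unsigned char cp857ToLatin5(unsigned char c)
{
    switch (c) {
    case 0x80: return 0xC7; // Ç
    case 0x81: return 0xFC; // ü
    case 0x87: return 0xE7; // ç
    case 0x8D: return 0xFD; // ı
    case 0x94: return 0xF6; // ö
    case 0x98: return 0xDD; // İ
    case 0x99: return 0xD6; // Ö
    case 0x9A: return 0xDC; // Ü
    case 0x9E: return 0xDE; // Ş
    case 0x9F: return 0xFE; // ş
    case 0xA6: return 0xD0; // Ğ
    case 0xA7: return 0xF0; // ğ (167 -> 240, the case from the loop above)
    default:   return c;
    }
}

// Converts a whole string byte by byte.
std::string convertCp857ToLatin5(const std::string &input)
{
    std::string output(input);
    for (std::string::size_type i = 0; i < output.size(); ++i)
        output[i] = static_cast<char>(cp857ToLatin5(static_cast<unsigned char>(output[i])));
    return output;
}
```

The converted bytes can then be handed to the ISO 8859-9 codec. Other accented letters above 127 are left alone here, so this is only good enough for Turkish names.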

    G.

  9. #8

    Re: Character encoding issues

    Here is a summary of the code:

    Qt Code:
    #include <QWidget>
    #include <QHttp>
    #include <QUrl>
    #include <QBuffer>
    #include <QFile>
    #include <QTextStream>
    #include <QXmlStreamReader>
    #include <QByteArray>

    // derives from QWidget to match the constructor below; Q_OBJECT needs a QObject-derived class
    class ColumnListing : public QWidget
    {
        Q_OBJECT
    public:
        ColumnListing(QWidget *parent = 0);

    public slots:
        void fetch();
        void readData(const QHttpResponseHeader &);

    private:
        void parseXml();

        QXmlStreamReader xml;
        QString currentTag;
        QString linkString;
        QString titleString;
        QString descriptionString;
        QString authorString;
        QString urltext;
        QString inputNews;
        QString searchedAuthor;
        QFile file;
        QFile *htmlFile;
        QHttp httpInstance;
        int subconnectionId;
        QUrl urlColumnToGo;

        QHttp http;
        int connectionId;
    };

    Constructor part:
    Qt Code:
    ColumnListing::ColumnListing(QWidget *parent)
        : QWidget(parent)
    {
        connect(&http, SIGNAL(readyRead(const QHttpResponseHeader &)),
                this, SLOT(readData(const QHttpResponseHeader &)));

        // buffers on the stack, so nothing leaks; setw() (from <iomanip>) limits what cin may write
        char input[100];
        char author[100];

        cout << "Newspaper ?\n";
        cin >> setw(sizeof(input)) >> input;
        inputNews = input; // inputNews is a QString member

        cout << "Author ?\n";
        cin >> setw(sizeof(author)) >> author;
        searchedAuthor = author; // searchedAuthor is a QString member

        if (inputNews == "milliyet") {
            urltext.append("http://www.milliyet.com.tr/D/rss/rss/RssY.xml?ver=51");
        }
        else if (inputNews == "sabah") {
            urltext.append("http://www.sabah.com.tr/rss/yazarlar.xml");
        }
        else if (inputNews == "radikal") {
            urltext.append("http://www.radikal.com.tr/radikal_yazar.xml");
        }
        /*** More news sites here... ***/

        file.setFileName("output.txt"); // file is a QFile member
        if (!file.open(QFile::ReadWrite | QFile::Truncate))
            return;

        htmlFile = new QFile("htmloutput.html"); // htmlFile is a QFile* member
        if (!htmlFile->open(QFile::ReadWrite | QFile::Truncate))
            return;
    }

    Fetching:
    Qt Code:
    void ColumnListing::fetch()
    {
        xml.clear();
        QUrl url(urltext);
        http.setHost(url.host());
        connectionId = http.get(url.path());
    }

    Parsing the RSS document:
    Qt Code:
    void ColumnListing::parseXml()
    {
        QTextStream inputText(&file);

        while (!xml.atEnd()) {
            xml.readNext();
            if (xml.isStartElement()) {
                if (xml.name() == "item")
                    linkString = xml.attributes().value("rss:about").toString();
                currentTag = xml.name().toString();
            } else if (xml.isEndElement()) {
                if (xml.name() == "item") {
                    if (authorString.contains(searchedAuthor, Qt::CaseInsensitive)) {
                        inputText << titleString << " " << linkString << " " << descriptionString << authorString << "\n";

                        QUrl url(linkString);

                        httpInstance.setHost(url.host());
                        // write the author page into the HTML file; the request goes through
                        // httpInstance, whose host was set on the line above
                        subconnectionId = httpInstance.get(url.path(), htmlFile);
                    }

                    titleString.clear();
                    linkString.clear();
                    descriptionString.clear();
                    authorString.clear();
                }
            } else if (xml.isCharacters() && !xml.isWhitespace()) {
                if (currentTag == "title") {
                    titleString += xml.text().toString();
                }
                else if (currentTag == "link") {
                    linkString += xml.text().toString();
                }
                else if (currentTag == "description") {
                    descriptionString += xml.text().toString();
                }
                else if (currentTag == "dc:creator") {
                    authorString += xml.text().toString();
                }
            }
        }
        if (xml.error() && xml.error() != QXmlStreamReader::PrematureEndOfDocumentError) {
            qWarning() << "XML ERROR:" << xml.lineNumber() << ": " << xml.errorString();
            http.abort();
        }
    }

    main.cpp
    Qt Code:
    int main(int argc, char **argv)
    {
        QApplication app(argc, argv);

        QTextCodec *q = QTextCodec::codecForName("ISO-8859-9");
        QTextCodec::setCodecForCStrings(q);

        ColumnListing *columnlisting = new ColumnListing;
        columnlisting->fetch();
        return app.exec();
    }

  10. #9

    Re: Character encoding issues

    OK, at last I made it work.

    Instead of:
    Qt Code:
    searchedAuthor = QString::fromAscii(author);

    I used:
    Qt Code:
    searchedAuthor = QString::fromUtf8(author);

    and now everything works perfectly. Thank you all...
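
A likely explanation, assuming the terminal delivers its input as UTF-8 (the thread does not say which locale is in use): in UTF-8 a letter such as İ is two bytes, 0xC4 0xB0, and a single-byte codec turns those into two separate characters. The sketch below shows the difference in plain C++; decodeUtf8 and decodeOneBytePerChar are hypothetical helpers for illustration, and the decoder does no validation of malformed input:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Decodes well-formed UTF-8 (1- to 4-byte sequences) into Unicode code points.
// No validation is done: malformed input yields meaningless values.
std::vector<unsigned long> decodeUtf8(const std::string &bytes)
{
    std::vector<unsigned long> codePoints;
    std::string::size_type i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        int extra = 0;           // number of continuation bytes that follow
        unsigned long cp = lead; // plain ASCII needs no further work
        if (lead >= 0xF0)      { extra = 3; cp = lead & 0x07; }
        else if (lead >= 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if (lead >= 0xC0) { extra = 1; cp = lead & 0x1F; }
        ++i;
        for (; extra > 0 && i < bytes.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        codePoints.push_back(cp);
    }
    return codePoints;
}

// Produces one value per byte, which is what any single-byte codec does
// (the Turkish-specific remapping of ISO 8859-9 is left out here).
std::vector<unsigned long> decodeOneBytePerChar(const std::string &bytes)
{
    std::vector<unsigned long> values;
    for (std::string::size_type i = 0; i < bytes.size(); ++i)
        values.push_back(static_cast<unsigned char>(bytes[i]));
    return values;
}
```

For the two bytes 0xC4 0xB0, decodeUtf8 yields the single code point U+0130 (İ), while the byte-per-character version yields 0xC4 and 0xB0, which a Latin codec displays as two unrelated characters.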
