Thread: Correct and full conversion to/from std::string.

  1. #1
    Join Date
    Nov 2010
    Posts
    97
    Thanks
    6
    Thanked 11 Times in 11 Posts
    Qt products
    Qt4
    Platforms
    Windows

    Correct and full conversion to/from std::string.

    I'm working on a project that needs to support Unicode and could end up being used in countries like Japan. We want to use std::string as the underlying type that holds string data in the data layer (see http://stackoverflow.com/questions/4...w-up-the-world as to why). The problem is that I'm not completely sure which function pair (to/from) to use so that we're 100% compatible with anything the user might enter in the Qt layer.

    A look at to/fromStdString indicates that I'd have to use setCodecForCStrings. The documentation for that function though indicates that I wouldn't want to do this for things like Japanese. This is the set that I'd LIKE to use though. Does someone know enough to explain how I'd set this up if it's possible?

    The other option, which I could be pretty sure would work, is the toUtf8/fromUtf8 functions. Those would require a two-step approach though, so I'd prefer the other if possible.

    Is there anything I've missed?

    Repost from: http://stackoverflow.com/questions/4...from-stdstring
    This rude guy who doesn't want you to answer his questions.

    Note: An "expert" here is just someone that's posted a lot.

    "The fact of where you do the encapsulation is meaningless." - Qt Certified Developer and forum moderator

  2. #2
    Join Date
    Jul 2009
    Location
    Enschede, Netherlands
    Posts
    462
    Thanked 69 Times in 67 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Re: Correct and full conversion to/from std::string.

    Well, your question seems to consist of a lot of sub-questions, and it will be tough to answer them all. The basic rule is to use UTF-8 encoding when serializing/storing string data. It is part of the Unicode standard and can represent every code point, so you won't be losing data. Even though it is a two-step process, I'd recommend you do all your serialization to and from UTF-8.
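
    To illustrate the lossless round trip, here is a minimal Qt-free sketch. In Qt 4 the calls that do this job are QString::toUtf8() and QString::fromUtf8(); the two function names below are made up for this example, std::u16string merely stands in for QString's UTF-16 storage, and the input is assumed to be well formed (no validation is done).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode well-formed UTF-16 code units as UTF-8 bytes.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        // Combine a high surrogate with the following low surrogate.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            std::uint32_t lo = in[++i];
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: decode well-formed UTF-8 bytes back to UTF-16.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i++]);
        std::uint32_t cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        for (int k = 0; k < extra && i < in.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
        }
        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            // Code points above U+FFFF become a surrogate pair.
            cp -= 0x10000;
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}
```

    Feeding it the code units 0x65E5 0x672C 0xD83D 0xDE00 (two kanji plus one character outside the BMP) gives ten UTF-8 bytes, and converting those back yields the same four code units.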

    QString is, as you state in one of your Stack Overflow posts, a Unicode monster, but the common copying and editing actions are just as quick as, if not quicker than, those of std::string (I always find std to be a worrying thing to type...). Internally the string data is stored in UTF-16 encoding.

    There is indeed evidence that Qt compiles correctly with wchar_t as a built-in type. It will mean that your compiled libraries are binary incompatible with other Qt versions, but as long as you have it documented somewhere with your program, I guess there will be no issue. Also, the way shared libraries are deployed on Windows doesn't really require you to keep the binary compatibility around. Even so, you will probably be happier with your data stored in UTF-8, where you don't have to take the byte order into account. Also, whenever your string data 'leaves' the QString, you will have to make sure you are using the correct locale/encoding.
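
    A small sketch of the byte order point, using only the standard library. The function name serialize_utf16 is invented for this example; it just writes each 16-bit code unit as two bytes in the requested order.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: write UTF-16 code units out as raw bytes.
// The result depends on the byte order chosen by the writer.
std::vector<unsigned char> serialize_utf16(const std::u16string& s, bool big_endian) {
    std::vector<unsigned char> out;
    for (char16_t unit : s) {
        unsigned char hi = static_cast<unsigned char>(unit >> 8);
        unsigned char lo = static_cast<unsigned char>(unit & 0xFF);
        out.push_back(big_endian ? hi : lo);
        out.push_back(big_endian ? lo : hi);
    }
    return out;
}
```

    For the single code unit 0x65E5 this produces 65 E5 in big-endian order and E5 65 in little-endian order, so a reader has to know which one was used. The UTF-8 form of the same character is the byte sequence E6 97 A5 on every platform.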

    The fact of the matter is that QString is one of the few properly usable Unicode-compatible string implementations available. Especially when using a full Qt implementation, without any special stuff, Qt users will practically never run into this type of issue, because Qt solves the biggest portion of it for you (serializing to binary data and writing XML documents are both included). I am pretty certain that this is the reason this type of issue doesn't get a lot of replies.


    Added after 10 minutes:


    By the way, if your goal is to really be i18n-ready and you are planning on continuing the use of Qt, you should really look into using Qt's i18n and translation approach.
    Last edited by franz; 3rd January 2011 at 20:40.
    Horse sense is the thing that keeps horses from betting on people. --W.C. Fields

    Ask Smart Questions

  3. The following user says thank you to franz for this useful post:

    nroberts (3rd January 2011)

  4. #3

    Re: Correct and full conversion to/from std::string.

    Quote Originally Posted by franz View Post
    Even so, you will probably be happier with your data stored in UTF-8, where you don't have to take the byte order into account. Also, whenever your string data 'leaves' the QString, you will have to make sure you are using the correct locale/encoding.
    That's what it seemed to me. The only issue I think we're liable to run into is with Windows, which (at least in older versions, not so sure now) requires wide characters in order to access Unicode, so things like opening files are going to have to be handled differently.

    Can't wait until the standard dictates unicode strings!

    As to the two step process, someone on the other site suggested switching the codec for CStrings to UTF-8. Is this sufficient or would I run into problems? I guess the two part process isn't any big deal since I already have a templated "string_cast" function that I can wrap it up in but that was an idea for removing it and just using to/fromStdString.

    The fact of the matter is that QString is one of the few properly usable Unicode-compatible string implementations available. Especially when using a full Qt implementation, without any special stuff, Qt users will practically never run into this type of issue, because Qt solves the biggest portion of it for you (serializing to binary data and writing XML documents are both included). I am pretty certain that this is the reason this type of issue doesn't get a lot of replies.
    Yeah, thing is that we've already got the serialization and all that figured out with standard components and boost. We're pretty concerned about not getting tied down to any given UI framework. Too many times that's come up and bitten us in the ass.

    By the way, if your goal is to really be i18n-ready and you are planning on continuing the use of Qt, you should really look into using Qt's i18n and translation approach.
    Yeah, we'll be using that in the UI specific layers. Still not sure how to deal with the lower layers though.

    As to continuing to use Qt... I can't tell, and I try not to let that make a difference. Qt is a step up from just about everything else we've tried, but there are also a lot of things I don't like about it. I'd prefer compiler errors when I connect signals and slots incorrectly, for instance, or to be able to connect lambda expressions to signals. Various bits of our stuff also get used in web programs and such, so I certainly can't get locked to the UI. I'm actually rather hoping that someday someone will make the C++ UI library I want. It would probably resemble GTKmm more than Qt in many ways and defer a lot of things to more standard components. To tell the truth, the only reason we picked Qt over GTKmm was the fact that GTKmm doesn't obey ANY accessibility protocol on Windows and thus can't be driven by test scripts like Qt can. If that were to be fixed we just might switch.

  5. #4

    Re: Correct and full conversion to/from std::string.

    Quote Originally Posted by nroberts View Post
    That's what it seemed to me. The only issue I think we're liable to run into is with Windows, which (at least in older versions, not so sure now) requires wide characters in order to access Unicode, so things like opening files are going to have to be handled differently.
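    Well, here you should just use the W-suffixed function names, I guess (CreateFileW rather than CreateFileA, for example). Windows uses an almost correct UTF-16 codec. The older versions use UCS-2 (yes, it's different: UCS-2 has no surrogate pairs, so it cannot represent code points above U+FFFF). Just so you know. Good luck in finding the right solution for your needs in any case.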
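
    The UTF-16 versus UCS-2 difference can be sketched in plain C++ like this. Both function names are invented for the example, and std::u16string stands in for a Windows wide string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count code points in a UTF-16 string.
// A high surrogate followed by a low surrogate is ONE code point.
std::size_t count_code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        bool high = s[i] >= 0xD800 && s[i] <= 0xDBFF;
        bool next_low = i + 1 < s.size() && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        if (high && next_low) {
            ++i;  // skip the second half of the pair
        }
        ++n;
    }
    return n;
}

// Hypothetical helper: true if the text contains no surrogates at all,
// i.e. a UCS-2-only consumer would interpret it correctly.
bool fits_in_ucs2(const std::u16string& s) {
    for (char16_t unit : s) {
        if (unit >= 0xD800 && unit <= 0xDFFF) {
            return false;
        }
    }
    return true;
}
```

    For the code units 0x65E5 0xD83D 0xDE00 the string length is 3 but the code point count is 2, and fits_in_ucs2 returns false. Code that treats every 16-bit unit as a character would split that last character in half.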

    Cheers.

  6. #5
    Join Date
    Mar 2009
    Location
    Brisbane, Australia
    Posts
    7,729
    Thanks
    13
    Thanked 1,610 Times in 1,537 Posts
    Qt products
    Qt4 Qt5
    Platforms
    Unix/X11 Windows
    Wiki edits
    17

    Re: Correct and full conversion to/from std::string.

    A look at to/fromStdString indicates that I'd have to use setCodecForCStrings. The documentation for that function though indicates that I wouldn't want to do this for things like Japanese. This is the set that I'd LIKE to use though. Does someone know enough to explain how I'd set this up if it's possible?
    The documentation for that function says that you need to be careful using encodings that do not preserve the ASCII range, and uses Shift-JIS encoding as an example. Since the encoding that you wish to use is UTF-8, which does preserve the ASCII range, you should have no issues in this regard. Japanese Unicode code points can happily be encoded in UTF-8.
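
    A concrete case of what "preserving the ASCII range" means, as a small standard C++ sketch. The helper names are invented for the example. The katakana character U+30BD is encoded as the bytes 83 5C in Shift-JIS, and 5C is the ASCII backslash; in UTF-8 the same character is E3 82 BD, where every byte has the high bit set.

```cpp
#include <cstddef>
#include <string>

// Katakana "so" (U+30BD) in the two encodings.
const std::string so_shift_jis = "\x83\x5C";
const std::string so_utf8 = "\xE3\x82\xBD";

// Hypothetical helper: does the encoded text contain this raw byte?
bool contains_byte(const std::string& encoded, char byte) {
    return encoded.find(byte) != std::string::npos;
}

// Hypothetical helper: true if no byte falls in the ASCII range 0x00..0x7F.
bool has_no_ascii_bytes(const std::string& encoded) {
    for (std::size_t i = 0; i < encoded.size(); ++i) {
        if (static_cast<unsigned char>(encoded[i]) < 0x80) {
            return false;
        }
    }
    return true;
}
```

    Here contains_byte(so_shift_jis, '\\') is true while contains_byte(so_utf8, '\\') is false, so code that scans bytes for ASCII characters such as path separators or escape characters stays safe with UTF-8 but can misfire on Shift-JIS.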

  7. The following user says thank you to ChrisW67 for this useful post:

    nroberts (3rd January 2011)
