Reading an ANSI (Windows-1252) file with Cyrillic content and encoding it to UTF-8



bastrijan
21st May 2012, 13:36
Hello to everyone,
I've got one problem.

I am reading a text file that is encoded as ANSI (i.e. Windows-1252). The important point is that the file contains Cyrillic text, and when I read it the Cyrillic part comes out as question marks (?????).
I want to convert the content to UTF-8 so that the Cyrillic text is restored before I put the content into the database.

I use QFile to open the file and QTextStream to read its content.

Any suggestions?

Best Regards,
Bastrijan

ChrisW67
22nd May 2012, 01:28
Code page 1252 (http://en.wikipedia.org/wiki/Windows-1252) is the standard Windows Western European code page, and it contains no Cyrillic letters. I think you want 1251 (http://en.wikipedia.org/wiki/Windows-1251) for Cyrillic.

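You should call QTextStream::setCodec() with the QTextCodec returned by QTextCodec::codecForName("windows-1251") (or "CP1251") before reading. The stream then decodes the bytes into QString correctly, and QString::toUtf8() gives you the UTF-8 bytes for the database.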

Have a look at the Codecs example.
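
To illustrate what a windows-1251 codec does with the bytes in the file, here is a minimal plain C++ sketch (no Qt) that turns Windows-1251 bytes into UTF-8. It is not Qt's implementation, and the helper names appendUtf8 and cp1251ToUtf8 are made up for this example. As a simplification it only maps ASCII, the letters in the range 0xC0-0xFF, and the two bytes for Ё/ё; every other byte in 0x80-0xBF is replaced with U+FFFD. In a Qt program the codec set on the QTextStream would do this work for you.

```cpp
#include <cstdint>
#include <string>

// Append one Unicode code point as UTF-8.
// Only code points below U+10000 are handled, which is enough for this sketch.
void appendUtf8(std::string &out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: decode Windows-1251 bytes and re-encode them as UTF-8.
// Simplified mapping (an assumption of this sketch, not the full code page):
//   0x00-0x7F -> same ASCII code point
//   0xC0-0xFF -> U+0410..U+044F (the letters А..я)
//   0xA8/0xB8 -> U+0401 / U+0451 (Ё / ё)
//   otherwise -> U+FFFD (replacement character)
std::string cp1251ToUtf8(const std::string &bytes)
{
    std::string out;
    for (unsigned char b : bytes) {
        std::uint32_t cp;
        if (b < 0x80) {
            cp = b;
        } else if (b >= 0xC0) {
            cp = 0x0410 + (b - 0xC0);
        } else if (b == 0xA8) {
            cp = 0x0401;
        } else if (b == 0xB8) {
            cp = 0x0451;
        } else {
            cp = 0xFFFD;
        }
        appendUtf8(out, cp);
    }
    return out;
}
```

Each Cyrillic letter is a single byte in Windows-1251 but two bytes in UTF-8, so the converted text is longer than the original file content.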