Read binary file and convert to QString



jaca
11th June 2008, 17:49
I have a binary file but I do not know what is written in it.
How do I read a binary file byte by byte and convert it to a string (QString)?
I do not know how this file is formatted, so I need to convert it to text.
Thanks.

wysota
11th June 2008, 20:33
So is it binary or text? Converting a binary file to a Unicode-based string is not a wise idea...

Conel
12th June 2008, 08:21
You can read the file content into a QByteArray and then, if you know the encoding (Local8Bit, UTF-8, UTF-16 etc.), you can easily convert this QByteArray into a QString using the QString::from... methods.

This works only if the content of the file is textual information. If the file is a 'real' binary file (like a .jpg, for example), then it is a bad idea to convert it into a QString.

wysota
12th June 2008, 08:58
Conel wrote:
You can read the file content into a QByteArray and then, if you know the encoding (Local8Bit, UTF-8, UTF-16 etc.), you can easily convert this QByteArray into a QString using the QString::from... methods.

Ok, but then you have a text file and you can read it directly into a QString, implicitly converting from the byte array returned by QFile::readAll() :) The question is about a "binary" file. Does the author want to show a hexadecimal representation of the file or a textual one?

Conel
12th June 2008, 10:12
wysota wrote:
Ok, but then you have a text file and you can read it directly into a QString, implicitly converting from the byte array returned by QFile::readAll() :) The question is about a "binary" file. Does the author want to show a hexadecimal representation of the file or a textual one?

Well, a file is always a sequence of bytes, so a text file is also "binary". The problem is that text can be encoded as UTF-16, for example, in which case one character of text is represented by two bytes in the file. If you then implicitly convert the byte array returned by QFile::readAll() into a QString, you will not get the correct text, because as far as I remember the implicit conversion assumes that the byte array contains ASCII-encoded text.

So, to actually answer jaca's question, we need to know whether the file he is going to read is "pure binary" or "text binary".

If the file is always "text binary", then the only thing we need to know is the encoding. There are some ways to detect the encoding automatically (like MS Word does when opening a .txt file).

If the file is "pure binary", then it does not contain text and it makes no sense to convert its contents into a QString. In that case its contents can be shown as a sequence of hex values like 'FF 8C 0A 0D'.
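
The hex display mentioned above could look roughly like the following sketch. It is plain standard C++ rather than Qt so that it stands alone, and the function name toHex is made up for this illustration. In a Qt program the bytes would come from QFile::readAll().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Format raw bytes as space-separated, upper-case, two-digit hex values.
// toHex is a hypothetical helper name, not a Qt function.
std::string toHex(const std::vector<std::uint8_t>& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(bytes.size() * 3);
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0)
            out += ' ';                     // separator between bytes
        out += digits[bytes[i] >> 4];       // high nibble
        out += digits[bytes[i] & 0x0F];     // low nibble
    }
    return out;
}
```

Qt also offers QByteArray::toHex(), which returns the hex digits of a byte array, so in Qt code the loop above is usually not needed.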

I think that attaching an example file would clarify the problem :-)

wysota
12th June 2008, 11:44
Conel wrote:
Well, a file is always a sequence of bytes, so a text file is also "binary".

This is an academic talk. We all know what is known as "binary" and what is known as "text".

Conel wrote:
If you then implicitly convert the byte array returned by QFile::readAll() into a QString, you will not get the correct text, because as far as I remember the implicit conversion assumes that the byte array contains ASCII-encoded text.

No, that is not true. By default the local 8 bit encoding is used which is correct in most cases. Otherwise you can simply set a proper codec prior to reading the string.

patrik08
12th June 2008, 12:34
jaca wrote:
I have a binary file but I do not know what is written in it.
How do I read a binary file byte by byte and convert it to a string (QString)?
I do not know how this file is formatted, so I need to convert it to text.
Thanks.

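Most binary file formats start with a signature. (In my own files I write a magic number with QDataStream.) You can check the signature with QIODevice::peek(), which was introduced in Qt 4.1:

QFile file(fileName);   /* fileName is a placeholder; any other QIODevice works too */
if (file.open(QIODevice::ReadOnly)) {
    const QByteArray head = file.peek(4);        /* first 4 bytes, without consuming them */
    const bool isPng = head.contains("PNG");     /* PNG image */
    const bool isGif = head.contains("GIF");     /* GIF image */
    const bool isExe = head.startsWith("MZ");    /* Windows executable */
    const bool isXml = head.contains("<");       /* XML, XHTML or an XML dialect: open as text */
}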
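
For comparison, here is a slightly stricter version of the same idea as a sketch in plain standard C++ (no Qt, so it can be tried on its own). It compares the full leading bytes of each signature instead of searching for a substring. The function name sniffType and the returned labels are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Guess a file type from its leading bytes. Only a few well-known
// signatures are checked; everything else is reported as "unknown".
// sniffType and its return labels are hypothetical names.
std::string sniffType(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);

    // The PNG signature is 8 fixed bytes.
    static const unsigned char png[] = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    if (size >= sizeof png && std::memcmp(data, png, sizeof png) == 0)
        return "png";

    // GIF files start with "GIF87a" or "GIF89a".
    if (size >= 6 && (std::memcmp(data, "GIF87a", 6) == 0 ||
                      std::memcmp(data, "GIF89a", 6) == 0))
        return "gif";

    // DOS/Windows executables start with "MZ".
    if (size >= 2 && data[0] == 'M' && data[1] == 'Z')
        return "exe";

    // A leading '<' suggests XML or HTML (a heuristic only).
    if (size >= 1 && data[0] == '<')
        return "xml";

    return "unknown";
}
```

Checking the signature at a fixed position avoids false positives, for example a text file whose first four bytes happen to contain "GIF".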

Conel
12th June 2008, 12:55
wysota wrote:
This is an academic talk. We all know what is known as "binary" and what is known as "text".

wysota wrote:
No, that is not true. By default the local 8 bit encoding is used which is correct in most cases. Otherwise you can simply set a proper codec prior to reading the string.

jaca talks about a 'binary' file which can be somehow 'formatted'. That's why I concluded that he actually means a 'text' file which looks like binary.

BTW, from Qt help:

QString::QString ( const QByteArray & ba )
Constructs a string initialized with the byte array ba. The given byte array is converted to Unicode using fromAscii()

QString & QString::operator= ( const QByteArray & ba )
This is an overloaded member function, provided for convenience.
Assigns ba to this string. The byte array is converted to Unicode using the fromAscii() function.

(*sorry for 'academic' talks*)

wysota
12th June 2008, 15:05
Ok, let's continue the academic talks :)


void QTextCodec::setCodecForCStrings ( QTextCodec * codec ) [static]
Sets the codec used by QString to convert to and from const char * and QByteArrays. If the codec is 0 (the default), QString assumes Latin-1.

Also if you take a look at the implementation of fromAscii() you will see a confirmation that it by default converts from... latin-1 :)
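
To illustrate what a Latin-1 conversion does, here is a small sketch in plain standard C++ (not Qt's actual implementation). Latin-1 maps every byte value 0x00..0xFF to the Unicode code point with the same number, so the conversion can never fail, which is also why arbitrary binary data converts "successfully" into meaningless characters. The function name latin1ToUtf8 is made up for this example.

```cpp
#include <cassert>
#include <string>

// Decode Latin-1 bytes and re-encode them as UTF-8.
// Each Latin-1 byte equals its Unicode code point (U+0000..U+00FF).
// latin1ToUtf8 is a hypothetical helper name.
std::string latin1ToUtf8(const std::string& latin1)
{
    std::string utf8;
    utf8.reserve(latin1.size());
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            // ASCII range: one byte, unchanged.
            utf8 += static_cast<char>(c);
        } else {
            // U+0080..U+00FF: two UTF-8 bytes, 110xxxxx 10xxxxxx.
            utf8 += static_cast<char>(0xC0 | (c >> 6));
            utf8 += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return utf8;
}
```

In Qt the equivalent explicit calls are QString::fromLatin1(), QString::fromLocal8Bit() and QString::fromUtf8(); naming the encoding explicitly avoids depending on the default discussed above.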

jaca
12th June 2008, 18:47
I found the format of my file. It is the SEG-Y format. This format contains both textual and binary bytes.
Can anyone help me understand this format? Link below:
http://www.seg.org/SEGportalWEBproject/prod/SEG-Publications/Pub-Technical-Standards/Documents/seg_y_rev1.pdf
Thanks

Conel
13th June 2008, 17:08
wysota wrote:
Also if you take a look at the implementation of fromAscii() you will see a confirmation that it by default converts from... latin-1 :)

Yes, fromAscii == Latin-1 by default. But generally speaking Latin-1 != Local8Bit, and previously you said:

"No, that is not true. By default the local 8 bit encoding is used which is correct in most cases."


jaca wrote:
I found the format of my file. It is the SEG-Y format. This format contains both textual and binary bytes.
Can anyone help me understand this format? Link below:
http://www.seg.org/SEGportalWEBproje...seg_y_rev1.pdf

According to the spec, this format is used in the geophysical industry. The question now is what you want to convert to QString, and why?

It contains textual and binary parts, and the offsets of these parts are clearly described in the spec, so you can easily extract them. The textual parts are encoded in EBCDIC (the correspondence between EBCDIC and ASCII characters is provided as a table at the end of the spec), so they can be converted to QStrings.
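
As a rough sketch of the EBCDIC step, the following plain standard C++ (no Qt) converts EBCDIC bytes to ASCII. The mapping is deliberately partial: it covers only letters, digits, the space and four punctuation marks, and turns everything else into '?'. The full table is the one in the spec. The function names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Convert one EBCDIC byte to ASCII. Partial table (an assumption of this
// sketch): letters, digits, space and four punctuation marks only.
char ebcdicToAscii(unsigned char e)
{
    if (e == 0x40) return ' ';
    if (e >= 0xC1 && e <= 0xC9) return static_cast<char>('A' + (e - 0xC1)); // A-I
    if (e >= 0xD1 && e <= 0xD9) return static_cast<char>('J' + (e - 0xD1)); // J-R
    if (e >= 0xE2 && e <= 0xE9) return static_cast<char>('S' + (e - 0xE2)); // S-Z
    if (e >= 0x81 && e <= 0x89) return static_cast<char>('a' + (e - 0x81)); // a-i
    if (e >= 0x91 && e <= 0x99) return static_cast<char>('j' + (e - 0x91)); // j-r
    if (e >= 0xA2 && e <= 0xA9) return static_cast<char>('s' + (e - 0xA2)); // s-z
    if (e >= 0xF0 && e <= 0xF9) return static_cast<char>('0' + (e - 0xF0)); // 0-9
    switch (e) {
    case 0x4B: return '.';
    case 0x6B: return ',';
    case 0x60: return '-';
    case 0x61: return '/';
    default:   return '?';   // not covered by this partial table
    }
}

// Convert a run of EBCDIC bytes, e.g. the textual header of the file.
std::string decodeEbcdic(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    std::string out;
    out.reserve(size);
    for (std::size_t i = 0; i < size; ++i)
        out += ebcdicToAscii(data[i]);
    return out;
}
```

In a Qt program the resulting ASCII text could then be passed to QString::fromLatin1().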

Is that what you need?

jaca
13th June 2008, 20:07
I'm trying to read the file and put the values in a QTableView or QTableWidget. I thought that converting to QString would help me understand what it contains.
I need to read this file, not necessarily converting it to QString, but I have no idea how to do that. Binary files are complicated. Reading the textual parts would be a good start.
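
For the binary parts, a minimal sketch in plain standard C++ (no Qt) might look like this. It assumes the layout described in the SEG-Y rev 1 spec: a 3200-byte textual header followed by a 400-byte binary header whose fields are big-endian, with the sample interval in bytes 3217-3218, the number of samples per trace in bytes 3221-3222 and the data sample format code in bytes 3225-3226 (byte numbers counted from 1). The struct and function names are made up here, and the fields are treated as unsigned for simplicity although the spec defines them as two's complement integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A few fields of the 400-byte binary file header (hypothetical struct).
struct SegyBinaryHeader {
    std::uint16_t sampleIntervalUs;  // sample interval in microseconds
    std::uint16_t samplesPerTrace;   // number of samples per data trace
    std::uint16_t formatCode;        // data sample format code
};

// Read a big-endian 16-bit value: most significant byte first.
std::uint16_t readBigEndian16(const std::vector<std::uint8_t>& buf, std::size_t offset)
{
    assert(offset + 2 <= buf.size());
    return static_cast<std::uint16_t>((buf[offset] << 8) | buf[offset + 1]);
}

// Fill 'out' from the start of a file; returns false if the buffer is too
// short to hold the textual (3200 bytes) and binary (400 bytes) headers.
bool parseBinaryHeader(const std::vector<std::uint8_t>& file, SegyBinaryHeader& out)
{
    const std::size_t headersSize = 3200 + 400;
    if (file.size() < headersSize)
        return false;
    out.sampleIntervalUs = readBigEndian16(file, 3216);  // bytes 3217-3218
    out.samplesPerTrace  = readBigEndian16(file, 3220);  // bytes 3221-3222
    out.formatCode       = readBigEndian16(file, 3224);  // bytes 3225-3226
    return true;
}
```

The numbers obtained this way can be put into table cells directly. In Qt, QDataStream reads big-endian values by default, so it could replace readBigEndian16 in a real program.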

jaca
13th June 2008, 23:05
I converted the textual parts with the command:
dd conv=ascii if=t291_migfin_b90.sgy of=t291_migfin_b90.txt
I understood the textual parts correctly, but I still need the other values.
If anyone is interested in seeing the file t291_migfin_b90.sgy, I can send it by e-mail. The file is 5.2 MB.
Thanks