How to write and read from binary files



Momergil
22nd November 2011, 13:19
Hello!

I'm planning a program that will plot a graph using Qwt or OpenGL (not yet decided). The graph data will be obtained by reading a .bin file, but this is the first time I have dealt with binary files, so I'm a "little" bit ignorant about how to proceed.

Before writing the actual program, I'm developing a second one to help me learn how to deal with binary files. It has two QTextEdit widgets side by side with an "instant connection" between them (i.e. if one of them is edited, the other should immediately change based on what was written in the edited one). One QTextEdit is supposed to be connected to a .bin file and initially show what is written in the file in binary code (so it will show only a series of 0s and 1s), while the other should be its translator, i.e. it should show the binary code of the first QTextEdit as ASCII characters.

I have already noticed that for this project I will use QDataStream and QFile. The problem is that I'm not familiar with QDataStream::readBytes() and readRawData(), nor with the write functions. For example, at the beginning, when the QFile is opened and one of the QTextEdit widgets should show the binary data contained in the .bin file, I don't know which function to use in place of QTextStream::readAll().

Could somebody help me with this project? Which function do I use to read binary code and show it in ASCII (e.g. 01000001 shows 65), how can I do the equivalent of readAll(), and how can I write an ASCII character and have the software translate it to binary? And how can I make sure that the user only writes "0"s or "1"s in the QTextEdit for binary data, showing a warning if the user attempts to write a non-binary character (i.e. tries to enter an ASCII char)?


Thanks!!


Momergil



Note: It's something like this I want to do: http://www.roubaixinteractive.com/PlayGround/Binary_Conversion/Binary_To_Text.asp

mvuori
22nd November 2011, 16:52
Even though people talk about "binary files", nobody in his/her right mind would ever think of their contents as being ones and zeros, but rather as whatever representation some data structure or number format requires. Do yourself a favor and stop thinking in ones and zeros and think like a programmer should -- at the highest abstraction level that your tool supports.
Saving the files in some textual format would make them something like a thousand times easier to debug when you change your code, data structures and file contents.

marcvanriet
22nd November 2011, 23:53
Hi,

Don't do the QTextEdit thing, it's just over-complicated and irrelevant.

How do you know what is in your .bin file? Is it generated by some data logger or similar, and do you have documentation about that?

First you should know how to interpret the .bin file, and then read it byte by byte and manipulate the bytes to turn them into useful information.

Regards,
Marc

ChrisW67
23rd November 2011, 03:27
If the files you are using as input are just being treated as a sequence of bytes then you do not need QDataStream, which is more aimed at providing an object serialisation system for Qt applications; you can just use the QIODevice interface directly.

QString::number can convert a byte to its string representation in any base you like between 2 and 36.
QString::toLong can do the opposite magic.

Ultimately I am not sure how useful your experiment will be in getting to your end goal of graphing a data stream from another source.

Momergil
23rd November 2011, 16:23
Hello!

Thanks for the answers.


mvuori : Even though people talk about "binary files", nobody in his/her right mind would ever think of their contents as being ones and zeros, but rather as whatever representation some data structure or number format requires. Do yourself a favor and stop thinking in ones and zeros and think like a programmer should -- at the highest abstraction level that your tool supports.
Saving the files in some textual format would make them something like a thousand times easier to debug when you change your code, data structures and file contents.

Sorry, but I'm working for a company and they chose to use binary files (I imagine for security reasons, since I read that binary files are harder to reverse engineer), so there is nothing I can do about it; I must learn how to convert text to binary and how to read binary and transform it into ASCII.


marcvanriet : First you should know how to interpret the .bin file, and then read it byte by byte and manipulate the bytes to turn them into useful information.

Yep, that is what I'm trying to learn! Since my software will open a .bin file and must read it to find the data it has to show, I must learn how to read .bin files; that is my question and the purpose of my software.


ChrisW67 : If the files you are using as input are just being treated as a sequence of bytes then you do not need QDataStream, which is more aimed at providing an object serialisation system for Qt applications; you can just use the QIODevice interface directly.

QString::number can convert a byte to its string representation in any base you like between 2 and 36.
QString::toLong can do the opposite magic.

Ultimately I am not sure how useful your experiment will be in getting to your end goal of graphing a data stream from another source.

Thanks, Chris, for the functions. The only thing I didn't understand was the explanation regarding QDataStream. What do you mean by "aimed at providing an object serialisation system for Qt applications"?



Thanks!

Added after 40 minutes:


ChrisW67 : QString::number can convert a byte to its string representation in any base you like between 2 and 36.
QString::toLong can do the opposite magic.

Hello Chris,

Are you sure these are the functions and that your explanation is correct? I was able to use QString::toLong to convert from byte to ASCII, but my efforts to use QString::number have been useless.



QString text;
bool ok;
text = ui->Text->toPlainText().toLong(&ok,2);
qDebug() << "Text to long:" << text;

ui->ASCII->setText(text);

marcvanriet
23rd November 2011, 18:20
MVR : First you should know how to interpret the .bin file, and then read it byte by byte and manipulate the bytes to turn them into useful information.

Momergil : Yep, that is what I'm trying to learn! Since my software will open a .bin file and must read it to find the data it has to show, I must learn how to read .bin files; that is my question and the purpose of my software.


For reading the file, use a QFile and the read() method. There are versions of the read() method that put the data in an array of characters or in a QByteArray.

Then you must know how the information is encoded, and use this knowledge to transform the array of bytes. If there is, for instance, a 16-bit value in it, you must know which byte comes first (LSB or MSB) and then combine them accordingly (A*256 + B or A + B*256). If the data is generated by a desktop PC, you can do it the dirty way and just type-cast, e.g. float fTemperature = *(float *)pDataPointer;
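The byte arithmetic described above could be sketched roughly as follows. This is plain standard C++ rather than Qt so that it compiles on its own; the function names (fromBigEndian16, fromLittleEndian16, floatFromHostBytes) are made up for illustration. std::memcpy is used in place of the pointer cast because it avoids alignment and aliasing problems, but it still assumes that the bytes were written by a machine with the same byte order and float format as the reader.

```cpp
#include <cstdint>
#include <cstring>

// Combine two bytes where the first byte read is the most significant one (big-endian).
std::uint16_t fromBigEndian16(unsigned char first, unsigned char second)
{
    return static_cast<std::uint16_t>((first << 8) | second);   // first*256 + second
}

// Combine two bytes where the first byte read is the least significant one (little-endian).
std::uint16_t fromLittleEndian16(unsigned char first, unsigned char second)
{
    return static_cast<std::uint16_t>(first | (second << 8));   // first + second*256
}

// Copy sizeof(float) raw bytes into a float.
// Assumption: the bytes are already in the host's byte order and float format.
float floatFromHostBytes(const unsigned char *bytes)
{
    float value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

With the bytes 0x12 and 0x34 read in that order, the big-endian interpretation gives 0x1234 and the little-endian one gives 0x3412.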

Regards,
Marc

ChrisW67
24th November 2011, 00:21
Momergil : Sorry, but I'm working for a company and they chose to use binary files (I imagine for security reasons, since I read that binary files are harder to reverse engineer), so there is nothing I can do about it; I must learn how to convert text to binary and how to read binary and transform it into ASCII.
Security reasons? Rubbish, unless the binary files are encrypted and that is unlikely. The data stream is most likely binary, i.e. not human readable, because that is a convenient, compact form for the source to emit. You need to know what the bytes in the file represent, how to group them into larger units (ints, floats, etc.), and into larger structures. Being able to dump the raw byte stream in a human-readable form may help you understand what is in the file, but it is unlikely to be the end goal.

For example, a (fictitious) data acquisition unit might gather samples from n inputs and output a variable length data structure like this for each sample set:


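// Layout of one sample set as it appears in the stream.
// Pseudo-C only: samples[] has a variable length, so this is not a compilable struct.
struct sample {
    quint32 byteOrder;      // 32-bit value 0xFFFE0000
    char senderName[16];    // ASCII chars only, space padded, no terminating NUL
    quint32 sequence;       // 32 bits
    quint16 n;              // 16 bits
    double samples[n];      // n entries, 64 bits per sample, IEEE 754 format
    quint32 crc;            // CRC-32, excludes byteOrder and crc fields
};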

Your problem is how to interpret that at the receiving end.
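A minimal sketch of interpreting one such record at the receiving end might look like this. It is plain standard C++, not Qt, and it rests on assumptions that the fictitious format above does not spell out: that the fields are packed with no padding, that the 0xFFFE0000 marker is meant to reveal the sender's byte order, and that the receiver's double is IEEE 754 binary64. The names Sample, readUnsigned and parseSample are invented for the example, and the CRC is read but not checked.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Decoded form of one record (hypothetical type for this sketch).
struct Sample {
    std::string senderName;        // 16 characters, space padded
    std::uint32_t sequence = 0;
    std::vector<double> samples;
    std::uint32_t crc = 0;         // read but NOT verified here
};

// Read `count` bytes (at most 8) starting at data[pos] as an unsigned integer.
static std::uint64_t readUnsigned(const std::vector<unsigned char> &data,
                                  std::size_t pos, std::size_t count, bool bigEndian)
{
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Always consume the most significant byte first.
        const std::size_t index = bigEndian ? pos + i : pos + count - 1 - i;
        value = (value << 8) | data[index];
    }
    return value;
}

// Returns true and fills `out` if `data` starts with one complete record.
bool parseSample(const std::vector<unsigned char> &data, Sample &out)
{
    const std::size_t headerSize = 4 + 16 + 4 + 2;   // byteOrder, senderName, sequence, n
    if (data.size() < headerSize)
        return false;

    // Assumption: the marker reads as 0xFFFE0000 in the sender's byte order.
    bool bigEndian;
    if (readUnsigned(data, 0, 4, true) == 0xFFFE0000u)
        bigEndian = true;
    else if (readUnsigned(data, 0, 4, false) == 0xFFFE0000u)
        bigEndian = false;
    else
        return false;

    out.senderName.assign(data.begin() + 4, data.begin() + 20);
    out.sequence = static_cast<std::uint32_t>(readUnsigned(data, 20, 4, bigEndian));
    const std::size_t n = static_cast<std::size_t>(readUnsigned(data, 24, 2, bigEndian));

    // n samples of 8 bytes each, followed by the 4-byte CRC.
    if (data.size() < headerSize + n * 8 + 4)
        return false;

    out.samples.clear();
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint64_t bits = readUnsigned(data, headerSize + i * 8, 8, bigEndian);
        double value;
        std::memcpy(&value, &bits, sizeof value);   // assumes an IEEE 754 binary64 host double
        out.samples.push_back(value);
    }
    out.crc = static_cast<std::uint32_t>(readUnsigned(data, headerSize + n * 8, 4, bigEndian));
    return true;
}
```

The point of the sketch is that every field is rebuilt from bytes using knowledge of the format; nothing in it ever looks at a string of '0' and '1' characters.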


Momergil : Thanks, Chris, for the functions. The only thing I didn't understand was the explanation regarding QDataStream. What do you mean by "aimed at providing an object serialisation system for Qt applications"?
I mean it is a way to write a binary data stream in one Qt program that can be read by another Qt program using any version of Qt (within limits set by the writer), with any native machine endianness (http://en.wikipedia.org/wiki/Endianness), and reconstruct the objects (Qt or user) at the other end. That applies to simple types like int or char as well as to complex structures like QString, QList<QPair<QString,QPixmap> > etc.

You can use it on arbitrary data streams, but depending on the structures involved it may be easier not to.



Momergil : Are you sure these are the functions and that your explanation is correct?
Yes.

Momergil : I was able to use QString::toLong to convert from byte to ASCII
Are you sure? Read the documentation for QString::toLong()... Hint: it doesn't return a QString.

Momergil : ... but my efforts to use QString::number have been useless.
Probably because you seem very confused.
Read the documentation for QString::number()... Hint: it doesn't convert strings into numbers.



bool ok(false);
QString strBinary("00000010"); // That's 2 in 8 binary bits.
// This is a human-readable STRING representation of a number. This is not what a "binary" file contains.

long result = strBinary.toLong(&ok, 2);
qDebug() << result << ok;
// Output: 2 true
// Result is the number represented by the binary digits above, and the conversion was successful.
// The output is a human-readable decimal representation of that number.

qDebug() << QString::number(result, 2);
// Output: "10"
// This is a human-readable STRING representing the number (minus leading zeros)

Momergil
24th November 2011, 16:17
Hello!

First, thanks once again for the answers.

marcvanriet, I'm doing something like this right now, but first (i.e. before working with encoded .bin files and so forth) it seems simpler to me to work with characters directly, which is what I'm doing with the two QTextEdit widgets. So I write binary code in one and the corresponding ASCII should appear in the second. Later I will pass that to the .bin file (in fact, my code for doing so is already written; I'm just not seriously using it yet, mainly because I still haven't defined the binary encoding). Anyway, for now I'm looking for more direct reading.

Now, about the details you mentioned: that seriously gave me the impression that I don't actually understand what a bin file is! In my simple research I understood that a bin file is like any other file, but instead of numbers and letters it contains a series of 0s and 1s and its name ends with ".bin". Nothing special. So all I should have to do is open and read it as if it were a .txt file, but instead of using the text directly, I should create a reader to divide the 0s and 1s into characters to be used. Am I wrong?

Chris:


ChrisW67 : Security reasons? Rubbish, unless the binary files are encrypted and that is unlikely.
Yeah, that's it: http://www.iwriteiam.nl/Ha_HTCABFF.html

Now, Chris, about those two functions: yep, that's what I did. I just have the impression that you didn't understand what I mean by reading a string; I caused confusion between a FUNCTION returning a QString (like QString QString::xx()) and me reading one. So to make myself clearer, I will give an example:

Let's suppose that I want to read the following sequence of bits:

010011010110000101110010011101000110100101101110

What is written in it is "Martin". So I would like to have a function that reads the first 8 digits, 01001101, which is "M", and then shows "M" in the QTextEdit. If I write the complete series of 0s and 1s, it will show "Martin", much like it is done on this webpage: http://www.roubaixinteractive.com/PlayGround/Binary_Conversion/Binary_To_Text.asp

So in this sense, I must read a binary number (01001101) and show its ASCII meaning (M). Or the reverse: I write the ASCII character (M) and the other side shows its binary value (01001101).

Of course, in the actual software I would read the binary numbers and work with them directly. But I must know what is in a given part of the bin file to know where to put it in the software, and I'm writing this program to learn that.

Unfortunately I can't just use that webpage, which at first glance would do the job. After all, at some point my software will have to actually convert the binaries into something more human-readable (e.g. a given "001100010011001100110000", aka 130, will be used in a method where this 130 is a value for a graph coordinate, or a given "010001010100001101000111", aka ECG, will be displayed in part of the interface as the title of that graph). So I must know how to do that reading.


I hope you understood =]

Momergil

ChrisW67
24th November 2011, 23:26
I understood what you are trying to do, and you already have all the pieces to do that:


#include <QtCore>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    QString input("010011010110000101110010011101000110100101101110");

    // Strip anything that is not a '0' or '1' if you want this to be even slightly reliable.
    // Assumes the length is a multiple of eight characters.

    QString output;
    for (int offset = 0; offset < input.length(); offset += 8) {
        // Convert each group of eight binary digits to the character with that code.
        bool ok(false);
        const char c = static_cast<char>(input.mid(offset, 8).toLong(&ok, 2));
        if (ok)
            output.append(QLatin1Char(c));
        else
            qDebug() << "Conversion failed for" << input.mid(offset, 8);
    }

    qDebug() << output;
    return 0;
}
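For comparison, here is a hedged sketch of the same conversion in both directions, including the check that the input really consists only of '0' and '1' in groups of eight. It is plain standard C++ rather than Qt so that it can be compiled standalone, and the function names (isBitString, bitsToText, textToBits) are invented for this example.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// True if s is non-empty, contains only '0'/'1', and its length is a multiple of 8.
bool isBitString(const std::string &s)
{
    if (s.empty() || s.size() % 8 != 0)
        return false;
    return s.find_first_not_of("01") == std::string::npos;
}

// Turn a string of binary digits into the characters they encode, eight digits per character.
// Returns an empty string if the input is not a valid bit string.
std::string bitsToText(const std::string &bits)
{
    std::string text;
    if (!isBitString(bits))
        return text;
    for (std::size_t offset = 0; offset < bits.size(); offset += 8) {
        const std::bitset<8> byte(bits, offset, 8);   // parse eight digits as one byte
        text.push_back(static_cast<char>(byte.to_ulong()));
    }
    return text;
}

// Turn each character into its eight binary digits, keeping leading zeros.
std::string textToBits(const std::string &text)
{
    std::string bits;
    for (const char c : text)
        bits += std::bitset<8>(static_cast<unsigned char>(c)).to_string();
    return bits;
}
```

In a GUI, a check like isBitString could be run whenever the text changes in order to warn the user about non-binary input.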


Seriously though, you shouldn't need to invent a tool to do the job of od on Linux and other UNIXes, or of any half-decent programmer's editor on Windows (e.g. Notepad++ (http://notepad-plus-plus.org/) and the hex editor plugin (http://sourceforge.net/projects/npp-plugins/)), just to see what bytes are in the input data.

marcvanriet
28th November 2011, 11:58
Momergil : Now, about the details you mentioned: that seriously gave me the impression that I don't actually understand what a bin file is! In my simple research I understood that a bin file is like any other file, but instead of numbers and letters it contains a series of 0s and 1s and its name ends with ".bin". Nothing special. So all I should have to do is open and read it as if it were a .txt file, but instead of using the text directly, I should create a reader to divide the 0s and 1s into characters to be used. Am I wrong?

Yes, you are wrong.

A binary file is a series of bytes, which you could e.g. read into an array of unsigned char or a QByteArray. File extension doesn't matter.

Other data (floats, integers, strings, ...) are encoded as a series of bytes. Only booleans can be packed so that more than one fits in a byte. You never, ever try to process a binary file as a stream of 0's and 1's (unless maybe for educational purposes).
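To make the "series of bytes" view concrete, here is a small sketch that loads a file into a vector of unsigned char and prints the bytes as hexadecimal, roughly what od or a hex editor shows. It is plain standard C++ (with Qt one would typically use QFile and a QByteArray instead), and the helper names readAllBytes and hexDump are made up for the example.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read the whole file as raw bytes. Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> readAllBytes(const std::string &path)
{
    std::ifstream in(path, std::ios::binary);   // binary mode: no newline translation
    if (!in)
        return {};
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}

// Format the bytes as two-digit lowercase hex values separated by spaces.
std::string hexDump(const std::vector<unsigned char> &bytes)
{
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(bytes[i]));
        if (i != 0)
            out += ' ';
        out += buf;
    }
    return out;
}
```

A file containing the three bytes 0x4D, 0x00 and 0xFF would be dumped as "4d 00 ff"; what those bytes mean depends entirely on the format that produced them.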

Regards,
Marc