Writing numbers to file fast! (Need Faster float to char implementation)



philwinder
4th December 2008, 15:15
Hi guys, I'd appreciate any insight you have on this.

What I am doing is capturing data from a data acquisition card working at 200kS/s and saving the data into a double array. Unfortunately I cannot save the data to file before the next lot of data comes in, so I have resorted to saving 30 seconds' worth (the largest amount that would fit in my memory) to RAM and then saving it to disk. It's not ideal.

UPDATE: ________________________
I have found that most of the time is spent converting the various types into a QByteArray. So I think my question has changed: how can I convert a float (or at worst an int) into a QByteArray (or string) quicker than QString::number or QTextStream << does? Thanks.
________________________________

What I then need to do is save it to a file. So far I have been using the various QString and QTextStream implementations of "float to string", only to find that they take a huge amount of time. They are much, much slower than the actual writing to the hard drive (which I thought would be the limiting factor).

So I decided to do some tests and I found that even writing text to a file (using QTextStream) is pretty slow (note: in a loop). For a float it takes 24 seconds, for a QByteArray with 9 characters it takes 15.7 seconds.

Appending the data to a QByteArray to make one huge array and then writing it in one go takes only 3 seconds for a char array, but 48 seconds for a float and 24 seconds for an int!

So, down to business: does anyone know how I could write to a file faster? I'm willing to try anything, so long as it's within my fairly limited programming knowledge.

wysota
4th December 2008, 16:31
First of all, don't use a text stream. Instead operate on the file directly, it's faster this way. Second of all, try to convert directly to the output format instead of going through many different transformations (like float -> string -> byte array). mmap()ing a file to memory might also help quite a bit.

philwinder
4th December 2008, 16:44
try to convert directly to the output format instead of going through many different transformations (like float -> string -> byte array).

How would I go about doing that? By dissecting the current routines?

Well, the fastest I have been able to do it is like this. Basically, multiply all the double data by 1000000 and save it as an int. Then do an sprintf and append the resulting chars to a QByteArray. Iterate over all the data like that, then at the end save to the file using QFile::write.


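double f = 0.54125432;
int num;
QByteArray temp1;
for( int x = 0; x < 200000*30; x++ )
{
    num = (int)(f*1000000);   // keep six decimal places as an integer
    char buf[16];             // room for any 32-bit int, its sign and the terminating NUL
    ::snprintf(buf, sizeof(buf), "%d", num);
    temp1.append( buf );
    temp1.append(",");
}
file.write(temp1);            // 'file' is a QFile that is already open for writing
file.flush();
file.close();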

This gets it down to 3.3 seconds, but obviously it's a bit annoying having the data in an "int" format. (Note: sprintf'ing a double takes about 12 seconds.)
Is that what you meant?
I'll also have a look at the mmap() function.
Thanks
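
Editor's note: the scaled-integer idea above can also be written without sprintf. What follows is only a sketch, assuming a C++17 compiler (std::to_chars did not exist in 2008); the function name samplesToCsv and the scale parameter are made up for illustration, and the values are rounded rather than truncated as the (int) cast does.

```cpp
#include <cassert>
#include <charconv>
#include <cmath>
#include <cstddef>
#include <string>
#include <system_error>
#include <vector>

// Appends each sample, scaled to an integer, as decimal text followed by ','.
// 'scale' mirrors the multiply-by-1000000 trick from the post.
// Note: llround rounds to nearest, whereas the (int) cast truncates.
std::string samplesToCsv(const std::vector<double>& samples, double scale = 1e6)
{
    std::string out;
    out.reserve(samples.size() * 12);   // rough guess to limit reallocations
    char buf[24];                       // enough for any 64-bit integer plus sign
    for (double s : samples) {
        const long long scaled = std::llround(s * scale);
        // to_chars writes the digits straight into buf, no format string parsing
        const std::to_chars_result res = std::to_chars(buf, buf + sizeof buf, scaled);
        assert(res.ec == std::errc());  // cannot fail with a 24-byte buffer
        out.append(buf, static_cast<std::size_t>(res.ptr - buf));
        out.push_back(',');
    }
    return out;
}
```

The resulting string can be handed to the file in a single write call, in the same way the QByteArray is written above.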

wysota
4th December 2008, 17:12
Don't expect miracles... working on real values is slow, much slower than working on integers. If you want doubles, you have to cope with bigger delays. You can use an external thread to help you do your work while the other thread is waiting for the data to arrive from the device.

philwinder
4th December 2008, 17:55
Of course, I just wondered what was the quickest way to do it. I may even have a look at implementing my own int to string function to see if I can shave any more clock cycles off. No doubt there will be lots of redundancy in the sprintf routines...

Using another thread was on my to-do list, but I shied away from it because it seemed to be teeming with potential problems, like what happens when the file save thread starts to lag behind the data capture thread. And it looked complicated.

Hmmm. I may take another look since I have never used threads before and this is as good an excuse as any to have a go.

Thanks,
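
Editor's note: one common way to handle the "writer lags behind" worry is a bounded queue between the two threads. The sketch below uses standard C++11 threads rather than QThread, and every name in it (BlockQueue, writeBlocks, demoBytesWritten) is hypothetical. It makes the producer wait when the queue is full; a real acquisition loop might prefer to count or drop overruns instead, since the card will not wait.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <ostream>
#include <sstream>
#include <thread>
#include <utility>
#include <vector>

// Bounded queue of sample blocks shared by the acquisition thread (producer)
// and the file-writing thread (consumer).
class BlockQueue
{
public:
    explicit BlockQueue(std::size_t maxBlocks) : maxBlocks_(maxBlocks) {}

    // Producer side. Waits while the queue is full, so a lagging writer
    // slows the producer down instead of exhausting memory.
    void push(std::vector<double> block)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return blocks_.size() < maxBlocks_; });
        blocks_.push_back(std::move(block));
        notEmpty_.notify_one();
    }

    // Producer side. Signals that no more blocks will arrive.
    void close()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
        notEmpty_.notify_all();
    }

    // Consumer side. Returns false once the queue is closed and drained.
    bool pop(std::vector<double>& block)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return closed_ || !blocks_.empty(); });
        if (blocks_.empty())
            return false;
        block = std::move(blocks_.front());
        blocks_.pop_front();
        notFull_.notify_one();
        return true;
    }

private:
    std::size_t maxBlocks_;
    std::deque<std::vector<double>> blocks_;
    std::mutex mutex_;
    std::condition_variable notFull_;
    std::condition_variable notEmpty_;
    bool closed_ = false;
};

// Writer loop: drains the queue and writes each block as raw bytes.
std::size_t writeBlocks(BlockQueue& queue, std::ostream& out)
{
    std::size_t samplesWritten = 0;
    std::vector<double> block;
    while (queue.pop(block)) {
        out.write(reinterpret_cast<const char*>(block.data()),
                  static_cast<std::streamsize>(block.size() * sizeof(double)));
        samplesWritten += block.size();
    }
    return samplesWritten;
}

// Self-check: five blocks of 1000 samples pass through a queue that holds
// at most two blocks at a time. Returns the number of bytes written.
std::size_t demoBytesWritten()
{
    BlockQueue queue(2);
    std::ostringstream sink;            // stands in for the output file
    std::size_t samples = 0;
    std::thread writer([&] { samples = writeBlocks(queue, sink); });
    for (int i = 0; i < 5; ++i)
        queue.push(std::vector<double>(1000, 0.5));
    queue.close();
    writer.join();
    assert(samples == 5000);
    return sink.str().size();
}
```

The queue depth (two blocks here) is the knob that decides how much lag is tolerated before the producer is held up.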

wysota
4th December 2008, 21:56
The quickest way to do it is not to convert floats to text :)

pgorszkowski
5th December 2008, 00:18
I am not sure if it is what you want, but please take a look at my example code and decide what you should use to speed your writing up:



#include <QTime>
#include <QFile>
#include <QByteArray>
#include <QDebug>

#include <cstdio>
#include <iostream>
#include <fstream>
#include <vector>

int main()
{
    const int size = 200000;

    // 1. Scaled ints written as text through std::ofstream.
    {
        std::ofstream out("test.txt");
        QTime t;
        t.start();
        double f = 0.54125432;
        int num;
        for( int x = 0; x < size; x++ )
        {
            num = (int)(f*1000000);
            out << num << ",";
        }

        out.close();
        qDebug("Time elapsed: %d ms", t.elapsed());
    }

    // 2. Scaled ints formatted with snprintf, collected in a QByteArray
    //    and written with a single QFile::write.
    {
        QFile out("test.txt");
        if( !out.open(QIODevice::WriteOnly) )   // write() fails on a file that is not open
            return 1;
        QTime t;
        t.start();
        double f = 0.54125432;
        int num;
        QByteArray temp1;
        for( int x = 0; x < size; x++ )
        {
            num = (int)(f*1000000);
            char buf[16];                        // room for any 32-bit int plus NUL
            ::snprintf(buf, sizeof(buf), "%d", num);
            temp1.append( buf );
            temp1.append(",");
        }
        out.write(temp1);
        out.flush();
        out.close();
        qDebug("Time elapsed: %d ms", t.elapsed());
    }

    // 3. Raw doubles written in binary, then read back.
    {
        std::ofstream out;
        out.open("test.bin", std::ios_base::out | std::ios_base::binary);
        QTime t;
        t.start();
        double f = 0.54125432;
        // Heap buffer: 1.6 MB of doubles can overflow a default-sized stack.
        std::vector<double> af(size, f);
        out.write((const char*)af.data(), size*sizeof(double));
        out.close();
        qDebug("Time elapsed: %d ms", t.elapsed());

        af.assign(size, 0.0);   // clear the whole buffer so the read-back is meaningful
        std::ifstream in;
        in.open("test.bin", std::ios_base::in | std::ios_base::binary);
        in.read((char*)af.data(), size*sizeof(double));
        in.close();
        for( int x = 0; x < 10; x++ )
        {
            qDebug() << x << ": " << af[x];
        }
    }

    return 0;
}



The last part uses a double array as a buffer. There is also example code showing how to read your data back from the file.

drhex
5th December 2008, 11:01
The quickest way to do it is not to convert floats to text :)

Do you have any control over the application that is supposed to read the data you write, i.e. could you write the raw bits that make up the double values without converting them to text and have the reader accept them in that format?

philwinder
5th December 2008, 11:43
Do you have any control over the application that is supposed to read the data you write
Yes, I have full control, since I am the only user. At the moment I usually import the data into Scilab or MATLAB to do some math, so I can essentially accept anything that isn't encoded in some weird encrypted format.

What were you thinking? Using one of the toHex() routines and writing that? Since I don't think sprintf has a binary option?

philwinder
5th December 2008, 11:46
I am not sure if it is what you want, but please take a look at my example code and decide what you should use to speed your writing up

Thanks pgorszkowski, I will try it later to see if any of those options are quicker. I particularly like the look of the std::ofstream functions.

drhex
5th December 2008, 11:58
I was thinking that you could cast a pointer to the first element of your double array to char * and feed it, along with the size of the array in bytes, to QIODevice::write.

philwinder
5th December 2008, 14:29
drhex, pgorszkowski:
I have now tried your suggestions (you were suggesting the same thing) and they work fantastically. Writing the data to a binary format is many times faster than before.

The allocation of the array takes about 140ms, which I don't think will get any faster now (I also tried QVector to see how it compares: it takes 310 ms). However, the more interesting bit is that QFile::write takes 1 second, whereas std::ofstream takes 3 seconds. I thought the std function would have been faster, but there you go. Maybe we really are at the point where hard drive speed is the limiting factor.

By the way, allocating an array then writing the whole array is faster than writing each individual element (1.2 seconds vs 3 seconds) which I guess one would instinctively assume.
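
Editor's note: the two write patterns compared above can be sketched as follows. This is an illustration in standard C++ rather than the poster's QFile code, and the function names are invented. Both routes produce identical bytes; only the number of write calls differs.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// One write() call per sample: simple, but the per-call overhead is paid
// once for every element.
void writeElementwise(std::ostream& out, const std::vector<double>& samples)
{
    for (const double& s : samples)
        out.write(reinterpret_cast<const char*>(&s), sizeof s);
}

// One write() call for the whole buffer.
void writeBlock(std::ostream& out, const std::vector<double>& samples)
{
    out.write(reinterpret_cast<const char*>(samples.data()),
              static_cast<std::streamsize>(samples.size() * sizeof(double)));
}

// Checks that both routes yield exactly the same byte sequence.
bool sameBytes(const std::vector<double>& samples)
{
    std::ostringstream a, b;   // in-memory stand-ins for two output files
    writeElementwise(a, samples);
    writeBlock(b, samples);
    assert(a.good() && b.good());
    return a.str() == b.str();
}
```

Since the output is the same either way, the choice between them is purely a question of speed, which has to be measured on the target machine.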

Many thanks to drhex and pgorszkowski and as always, wysota. Cheers.

P.S.
For testing and future viewers, here is some cleaned up code with the fastest combination I found:

QTime timer;

QMessageBox::information(0,"","Ready");
QFile file;
file.setFileName("C:/test.txt");   // holds raw binary doubles despite the .txt name
file.open(QFile::WriteOnly);

double f = 0.541263266;

int size = 200000*30;              // 30 seconds at 200kS/s
double * af = new double[size];

// Time filling the buffer.
timer.start();
for( int x = 0; x < size; x++ )
    af[x] = f;

int time = timer.elapsed();
QMessageBox::information(0,"","Elapsed Time: " + QString::number(time) + "ms" );

// Time writing the whole buffer with a single call.
timer.restart();
file.write((const char*)af, size*sizeof(double));
file.flush();
file.close();
time = timer.elapsed();
QMessageBox::information(0,"","Elapsed Time: " + QString::number(time) + "ms" );
delete [] af;

d_stranz
10th December 2008, 05:04
With these performance statistics on 30 seconds' worth of data (1.2 s), it looks like you might be able to write your data in real time, as opposed to storing 30 seconds' worth and then writing. It would also give you some protection against data loss if something fails: can you afford to lose 30 seconds' worth of data? Writing in real time, you lose at most a second or so.

Not that you should think in units of seconds in the first place. If you are sampling a continuous data stream at 200kS/s, then writing at the smallest time interval that still keeps up in real time without data loss would be better than buffering for some human-defined period like 1, 5, or 30 seconds before writing.

This is where you might make use of a memory-mapped file. If you know how long you will be acquiring data in total (say 10 minutes), then you allocate the memory-mapped file to a size of 200000 * 600 * sizeof ( double ) and simply acquire to that "array" as if it were the double * array you allocate in your example code. The file system takes care of flushing it to disk at appropriate times (or you can do it manually every second or whatever if you don't trust the file system to do it for you).

With memory-mapping, you can completely ignore all the details of file I/O except those involved in originally opening and then closing the memory-mapped file. Everything else just looks like filling up an array.
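
Editor's note: the memory-mapped approach described above might look like the sketch below. It assumes a POSIX system (open, ftruncate, mmap); Windows has a different API, and Qt's QFile::map() offers a similar facility. The class MappedSamples and the function demoMappedWrite are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// RAII wrapper around a file mapped into memory as an array of doubles.
class MappedSamples
{
public:
    MappedSamples(const char* path, std::size_t count)
        : bytes_(count * sizeof(double))
    {
        fd_ = ::open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
        if (fd_ < 0)
            return;
        // Grow the file to its final size before mapping it.
        if (::ftruncate(fd_, static_cast<off_t>(bytes_)) != 0)
            return;
        void* p = ::mmap(nullptr, bytes_, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd_, 0);
        if (p != MAP_FAILED)
            data_ = static_cast<double*>(p);
    }

    ~MappedSamples()
    {
        if (data_) {
            ::msync(data_, bytes_, MS_SYNC);   // flush dirty pages to disk
            ::munmap(data_, bytes_);
        }
        if (fd_ >= 0)
            ::close(fd_);
    }

    MappedSamples(const MappedSamples&) = delete;
    MappedSamples& operator=(const MappedSamples&) = delete;

    double* data() { return data_; }   // nullptr if the mapping failed

private:
    int fd_ = -1;
    std::size_t bytes_ = 0;
    double* data_ = nullptr;
};

// Fills the mapped file like an ordinary array, then reads the last value
// back through regular file I/O to confirm it reached the file.
bool demoMappedWrite(const char* path, std::size_t count)
{
    assert(count > 0);
    {
        MappedSamples samples(path, count);
        if (!samples.data())
            return false;
        for (std::size_t i = 0; i < count; ++i)
            samples.data()[i] = 0.5 * static_cast<double>(i);
    }   // destructor syncs and unmaps here

    std::ifstream in(path, std::ios::binary);
    double last = 0.0;
    in.seekg(static_cast<std::streamoff>((count - 1) * sizeof(double)));
    in.read(reinterpret_cast<char*>(&last), sizeof last);
    return in.good() && last == 0.5 * static_cast<double>(count - 1);
}
```

For a 10-minute acquisition the count would be 200000 * 600, and the acquisition code would write into data() as if it were the array allocated with new.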

David

drhex
10th December 2008, 18:46
I notice that you tried earlier to multiply your acquired values by 1000000 to get an integer. How much precision does your data have? Using float rather than integer might be sufficient, letting you store twice as many values or write them faster.
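
Editor's note: a small sketch of the trade-off, assuming the comparison intended is between 4-byte float and the 8-byte double used elsewhere in the thread. The function names are invented. Narrowing halves the bytes to be written, at the cost of keeping only about seven significant decimal digits.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Narrows double samples to float, halving the bytes to be written.
std::vector<float> toFloat(const std::vector<double>& samples)
{
    return std::vector<float>(samples.begin(), samples.end());
}

// Largest absolute error introduced by the narrowing.
double maxNarrowingError(const std::vector<double>& samples)
{
    const std::vector<float> narrowed = toFloat(samples);
    assert(narrowed.size() == samples.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const double err = std::fabs(samples[i] - static_cast<double>(narrowed[i]));
        worst = std::max(worst, err);
    }
    return worst;
}
```

Whether that error matters depends on the resolution of the acquisition card, which the thread does not state.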

philwinder
16th March 2009, 10:19
Hi guys,
I have been using the previous routine to save data for quite a few months now, and everything seems to be fine in real time.

@d_stranz: OK, but the reason I had to move to a file anyway was because my system could only handle a minute's worth of data in memory. Since there are 4 channels sampling at 200kS/s, that is 800kS/s * 32 bits for float ~= 25.6 Mbit/s (3.2 MB/s). One minute is about 1.5 Gbit, or roughly 192 MB. A lot of data.
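
Editor's note: the data-rate arithmetic, worked in bytes. The constant and function names are made up for the sketch; the channel count and sample rate are the ones quoted above.

```cpp
#include <cstdint>

constexpr std::uint64_t kChannels = 4;
constexpr std::uint64_t kSamplesPerSecond = 200000;   // per channel

// Bytes produced per second for a given sample size.
constexpr std::uint64_t bytesPerSecond(std::uint64_t bytesPerSample)
{
    return kChannels * kSamplesPerSecond * bytesPerSample;
}

// Bytes produced per minute for a given sample size.
constexpr std::uint64_t bytesPerMinute(std::uint64_t bytesPerSample)
{
    return 60 * bytesPerSecond(bytesPerSample);
}

static_assert(bytesPerSecond(4) == 3200000, "float: 3.2 MB/s");
static_assert(bytesPerMinute(4) == 192000000, "float: 192 MB per minute");
static_assert(bytesPerMinute(8) == 384000000, "double: 384 MB per minute");
```

With 8-byte doubles the per-minute figure doubles to 384 MB.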

Also, I cannot find a Qt class that deals with memory-mapped files, so would I have to write my own?

@drhex: That was just because I thought that saving an int would be smaller/quicker. I now know that was wrong. The final piece of code is what I have been using. (Or something similar.)