How to update curves with data from files?

lwz

1st December 2014, 03:12

We are working on a project related to the oil industry. Here is the problem: we have to draw 100 curves in the plot, and each curve has about 10,000 sample points. In our program we read data from file and update the curves each time we scroll the plots. Normally it costs 1 - 10 ms to read the data from file and put it into a QPolygonF for one curve, so it takes about 1 s to update all curves. That is too slow.
Do you have any suggestions?

Uwe

1st December 2014, 07:20

In our program we read data from file and update the curves each time we scroll the plots. Normally it costs 1 - 10 ms to read the data from file and put it into a QPolygonF for one curve, so it takes about 1 s to update all curves. That is too slow.
Do you have any suggestions?
Try to use something faster, e.g. mmap, or better, read all data into memory only once.

100 * 10000 * 16 bytes doesn't sound like a problem for contemporary computer systems - and for your type of application (I have written logging software for oil drilling myself) you usually don't need 2 doubles for each sample.
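
A quick check of the arithmetic above, as a minimal sketch. The helper name footprintBytes is made up for illustration, and the second figure assumes a single float value per sample (with x shared or computed), which is only one possible layout and not a description of the original poster's files.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to hold every sample of every curve in memory at once.
// (Hypothetical helper, not part of Qt or Qwt.)
constexpr std::size_t footprintBytes(std::size_t curves,
                                     std::size_t samplesPerCurve,
                                     std::size_t bytesPerSample)
{
    return curves * samplesPerCurve * bytesPerSample;
}

// 100 curves x 10,000 samples, each sample stored as two doubles (x, y).
constexpr std::size_t asPointPairs =
    footprintBytes(100, 10000, 2 * sizeof(double));

// Same curves, but only one float value stored per sample.
constexpr std::size_t asFloatValues =
    footprintBytes(100, 10000, sizeof(float));

// Assumes the usual 8-byte double and 4-byte float.
static_assert(asPointPairs == 16000000, "about 16 MB");
static_assert(asFloatValues == 4000000, "about 4 MB");
```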

Uwe

lwz

1st December 2014, 09:07

100 * 10000 * 16 bytes doesn't sound like a problem for contemporary computer systems - and for your type of application (I have written logging software for oil drilling myself) you usually don't need 2 doubles for each sample.

We do use mmap in our program. And reading everything is not an option; too much data would lead to memory problems.

Reading the data takes 2 ms, processing 2 ms: 100 * (2 + 2) = 400 ms at least. Every update of the plot view takes this long. It's too slow.

What do you mean by "don't need 2 doubles for each sample"? I don't get it. We set a curve's samples with QPointF or QPoint. In oil drilling there is no way to use QPoint, which would cause precision problems.

Uwe

1st December 2014, 11:26

And reading everything is not an option; too much data would lead to memory problems.
Are you sure? How much RAM do you have, and why do you believe that 100 * 10000 * 16 bytes will cause problems?

What do you mean by "don't need 2 doubles for each sample"?

In the case of oil drilling, samples are usually something like value vs. time or depth.

E.g. timestamps are mostly ms elapsed since some start point, where int/short/float would be enough. Often the timestamps are the same for all curves - no need to have duplicates for each curve.
When these timestamps are equidistant (e.g. a value every 100 ms), they can be calculated on the fly - no need to waste any bytes on them.
For depth or the values themselves you can often use floats - when drawing the curve, points are rounded to ints anyway, so having doubles usually doesn't matter.
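
As a rough illustration of the shared-timestamp point, here is a Qt-free sketch in which several curves reference one timestamp array and store only their own float values. The names Point, TimeAxis and SharedAxisCurve are hypothetical and exist only for this example; a 32-bit millisecond timestamp is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for QPointF so the sketch compiles without Qt.
struct Point { double x; double y; };

// Timestamps in ms since some start point, stored once for all curves.
using TimeAxis = std::vector<std::int32_t>;

// A curve that owns only its values and shares the time axis.
class SharedAxisCurve
{
public:
    SharedAxisCurve(std::shared_ptr<const TimeAxis> axis,
                    std::vector<float> values)
        : m_axis(std::move(axis)), m_values(std::move(values))
    {
        // Every value needs exactly one timestamp.
        assert(m_axis && m_axis->size() == m_values.size());
    }

    std::size_t size() const { return m_values.size(); }

    // Builds the (x, y) pair on demand; nothing is stored as a pair.
    Point sample(std::size_t i) const
    {
        return Point{ static_cast<double>((*m_axis)[i]),
                      static_cast<double>(m_values[i]) };
    }

private:
    std::shared_ptr<const TimeAxis> m_axis; // shared, not duplicated
    std::vector<float> m_values;            // 4 bytes per sample
};
```

With this layout each additional curve costs 4 bytes per sample instead of 16, at the price of requiring all curves on one axis to have the same sample count.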

Always keep in mind that you are completely free in how you structure your data. When you decide to use your own structure, all that needs to be done is to implement a small bridge that binds it to the curve.

Uwe

lwz

2nd December 2014, 01:49

Are you sure? How much RAM do you have, and why do you believe that 100 * 10000 * 16 bytes will cause problems? Uwe
That is only the minimum needed to paint one screen height of curves; one curve's data would be 1,000,000. Our software's memory consumption is limited to 300 MB. Together with the other memory costs, reading everything can't work.

In the case of oil drilling, samples are usually something like value vs. time or depth.

E.g. timestamps are mostly ms elapsed since some start point, where int/short/float would be enough. Often the timestamps are the same for all curves - no need to have duplicates for each curve.
When these timestamps are equidistant (e.g. a value every 100 ms), they can be calculated on the fly - no need to waste any bytes on them. Uwe

This idea is new to us; we always keep depth and value. If you don't keep them in pairs, how do you handle abnormal values, e.g. a depth of -999 which should be 199.25, or a value of -112312131, something like that?
And these abnormal values are random and differ between curves from the same file. What I mean is: if we keep one list of "depth" for the curves from the same file, some curves' values are still invalid at some points. In that case we remove the point. That is why we keep them in pairs.

For depth or the values themselves you can often use floats - when drawing the curve, points are rounded to ints anyway, so having doubles usually doesn't matter. Uwe
We have used float, and we also noticed that the curve points are rounded to ints. But we can't rely on the rounded points: we have to show the curve's original data when the mouse hovers over a curve.
Do you have the same feature? To do this, you have to keep the original data eventually.

Always keep in mind that you are completely free in how you structure your data. When you decide to use your own structure, all that needs to be done is to implement a small bridge that binds it to the curve.
Uwe
We never thought about changing the data structure. What is the data structure in your software? Would you mind giving a few more tips, or something more about this topic?

Uwe

2nd December 2014, 06:51

This is the least data enough to paint screen height curves, one curve's data would be 1000,000.
Earlier you wrote about 10000 samples per curve, which is only 80K per curve (when using floats)?

Our software memory consumption is limited to 300 MB.
Where does this limitation come from: are you on an embedded device, or is it just that you don't want to?
If it is not a physical limit, you can and have to decide: performance vs. memory.

That is why we ...
I can't tell you which piece of information is important and how to reduce the amount of data in your case - but the idea should be clear.

We never thought about changing the data structure. What is the data structure in your software? Would you mind giving a few more tips?
The message is: design your data structure according to the characteristics of your data.

E.g. if your curve is value vs. time and the timestamps are every 100 ms:

class Data: public QwtSeriesData<QPointF>
{
public:
    ...
    // Number of samples in the curve.
    virtual size_t size() const
    {
        return m_values.size();
    }

    // x (time in ms) is computed from the index instead of being stored.
    virtual QPointF sample( size_t index ) const
    {
        return QPointF( index * 100, m_values[index] );
    }

private:
    QVector<float> m_values; // y values only
};

lwz

3rd December 2014, 02:56

Where does this limitation come from: are you on an embedded device, or is it just that you don't want to?
If it is not a physical limit, you can and have to decide: performance vs. memory.
Our client ..

The message is: design your data structure according to the characteristics of your data.

E.g. if your curve is value vs. time and the timestamps are every 100 ms:

class Data: public QwtSeriesData<QPointF>
{
public:
    ...
    virtual size_t size() const
    {
        return m_values.size();
    }

    virtual QPointF sample( size_t index ) const
    {
        return QPointF( index * 100, m_values[index] );
    }

private:
    QVector<float> m_values;
};

I get it, but our mmap code looks like this:

uchar * ptr;
float *data;
ptr = file->map(start + offset, end);
data = reinterpret_cast<float *>(ptr);

How can we turn a float * into a QVector<float> quickly? I don't think copying with a 'while' or 'for' loop is efficient enough.

Uwe

3rd December 2014, 06:44

Our client ..
Then explain to your client that he has to decide: performance vs. memory.

How to turn float * into QVector<float> quickly?
My code snippet was about how to calculate coordinates on the fly - it was not intended as a recommendation to use QVector.
When your values are an array of floats, implement your QwtSeriesData bridge to return them from there - no need to copy them into something else.

QwtSeriesData<QPointF> is an abstract API for how to iterate over points - it is not about storage. E.g. QwtSyntheticPointData calculates all x and y coordinates only temporarily and therefore doesn't need any memory at all.
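
To make the "no need to copy" point concrete, here is a Qt-free sketch of a non-owning view over a float array with x computed from the index. Point and FloatArraySeries are hypothetical names used only here; in a real Qwt bridge the same two members would override size() and sample() of QwtSeriesData<QPointF>, which as far as I know also requires a boundingRect() implementation.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for QPointF so the sketch compiles without Qt.
struct Point { double x; double y; };

// Non-owning view over a contiguous float array, for example a region
// of a memory-mapped file. It never copies or frees the data.
class FloatArraySeries
{
public:
    // x0: x of the first sample, dx: constant spacing between samples
    // (both are assumptions about how the data is laid out).
    FloatArraySeries(const float* values, std::size_t count,
                     double x0, double dx)
        : m_values(values), m_count(count), m_x0(x0), m_dx(dx)
    {
        assert(values != nullptr || count == 0);
    }

    std::size_t size() const { return m_count; }

    // x is calculated on the fly, y is read straight from the array.
    Point sample(std::size_t i) const
    {
        assert(i < m_count);
        return Point{ m_x0 + m_dx * static_cast<double>(i),
                      static_cast<double>(m_values[i]) };
    }

private:
    const float* m_values; // borrowed; must outlive this object
    std::size_t m_count;
    double m_x0;
    double m_dx;
};
```

Because the view only borrows the pointer, the mapping has to stay valid for as long as the curve may be painted; unmapping the file right after creating the view would leave it dangling.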

Uwe

lwz

3rd December 2014, 08:17

Hi Uwe, the code below has a memory leak, but we don't know why. Could you have a look at it?

It seems that opening the file causes the memory leak. Why is that? What am I missing?
Please help, thanks!

{
    float* data;

    ....
    file = new QFile(fileName);
    file->open(QIODevice::ReadOnly);
    ptr = file->map(start + offset, end);
    data = reinterpret_cast<float *>(ptr);
    file->unmap(ptr);
    file->close();

}

Oh, my fault, it's another object that causes the memory leak.