
Displaying huge data set efficiently



slanina
8th May 2013, 08:58
Hello,

I am looking at Qwt to display a large data set of curves representing sensor data collected over a long time period.
The curves should be stacked one above the other (Y axis), and the X axis represents time. The person looking at the graph can scroll along the X axis in time and zoom into the data set, where zooming is done only on the X axis (all sensors on the Y axis should remain visible).

Upon loading the data set, the initial graph can represent the whole data set, but only with representative points. For example, if there are 1000 points in the data set and the graph is 100 pixels wide, we can divide 1000 by 100 and take the maximum of each 10 data points as the representative value for that pixel. Alternatively, if time on the X axis goes up to, for example, 10000, the initial graph can show only 0-1000 and the user can then scroll along the X axis. Whichever gives better performance (see below). Zooming into more detail is a requirement in either case.
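A minimal sketch of the per-pixel reduction described above, in plain C++ without Qwt. The name peakPerPixel is hypothetical, and the bucket size is rounded up so that the result never has more entries than there are pixels; that rounding is an assumption, not something stated in the post.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reduce `samples` to at most `pixels` values by keeping the maximum of each
// bucket of consecutive samples (hypothetical helper, not a Qwt function).
std::vector<double> peakPerPixel(const std::vector<double> &samples,
                                 std::size_t pixels)
{
    assert(pixels > 0);
    std::vector<double> out;
    if (samples.empty())
        return out;

    // Bucket size rounded up, so the result never exceeds `pixels` entries.
    const std::size_t bucket = (samples.size() + pixels - 1) / pixels;
    out.reserve((samples.size() + bucket - 1) / bucket);

    for (std::size_t begin = 0; begin < samples.size(); begin += bucket) {
        const std::size_t end = std::min(begin + bucket, samples.size());
        out.push_back(*std::max_element(samples.begin() + begin,
                                        samples.begin() + end));
    }
    return out;
}
```

With 1000 samples and 100 pixels this yields 100 values, each the maximum of 10 consecutive samples.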

When the user scrolls up/down (along the Y axis), the sensors and their graphs should wrap around. So if the initial order was 1,2,3,4, then after one scroll step it would be 4,1,2,3, then 3,4,1,2, and so on. Can this be achieved without expensive replots?
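The wrap-around ordering itself is cheap bookkeeping, as this small sketch tries to show. It only computes which sensor lands in which display slot; wrappedOrder and slotOffset are hypothetical names, and whether applying the resulting offsets avoids an expensive replot in Qwt is not answered by it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the sensor ids in display order after scrolling by `steps` slots.
// The sample data of each sensor is untouched; only the slot changes.
std::vector<int> wrappedOrder(const std::vector<int> &sensors,
                              std::size_t steps)
{
    assert(!sensors.empty());
    const std::size_t n = sensors.size();
    std::vector<int> order(n);
    for (std::size_t slot = 0; slot < n; ++slot)
        order[(slot + steps) % n] = sensors[slot]; // wraps past the last slot
    return order;
}

// Vertical offset of a display slot when every sensor gets a band of
// `bandHeight` units on the Y axis (hypothetical layout).
double slotOffset(std::size_t slot, double bandHeight)
{
    return static_cast<double>(slot) * bandHeight;
}
```

Starting from 1,2,3,4, one step gives 4,1,2,3 and two steps give 3,4,1,2, matching the order described above.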

Lastly, the user should be able to draw an object around certain data points (a bounding rectangle that can span multiple curves). This rectangle should be visible at all zoom levels (expanding/contracting as needed), and its position should be restored when the data set is loaded again. I am not sure how to do this, or how to tie such an object to specific data points spanning multiple curves on the same plot.
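One possible approach, sketched here in plain C++ under stated assumptions: store the rectangle in data coordinates (a time range plus a range of sensor indices) rather than in pixels, so that it follows any zoom level and can be saved with the data set. Region, toPixelX, save and load are hypothetical names, and the text format is made up for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// A marked region in data coordinates, independent of the zoom level.
struct Region {
    double tStart, tEnd;          // X range in data (time) units
    int firstSensor, lastSensor;  // inclusive range of curves it spans
};

// Linear map from data X to pixel X for the visible range [visMin, visMax]
// drawn across widthPx pixels.
double toPixelX(double t, double visMin, double visMax, double widthPx)
{
    assert(visMax > visMin);
    return (t - visMin) / (visMax - visMin) * widthPx;
}

// Serialise to a single line of text (made-up format).
std::string save(const Region &r)
{
    std::ostringstream os;
    os.precision(17); // enough digits to round-trip a double
    os << r.tStart << ' ' << r.tEnd << ' '
       << r.firstSensor << ' ' << r.lastSensor;
    return os.str();
}

Region load(const std::string &text)
{
    Region r{};
    std::istringstream is(text);
    is >> r.tStart >> r.tEnd >> r.firstSensor >> r.lastSensor;
    return r;
}
```

Because the region is expressed in time units, zooming only changes the visible range passed to toPixelX, and the rectangle expands or contracts with it.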

I have read the thread from 2010:
http://www.qtcentre.org/threads/28435-Implimentation-advice-for-a-large-data-plotting-application

and I have implemented a sample application based on the advice there, but when I run it with 8 graphs of 1 million points each, it takes a long time for them to be displayed. Zooming on the X axis only also takes a long time to display.

In reality I expect to have GBs of data to display, either chopped into smaller files (3-4 GB chunks) or loaded in advance in the background when the user approaches the boundary on the X axis where the currently loaded data ends; but the swap should be quick.

Do you have any advice on how this could be achieved and made to work quickly?


Sample code based on the simpleplot example is below. With 100000 data points per curve it is unacceptable. With 10000 it is OK. But when I put in 1 million or more... :(



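#include <QApplication>
#include <QDateTime>
#include <QDebug>
#include <qwt_plot.h>
#include <qwt_plot_curve.h>
#include <qwt_plot_grid.h>
#include <qwt_legend.h>
#include <qwt_symbol.h>
#include <qwt_weeding_curve_fitter.h>
#include "myzoomer.h"

int main( int argc, char **argv )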
{
QApplication a( argc, argv );

QwtPlot plot;
plot.setTitle( "Sensor Demo" );
plot.setCanvasBackground( Qt::white );
plot.setAxisScale( QwtPlot::yLeft, 0.0, 80.0 );
plot.insertLegend( new QwtLegend() );

QwtPlotGrid *grid = new QwtPlotGrid();
grid->attach( &plot );

QwtPlotCurve *curve;


qsrand(QDateTime::currentDateTime().currentMSecsSinceEpoch());

plot.resize( 600, 400 );

for(int j=0;j<8;++j)
{

qDebug() << " Setting curve: " << j;

curve = new QwtPlotCurve();
curve->setTitle( QString("Sensor %1").arg(j+1) );
curve->setPen( Qt::blue, 1 );
curve->setRenderHint( QwtPlotItem::RenderAntialiased, true );

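// Symbols are disabled. The QwtSymbol is only allocated when it is handed to
// setSymbol(), which takes ownership; otherwise it would leak.
//QwtSymbol *symbol = new QwtSymbol( QwtSymbol::Ellipse,
//QBrush( Qt::yellow ), QPen( Qt::red, 2 ), QSize( 8, 8 ) );
//curve->setSymbol( symbol );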

QPolygonF points;

for(int i=0;i<100000;++i)
{
QPointF p(i, 10*j+qrand()%10);
points.append(p);
}

curve->setSamples( points );

qDebug() << " Added sample points";

QwtWeedingCurveFitter *fitter = new QwtWeedingCurveFitter();
fitter->setTolerance(10);
curve->setCurveFitter(fitter);
curve->setCurveAttribute(QwtPlotCurve::Fitted, true);
curve->setPaintAttribute(QwtPlotCurve::ClipPolygons, true);

qDebug() << "Added curve fitter";

curve->attach( &plot );

qDebug() << "Attached curve.";

}

MyZoomer *zoom = new MyZoomer(plot.canvas());
zoom->setRubberBandPen(QPen(Qt::black, 2, Qt::DotLine));
zoom->setTrackerPen(QPen(Qt::black));

plot.show();

zoom->setZoomBase();

return a.exec();
}




#ifndef MYZOOMER_H
#define MYZOOMER_H

#include <qwt_plot_zoomer.h>

class MyZoomer : public QwtPlotZoomer
{
Q_OBJECT
public:
explicit MyZoomer(QWidget *canvas = 0);

void zoom( const QRectF & rect );


};



#endif // MYZOOMER_H




#include "myzoomer.h"

MyZoomer::MyZoomer(QWidget *canvas) :
QwtPlotZoomer(canvas)
{
}


void MyZoomer::zoom( const QRectF & rect )
{
QRectF newRect;
const QRectF & baseRect = zoomBase();
newRect.setCoords( rect.left(), baseRect.top(), rect.right(), baseRect.bottom());
QwtPlotZoomer::zoom( newRect );
}

Uwe
8th May 2013, 09:27
Using setCurveFitter as in the code above is a horrible performance bottleneck: a heavy algorithm runs in every repaint operation. Performance should be much better without it.

Using the weeding algorithm is a good idea in general, but it has to be done outside of the render code ( not using it as curve fitter ). The basic idea is to calculate several datasets with different tolerances in advance and activate one of them depending on the range of the scales.
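As a rough illustration of "several datasets with different tolerances in advance", here is a plain C++ sketch of Douglas-Peucker simplification, which is the algorithm QwtWeedingCurveFitter is documented to be based on. It is not Qwt's implementation; Pt, weed and buildLevels are hypothetical names, and in a real Qwt application one might instead call the fitter once per tolerance outside the paint path.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Pt { double x, y; };

// Perpendicular distance of p from the line through a and b.
static double distToLine(const Pt &p, const Pt &a, const Pt &b)
{
    const double dx = b.x - a.x, dy = b.y - a.y;
    const double len = std::hypot(dx, dy);
    if (len == 0.0)
        return std::hypot(p.x - a.x, p.y - a.y);
    return std::fabs(dy * (p.x - a.x) - dx * (p.y - a.y)) / len;
}

// Douglas-Peucker simplification, iterative to avoid deep recursion.
std::vector<Pt> weed(const std::vector<Pt> &pts, double tolerance)
{
    assert(tolerance >= 0.0);
    if (pts.size() < 3)
        return pts;

    std::vector<bool> keep(pts.size(), false);
    keep[0] = true;
    keep[pts.size() - 1] = true;

    std::vector<std::pair<std::size_t, std::size_t> > stack;
    stack.emplace_back(std::size_t(0), pts.size() - 1);
    while (!stack.empty()) {
        const std::pair<std::size_t, std::size_t> range = stack.back();
        stack.pop_back();
        double maxDist = 0.0;
        std::size_t idx = range.first;
        for (std::size_t i = range.first + 1; i < range.second; ++i) {
            const double d =
                distToLine(pts[i], pts[range.first], pts[range.second]);
            if (d > maxDist) { maxDist = d; idx = i; }
        }
        if (maxDist > tolerance) {
            keep[idx] = true; // farthest point survives, split around it
            stack.emplace_back(range.first, idx);
            stack.emplace_back(idx, range.second);
        }
    }

    std::vector<Pt> out;
    for (std::size_t i = 0; i < pts.size(); ++i)
        if (keep[i]) out.push_back(pts[i]);
    return out;
}

// One simplified copy per tolerance, computed once after loading.
std::vector<std::vector<Pt> > buildLevels(const std::vector<Pt> &pts,
                                          const std::vector<double> &tolerances)
{
    std::vector<std::vector<Pt> > levels;
    for (double t : tolerances)
        levels.push_back(weed(pts, t));
    return levels;
}
```

The plot would then switch between the precomputed levels when the scale range changes, instead of running the algorithm inside every repaint.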

Uwe

slanina
8th May 2013, 11:51
I edited this post and removed my questions since I played a bit and found something that seems on track based on Uwe's comments.

----
The code below is my rudimentary plot widget, which shows 1 million points; based on the interval of the scale I generate new samples.
The zoomer I use is the one I posted previously (X axis only).

Now speed is quite nice :D

Is this on track?




MyPlot::MyPlot(QWidget *parent) :
QwtPlot(parent)
{

setTitle( "Sensor Demo" );
setCanvasBackground( Qt::white );
setAxisScale( QwtPlot::yLeft, 0.0, 80.0 );
insertLegend( new QwtLegend() );

QwtPlotGrid *grid = new QwtPlotGrid();
grid->attach( this );

QwtPlotCurve *curve;


qsrand(QDateTime::currentDateTime().currentMSecsSinceEpoch());

resize( 600, 400 );

for(int j=0;j<8;++j)
{

qDebug() << " Setting curve: " << j;

curve = new QwtPlotCurve();
curve->setTitle( QString("Sensor %1").arg(j+1) );
curve->setPen( Qt::blue, 1 );
curve->setRenderHint( QwtPlotItem::RenderAntialiased, true );


QPolygonF points;

for(int i=0;i<10000;++i)
{
QPointF p(i*10000, 10*j+qrand()%10);
points.append(p);
}

curve->setSamples( points );

qDebug() << " Added sample points";

curve->setCurveAttribute(QwtPlotCurve::Fitted, true);
curve->setPaintAttribute(QwtPlotCurve::ClipPolygons, true);

qDebug() << "Added curve fitter";

curve->attach( this );
myCurves.append(curve);

qDebug() << "Attached curve.";

}

MyZoomer *zoom = new MyZoomer(canvas());
zoom->setRubberBandPen(QPen(Qt::black, 2, Qt::DotLine));
zoom->setTrackerPen(QPen(Qt::black));

connect(this->axisWidget(QwtPlot::xBottom), SIGNAL(scaleDivChanged()), this, SLOT(handleScaleDivChanged()));
//connect(this->axisWidget(1), SIGNAL(scaleDivChanged()), this, SLOT(handleScaleDivChanged()));


show();

zoom->setZoomBase();


}

MyPlot::~MyPlot()
{
qDeleteAll(myCurves);
}

void MyPlot::handleScaleDivChanged()
{
qDebug() << "ScaleDivChanged";
const QwtScaleDiv div = axisScaleDiv(QwtPlot::xBottom);
qDebug() << "Interval: " << div.interval().minValue() << " - " << div.interval().maxValue();

double min = div.interval().minValue();
double max = div.interval().maxValue();
double factor = 1;

if( (max-min) > 10000) factor = 100;
if( (max-min) > 100000) factor = 1000;
if( (max-min) > 1000000) factor = 10000;

qDebug() << "Factor: " << factor;

for(int k=0;k<myCurves.size();++k)
{

qDebug() << "New Points curve: " << k;
QPolygonF points;

for(int i=min;i<=max;i=i+factor)
{
QPointF p(i, 10*k+qrand()%10);
points.append(p);
}

myCurves.at(k)->setSamples( points );

}
}

Uwe
8th May 2013, 13:38
2 comments:

1. Antialiasing is expensive and IMHO doesn't make much sense with trillions of points.
2. In case of recorded samples you could use weeding to create a new subset every time factor changes - not always when the scales are changing. Note that there is also a clipping algo active, that is pretty fast - fast enough to be done for every replot.
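A small sketch of "rebuild only when the factor changes", in plain C++. SubsetCache is a hypothetical class; the thresholds are copied from the handleScaleDivChanged() code posted above, and the reduction step is a placeholder (every factor-th sample) standing in for a weeding pass.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Caches one reduced subset and rebuilds it only when the factor changes,
// not on every scale change.
class SubsetCache {
public:
    explicit SubsetCache(std::vector<double> samples)
        : m_samples(std::move(samples)) {}

    // Thresholds copied from the handleScaleDivChanged() code above.
    static int factorFor(double span)
    {
        assert(span >= 0.0);
        if (span > 1000000) return 10000;
        if (span > 100000) return 1000;
        if (span > 10000) return 100;
        return 1;
    }

    const std::vector<double> &subsetFor(double span)
    {
        const int factor = factorFor(span);
        if (factor != m_factor) {
            // Placeholder reduction: every factor-th sample. A weeding pass
            // over m_samples would go here instead.
            m_subset.clear();
            for (std::size_t i = 0; i < m_samples.size();
                 i += static_cast<std::size_t>(factor))
                m_subset.push_back(m_samples[i]);
            m_factor = factor;
            ++m_rebuilds;
        }
        return m_subset;
    }

    int rebuilds() const { return m_rebuilds; }

private:
    std::vector<double> m_samples;
    std::vector<double> m_subset;
    int m_factor = 0;   // 0 means "nothing cached yet"
    int m_rebuilds = 0;
};
```

Scrolling or small zoom changes that stay within the same factor reuse the cached subset; the clipping done on every replot takes care of the rest.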



Uwe

slanina
8th May 2013, 13:46
Quoting Uwe: "In case of recorded samples you could use weeding to create a new subset every time factor changes - not always when the scales are changing. Note that there is also a clipping algo active, that is pretty fast - fast enough to be done for every replot."

Can you elaborate on this? I can detect the scale change from the signal and swap the point series, and from the changed scales I can calculate the change in the time factor. So I don't quite understand when to use the weeding algorithm, i.e. when to call it.

slanina
9th May 2013, 07:19
The code below is what I have come up with, and it seems fine for now.
One problem I have noticed is that I hit the memory limit of 32-bit Windows when I want to display 10 million double points in each of 8 curves (80 million in total). My QList<QList<double> > crashes while being populated.
However, zooming works.

My next step is to make a plot that shows only up to 1 million points at a time, with 2 buttons (or mouse events) to scroll along the X axis time line in steps of 100,000 points. The widget will load 3 million points, and when the limit is approaching it will start a thread that loads the next points from disk (scrolling backward is supported as well). This way the graph is only a window onto the real data set. I can also make a small sub-sampled mini-map of the whole data set, always with a fixed 1 million points or fewer.
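The "window onto the real data set" idea can be reduced to a small decision function, sketched here in plain C++. Window, Prefetch and neededPrefetch are hypothetical names, and the margin is an assumed tuning parameter; the actual disk loading and threading are left out.

```cpp
#include <cassert>
#include <cstdint>

// Which part of the file is in memory and which part is displayed.
// All values are sample indices; ranges are half-open [begin, end).
struct Window {
    std::int64_t loadedBegin, loadedEnd; // held in memory
    std::int64_t viewBegin, viewEnd;     // currently displayed
};

enum class Prefetch { None, Forward, Backward };

// Ask for the next chunk once the view comes within `margin` samples of
// either end of the loaded range, unless the file ends there anyway.
Prefetch neededPrefetch(const Window &w, std::int64_t margin,
                        std::int64_t totalSamples)
{
    assert(w.loadedBegin <= w.viewBegin && w.viewEnd <= w.loadedEnd);
    if (w.loadedEnd - w.viewEnd < margin && w.loadedEnd < totalSamples)
        return Prefetch::Forward;
    if (w.viewBegin - w.loadedBegin < margin && w.loadedBegin > 0)
        return Prefetch::Backward;
    return Prefetch::None;
}
```

With 3 million points loaded and 1 million displayed, a margin of a couple of scroll steps (for example 200,000 points) would trigger the background load before the user reaches the edge of the loaded data.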



MyPlot::MyPlot(QWidget *parent) :
QwtPlot(parent)
{

setTitle( "Sensor Demo" );
setCanvasBackground( Qt::white );
setAxisScale( QwtPlot::yLeft, 0.0, 34816.0 );
insertLegend( new QwtLegend() );

QwtPlotGrid *grid = new QwtPlotGrid();
grid->attach( this );

QwtPlotCurve *curve;


qsrand(QDateTime::currentDateTime().currentMSecsSinceEpoch());

resize( 600, 400 );


for(int j=0;j<8;++j)
{
qDebug() << "Making curve " << j ;
QList<double> list;
for(int i=0;i<MAX_POINTS;++i)
{
list.append(qrand()%2048);
}
masterList.append(list);
}

qDebug() << "Lists added....";

for(int j=0;j<8;++j)
{

qDebug() << " Setting curve: " << j;

curve = new QwtPlotCurve();
curve->setTitle( QString("Sensor %1").arg(j+1) );
curve->setPen( Qt::blue, 1 );
curve->setRenderHint( QwtPlotItem::RenderAntialiased, true );


int newFact = 10000;
curve->setSamples( makeNewList(0,MAX_POINTS, j, newFact) );

qDebug() << " Added sample points";

if(newFact < 100)
{
curve->setCurveAttribute(QwtPlotCurve::Fitted, true);
}
curve->setPaintAttribute(QwtPlotCurve::ClipPolygons, true);

qDebug() << "Added curve fitter";

curve->attach( this );
myCurves.append(curve);

qDebug() << "Attached curve.";

}

MyZoomer *zoom = new MyZoomer(canvas());
zoom->setRubberBandPen(QPen(Qt::black, 2, Qt::DotLine));
zoom->setTrackerPen(QPen(Qt::black));

connect(this->axisWidget(QwtPlot::xBottom), SIGNAL(scaleDivChanged()), this, SLOT(handleScaleDivChanged()));
//connect(this->axisWidget(1), SIGNAL(scaleDivChanged()), this, SLOT(handleScaleDivChanged()));


show();

zoom->setZoomBase();

// Uncomment this to preselect from 0 to 10000;
//QwtScaleDiv sc(0, 10000);
//this->setAxisScaleDiv(QwtPlot::xBottom, sc);


}

MyPlot::~MyPlot()
{
qDeleteAll(myCurves);
}


QPolygonF MyPlot::makeNewList(double min, double max, int curve, int& newFact)
{
QPolygonF points;

double factor = 1;

// Absolute span thresholds, taken from the earlier version of this code.
// Limits derived from (max-min) itself would always select the same factor.
const double span = max - min;

if( span > 10000) factor = 100;
if( span > 100000) factor = 1000;
if( span > 1000000) factor = 10000;

//qDebug() << "Factor: " << factor;

for(int i = min; i<=max;i=i+factor)
{
//qDebug() << "point: " << i;
QPointF p(i, 2048+4096*curve+getPeak(i, i+factor-1, curve));
points.append(p);
}

newFact = factor;

return points;
}

double MyPlot::getPeak(int from, int to, int curve)
{
double res = 0;

from = from < masterList.at(curve).size() ? from : masterList.at(curve).size()-1;
to = to < masterList.at(curve).size() ? to : masterList.at(curve).size()-1;

//qDebug() << "From: " << from << " To: " << to;

res = masterList.at(curve)[from];
for(int i=from; i<= to; ++i)
{
if(masterList.at(curve)[i]> res)
{
res = masterList.at(curve)[i];
}
}
return res;
}

void MyPlot::handleScaleDivChanged()
{
//qDebug() << "ScaleDivChanged";
const QwtScaleDiv div = axisScaleDiv(QwtPlot::xBottom);
//qDebug() << "Interval: " << div.interval().minValue() << " - " << div.interval().maxValue();

double min = div.interval().minValue();
double max = div.interval().maxValue();

int newFact = 10000;

for(int k=0;k<myCurves.size();++k)
{
//qDebug() << "New Points curve: " << k;
myCurves.at(k)->setSamples( makeNewList(min,max, k, newFact) );
if(newFact < 100)
{
myCurves.at(k)->setCurveAttribute(QwtPlotCurve::Fitted, true);
}
else
{
myCurves.at(k)->setCurveAttribute(QwtPlotCurve::Fitted, false);
}

}
}

Uwe
9th May 2013, 11:55
Have a look at QwtSeriesData and try to understand its idea. Then enable the QwtPlotItem::ScaleInterest flag and implement QwtSeriesData::setRectOfInterest(). Also overload the QwtSeriesData::boundingRect operation - I'm sure in your case you will find a faster implementation without having to iterate over all points. ( maybe you want to have the bounding rectangle of all points - not only those in memory ).

Maybe it is interesting to check the sinusplot.cpp example - it is not comparable to your situation, but illustrates that you can calculate samples on the fly. E.g. in your example code it would not be necessary to waste memory for the x coordinates, as they could be calculated on the fly from the index.
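A minimal sketch of calculating x on the fly, in plain C++. StridedSeries is a hypothetical class that only mirrors the size()/sample() shape of a QwtSeriesData subclass; it assumes equally spaced samples, which is not stated in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Point { double x, y; };

// Stores only the y values; x is derived from the index as x0 + i * dx.
class StridedSeries {
public:
    StridedSeries(std::vector<double> y, double x0, double dx)
        : m_y(std::move(y)), m_x0(x0), m_dx(dx) {}

    std::size_t size() const { return m_y.size(); }

    Point sample(std::size_t i) const
    {
        assert(i < m_y.size());
        return Point{m_x0 + static_cast<double>(i) * m_dx, m_y[i]};
    }

private:
    std::vector<double> m_y;
    double m_x0; // x of the first sample
    double m_dx; // spacing between consecutive samples
};
```

A reduced series that keeps 1 point in 1000 would use dx = 1000, so the X axis still runs over the original range even though the index only goes from 0 to 999.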

I also recommend using Qwt 6.1. It reintroduces a couple of interesting performance optimizations that are available in Qwt5 but are missing in Qwt 6.0. E.g. in your case you will have many duplicates - points that are mapped to the same widget coordinate - something where QwtPlotCurve::FilterPoints helps. Also the weeding algorithm has the option to split the points in chunks, which drastically improves its performance when being used for many points ( the cost of the algorithm does not grow linearly ). You will also find qwtUpperSampleIndex, which will help to implement something like in the curvetracker example.

Uwe

slanina
9th May 2013, 12:30
I use Qwt 6.1, downloaded via git and compiled yesterday.

I was thinking of the following:

1. Load a fairly large amount of data into memory.
2. Make a high resolution data set using the getPeak() method.
3. Display the lowest resolution graph first.
4. As the user zooms in, replace the lowest resolution graph with a computed higher resolution graph, but only in the section the user zoomed into, using the scale interval (from - to on the X axis). Use factors to make the zoomed-in graph coarser as well if many points are displayed, to gain speed.
5. If my data set cannot fit into memory, load as much as possible from the file, remember the file offset, and when the user scrolls near the end of the currently displayed data set, load additional points in a thread to keep the GUI responsive and show them in place of the currently displayed ones.

I am keeping the X coordinate because I want the X axis to stay correct. That is, if from 1 million points I make a high resolution graph by taking only 1 point in 1000, I get 1 thousand points. If I make a list of 1000 points where the index goes from 0 to 999, Qwt will display the first point at 0 and the last one at 999 on the X axis. But I want the first one at 0 and the last one at 1 million.

I will check QwtSeriesData (I last used it in a program I made 1.5 years ago, a charting plot app based on cpuplot). But based on the idea explained above, would I gain anything?

Uwe
9th May 2013, 18:34
Quoting slanina: "2. Make high resolution data set as by using getPeak() method."
The other way round: make low resolution data sets using weeding with different tolerances.

Quoting slanina: "I will check QwtSeriesData (I have used it last time on 1 program I made 1.5 years ago) where I also did a charting plot app based on cpuplot. But based on above explained idea, would I gain anything?"
YourSeriesData::setRectOfInterest is the intended place to implement all your ideas from above. The "rectangle of interest" always corresponds to the currently displayed ranges of the scales.
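One way to turn the rectangle of interest into a choice of data set, sketched in plain C++ under stated assumptions. chooseLevel is a hypothetical helper; it assumes the levels are ordered from raw (stride 1) to coarsest, and that "at least one point per pixel column" is an acceptable target, which is a guess rather than anything prescribed by Qwt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// strides[k] is the number of raw samples represented by one point of
// level k, sorted ascending (level 0 is the raw data with stride 1).
// Returns the coarsest level that still yields at least one point per
// pixel column for the visible span, or 0 if none does.
std::size_t chooseLevel(const std::vector<double> &strides,
                        double visibleSamples, double widthPx)
{
    assert(!strides.empty() && widthPx > 0.0);
    std::size_t best = 0;
    for (std::size_t k = 0; k < strides.size(); ++k) {
        assert(strides[k] > 0.0);
        if (visibleSamples / strides[k] >= widthPx)
            best = k; // later (coarser) levels overwrite earlier ones
    }
    return best;
}
```

In a setRectOfInterest() implementation, visibleSamples would be derived from the width of the rectangle and the sample spacing, and the result would select which precomputed list size() and sample() read from.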

Uwe

slanina
14th May 2013, 09:03
I implemented my own series class as follows:


Plot widget:



#include "myseriesplot.h"
#include "myseries.h"
#include "myzoomer.h"
#include <qwt_plot_grid.h>
#include <qwt_legend.h>

MySeriesPlot::MySeriesPlot(QWidget *parent) :
QwtPlot(parent)
{

series = new MySeries();

setTitle( "Sensor Demo" );
setCanvasBackground( Qt::white );
setAxisScale( QwtPlot::yLeft, 0.0, series->maxYVal());
setAxisScale( QwtPlot::xBottom, 0.0, series->maxXVal());

insertLegend( new QwtLegend() );

QwtPlotGrid *grid = new QwtPlotGrid();
grid->attach( this );

curve = new QwtPlotCurve();
curve->setTitle( QString("Sensor 1"));
curve->setPen( Qt::blue, 1 );
curve->setRenderHint( QwtPlotItem::RenderAntialiased, false ); // not antialiasing - better speed.

curve->setData(series);

curve->setPaintAttribute(QwtPlotCurve::ClipPolygons, true);

curve->attach( this );


MyZoomer *zoom = new MyZoomer(canvas());
zoom->setRubberBandPen(QPen(Qt::black, 2, Qt::DotLine));
zoom->setTrackerPen(QPen(Qt::black));

resize( 600, 400 );

show();

zoom->setZoomBase();

}

MySeriesPlot::~MySeriesPlot()
{
if(curve) delete curve; // will delete series internally - takes ownership!

}


Series header file:



#ifndef MYSERIES_H
#define MYSERIES_H

#include <qwt_series_data.h>
#include <QList>


class MySeries : public QwtSeriesData<QPointF>
{
public:
MySeries();
virtual ~MySeries();

size_t size() const;
QPointF sample(size_t i) const;
QRectF boundingRect() const;
void setRectOfInterest( const QRectF &rect );
double maxYVal() const { return d_maxYVal;}
double maxXVal() const;

private:

quint16 getPeak(int from, int to);

QList<QList<QPointF>* > *data;
int currList;

double d_maxYVal;

double maxSubSamples;
};

#endif // MYSERIES_H


Series .cpp file:


#include "myseries.h"
#include <QDebug>
#include <QDateTime>
#include <QPointF>

#define MAX_POINTS 1000000

MySeries::MySeries() : currList(0), maxSubSamples(10)
{

data = new QList<QList<QPointF>*>();

qsrand(QDateTime::currentDateTime().currentMSecsSinceEpoch());

d_maxYVal = 0;

QList<QPointF> *l = new QList<QPointF>();

for(int i=0;i<MAX_POINTS;++i)
{
int v = qrand()%2048;
l->append(QPointF(i,v));
if(v > d_maxYVal) d_maxYVal = v;
}

data->append(l);


for(int level=1;level<maxSubSamples;++level)
{
QList<QPointF> *l = new QList<QPointF>();
// Each sub-sample level keeps the peak of every (2 * level) raw points.
// An integer step avoids rounding drift and no longer shadows the loop index.
const int step = 2 * level;
qDebug() << "SubSample: " << level << " Limit: "<< step;

for(int i = 0; i<data->at(0)->size()-1;i=i+step)
{
double x = data->at(0)->at(i).x();
double y = getPeak(i, i+step-1);
l->append(QPointF(x,y));
}

data->append(l);

qDebug() << "\tPoints:" << l->size();
}



}

quint16 MySeries::getPeak(int from, int to)
{
QPointF res;

from = from < data->at(0)->size()-1 ? from : data->at(0)->size()-1;
to = to < data->at(0)->size()-1 ? to : data->at(0)->size()-1;

//qDebug() << " From: " << from << " To:" << to;

res = data->at(0)->at(from);
for(int i=from; i<= to; ++i)
{
if(data->at(0)->at(i).y() > res.y())
{
res = data->at(0)->at(i);
}
}
return static_cast<quint16>(res.y());
}


MySeries::~MySeries()
{
if(data)
{
qDeleteAll(*data);
delete data;
}
}

size_t MySeries::size() const
{
if(data)
{
return data->at(currList)->size();
}
else
{
return 0;
}
}

QPointF MySeries::sample(size_t i) const
{
if(data)
{
return data->at(currList)->at(i);
}
else
{
return QPointF(i,0);
}
}

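QRectF MySeries::boundingRect() const
{
// Bounding rectangle of the full-resolution data, known without
// iterating over the points: x runs from 0 to maxXVal(), y from 0 to d_maxYVal.
return QRectF(0.0, 0.0, maxXVal(), d_maxYVal);
}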

double MySeries::maxXVal() const
{
if(data)
{
return data->at(0)->at(data->at(0)->size()-1).x();
}
else
{
return 0;
}
}



void MySeries::setRectOfInterest( const QRectF &rect )
{
//qDebug() << "Rect of interest:";
//qDebug() << "TopLeft: " << rect.topLeft();
//qDebug() << "TopRight: "<< rect.topRight();
//qDebug() << "BottomLeft: " << rect.bottomLeft();
//qDebug() << "BottomRight: "<< rect.bottomRight();

double spread = rect.bottomRight().x() - rect.bottomLeft().x();
spread = spread > 0 ? spread : spread * -1;

double limit = ((data->at(0)->size() * 0.50) / maxSubSamples);

int curr = maxSubSamples-1;
for(int i=0;i<maxSubSamples;++i)
{
double z = limit * (maxSubSamples-i); // was a hard-coded 10, the value of maxSubSamples
if(spread >= z)
{
qDebug() << " *** MEETS Z limit: " << z;
break;
}
else
{
qDebug() << " DOES NOT MEET Z limit: " << z;
--curr;
}
}

if(curr < 0)
{
qDebug() << "Making curr 0";
curr = 0;
}
currList = curr;

qDebug() << "Spread: " << spread;
qDebug() << "Curr List: " << currList;

d_boundingRect = rect;
}


A much cleaner implementation - thanks Uwe.

Now in the series data I can implement caching and everything needed to load the next points when the user scrolls near the limit of the currently loaded list. Initial loading and building the sub-sample lists should be done in a separate thread to keep the GUI responsive (with a progress bar or similar so the user knows something is happening in the background). Loading on demand should also be done in an internal thread for the same reason, but the user won't notice it if the threshold for starting the load is chosen well.

Scrolling can be done by setting the X scale to some range, say 0 to 1000, and letting the user scroll. As it is implemented now, when the data is loaded all values are shown on the X axis (from min to max).

The decision of when to show a more detailed graph based on the zoom factor can be tuned further. For now it is based on a crude formula I came up with.