QWidget::update() efficiency with hierarchy of custom widgets



redBeard
26th January 2011, 20:53
I have a real-time data collection/graphing application that is chewing up what seems like an inordinate amount of CPU time in the graphics infrastructure.

The application consists of an instrument panel (that extends QMainWindow). I use a QWidget as the central widget.

When the data collection starts I start a timer to update the display, as in:


panelUpdateTimer = new QTimer(this);
panelUpdateTimer->setInterval(30);
connect(panelUpdateTimer, SIGNAL(timeout()), centralWidget, SLOT(repaint()));

A 30 ms refresh rate gives me 33.3 frames/sec. The QTimer overhead adds about 5% CPU load. Not bad....

The central widget is populated with a collection of custom instruments, each of which extend QWidget. I use a QBoxLayout in the central widget to manage the instruments.

Each instrument also includes a collection of custom QWidgets (axis, title, legend, graphic, etc.). I also use QBoxLayout to manage the layout of those little widgets.

The centralWidget.paintEvent() method:


// Ask each child instrument to repaint itself.
PANEL_ELEMENT_LIST::const_iterator itr;
for (itr = panelElements.begin(); itr != panelElements.end(); ++itr) {
    (*itr)->update();
}

Pretty simple.

The Instrument.paintEvent() method:


if (graphicPanel != NULL) {
    graphicPanel->update();
}

Again, very simple. The graphicPanel.paintEvent() draws a QPainterPath (with about 20 lines in it) and a sequence of more lines (e.g., an oscilloscope trace). I cache the lines to paint in an array of QLine objects and use QPainter::drawLines() to draw them. Pretty efficient.
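For reference, a paintEvent() along those lines might look roughly like this (a sketch; axisPath, dataLines, and the pens are illustrative member names, not my actual code):

void GraphicPanel::paintEvent(QPaintEvent *)
{
    QPainter painter(this);
    painter.setPen(axisPen);        // pen for the cached axis path
    painter.drawPath(axisPath);     // QPainterPath with ~20 lines
    painter.setPen(tracePen);       // pen for the signal trace
    painter.drawLines(dataLines);   // cached QVector<QLine> of data lines
}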

Now, the problem....

Just starting the QTimer to call the centralWidget::update() method (which calls the centralWidget.paintEvent() method) adds about 5% CPU for a 30 ms refresh interval. Not bad.

If that centralWidget.paintEvent() method calls each of the Instrument::update() methods (which wind their way to the Instrument.paintEvent() method), that adds a negligible amount of CPU time (1-2% for 3 instruments).

However, just having the Instrument.paintEvent() method call the graphic::update() method (of its child), without the graphic.paintEvent() method even doing anything, causes the CPU utilization to spike to about 50%! Adding the actual drawing of the lines is negligible.

I understand that I'm posting a QWidget::update() whilst inside a QWidget::paintEvent(), but 50% CPU utilization seems excessive. If I don't call the graphic::update() method, the CPU is at about 15% utilization.

That doesn't seem scalable.

Ideas?

redBeard
27th January 2011, 22:22
Well, I found some issues:

1) I wasn't dealing with the background of the various widgets efficiently. According to the QWidget documentation:


To rapidly update custom widgets with simple background colors, such as real-time plotting or graphing widgets, it is better to define a suitable background color (using setBackgroundRole() with the QPalette::Window role), set the autoFillBackground property, and only implement the necessary drawing functionality in the widget's paintEvent().
I had been filling the background in my paintEvent() method in two of the widgets. This helped quite a bit. Utilization dropped by 50% (to 25%).
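In code, the documented approach boils down to something like this in the widget's constructor (a sketch):

// Let Qt fill the background from the palette's Window brush, so
// paintEvent() only has to draw the actual plot.
setBackgroundRole(QPalette::Window);
setAutoFillBackground(true);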

2) QT_GRAPHICSSYSTEM=xxxx. I tried the various 'native', 'raster', and 'opengl' settings. On Linux, 'raster' was the best performer; the CPU utilization dropped to about 15% in my test. On Windows, 'opengl' just caused a blank window to appear.... 'raster' was the fastest there as well.
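For anyone trying the same thing: besides the QT_GRAPHICSSYSTEM environment variable, Qt 4.5+ also lets you select the graphics system in code, as long as it happens before the QApplication object is constructed:

// Equivalent to running with QT_GRAPHICSSYSTEM=raster.
QApplication::setGraphicsSystem("raster");
QApplication app(argc, argv);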

On Windows, interestingly enough, with just my data simulator running (sending a data message every 15 ms down the data flow), the application was not registering any CPU utilization according to the Windows Task Manager. On Linux, 'top' showed it taking about 5%.

Overall, the Windows Task Manager was showing much lower CPU utilization for my app on Windows than 'top' was showing on Linux. Not sure whether the measurement tools are just vastly different.

continue trudging....

SixDegrees
27th January 2011, 22:43
Task Manager is worthless for real-time CPU monitoring.

Also, CPU usage is a tricky metric. Is there an actual performance problem in your application? If not, measuring CPU activity isn't going to tell you much; it's generally reported as some sort of average over a time interval, so if the interval is long enough the percentage will drop, while if the interval is made too short it will report 100% activity. top can be extremely misleading in this regard.

Unless your program is utilizing 100% of the CPU for extended periods of time, I wouldn't pay any attention to this metric, particularly if the UI is responsive and updates are taking place satisfactorily.

redBeard
27th January 2011, 23:09
Task Manager is worthless for real-time CPU monitoring.
Yeah, that's what I figgered....


Also, CPU usage is a tricky metric. Is there an actual performance problem in your application? If not, measuring CPU activity isn't going to tell you much; it's generally reported as some sort of average over a time interval, so if the interval is long enough the percentage will drop, while if the interval is made too short it will report 100% activity. top can be extremely misleading in this regard.

Unless your program is utilizing 100% of the CPU for extended periods of time, I wouldn't pay any attention to this metric, particularly if the UI is responsive and updates are taking place satisfactorily.
Is there a problem? Well, my end application will probably have 10-20 instrument displays, some using basic QPainter primitives and some using OpenGL primitives. I was sort of shocked to see 50% utilization with only 3 simple instruments and no other real processing (my prototype does no DSP work - filtering, FFTs, etc. - just displays raw signal data).

I was concerned that what I am doing will not scale well.

I have seen other applications like the one I'm writing, with 10-20 display instruments and a fairly extensive set of data transform elements (50 or so), take 1-2% CPU. Impressive...

wysota
29th January 2011, 12:38
I will say what I usually say in such cases. Do you really need to update 30 times per second? Plotting real-time data doesn't mean that the plotting has to be real-time as well. Your eye won't manage to register any details of the plot at such a frequency. Try dropping the framerate to 20Hz or even 16Hz and you should be fine. You can also have an adaptive timer that increases the frequency when more CPU power is available and reduces it when the load rises. This should make your app scale well for different numbers of data sources.
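A minimal sketch of one way to build such an adaptive timer; measuring the repaint time against the interval is just one possible load heuristic, and the names and thresholds here are made up for illustration:

void Panel::refreshFrame()
{
    QElapsedTimer timer;
    timer.start();
    centralWidget->repaint();    // synchronous repaint of the panel
    const qint64 paintMs = timer.elapsed();

    int interval = panelUpdateTimer->interval();
    if (paintMs > interval / 2 && interval < 63)
        panelUpdateTimer->setInterval(interval + 5);   // back off under load
    else if (paintMs < interval / 8 && interval > 30)
        panelUpdateTimer->setInterval(interval - 5);   // speed up when idle
}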

Also consider setting some flags for your widget; you should be able to optimize things a bit more.

redBeard
29th January 2011, 19:27
I will say what I usually say in such cases. Do you really need to update 30 times per second? Plotting real-time data doesn't mean that the plotting has to be real-time as well. Your eye won't manage to register any details of the plot at such a frequency. Try dropping the framerate to 20Hz or even 16Hz and you should be fine.
You are correct here. We have tuned the refresh rate accordingly. I do have one instrument display, however, that looks rather choppy at 16-20 frames/sec. It looks best at 25-30 frames/sec.

You can also have an adaptive timer that increases the frequency when more CPU power is available and reduces it when the load rises. This should make your app scale well for different numbers of data sources.
I like that idea a lot.

Also consider setting some flags for your widget; you should be able to optimize things a bit more.
QWidget flags? Which ones?

I also found another CPU hog in my display code. I have a nice-looking linear gradient for the background of the main instrument panel, built with QLinearGradient, the parameters of which are set in the central widget's resize handler. That's what our screen designer wanted..... ;)

I found that removing the background reduced the CPU utilization dramatically - to about 10% on Linux with the overall instrument panel maximized to full screen. That appears to be approaching the QTimer overhead of the two timers I have. I'll see if I can use a static image instead and position it in the background according to the size of the overall instrument panel.

wysota
29th January 2011, 19:54
You are correct here. We have tuned the refresh rate accordingly. I do have one instrument display, however, that looks rather choppy at 16-20 frames/sec. It looks best at 25-30 frames/sec.
Are you reconstructing the path from scratch at every update?


QWidget flags? Which ones?
For example, Qt::WA_OpaquePaintEvent and Qt::WA_StaticContents.
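In code that would be something like:

// WA_OpaquePaintEvent: the widget paints every pixel itself, so Qt can
// skip erasing the background before each paint event.
setAttribute(Qt::WA_OpaquePaintEvent);
// WA_StaticContents: the contents don't depend on the widget's size, so
// resizing only repaints the newly exposed region.
setAttribute(Qt::WA_StaticContents);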


I also found another CPU hog in my display code. I have a nice-looking linear gradient for the background of the main instrument panel, built with QLinearGradient, the parameters of which are set in the central widget's resize handler. That's what our screen designer wanted..... ;)

I found that removing the background reduced the CPU utilization dramatically - to about 10% on Linux with the overall instrument panel maximized to full screen. That appears to be approaching the QTimer overhead of the two timers I have. I'll see if I can use a static image instead and position it in the background according to the size of the overall instrument panel.
Cache the background.

redBeard
29th January 2011, 22:20
Are you reconstructing the path from scratch at every update?
No. For the oscilloscope-type displays I have a QPainterPath of the range and domain axis lines, which is only recalculated when the display size changes. The actual data lines are an array of QLine objects, filled as data arrives from an upstream flow element. The paintEvent() method just draws the one QPainterPath of axis lines and calls drawLines() on the lines array (and, of course, sets the correct pen color).

The jumpy display in question is an OpenGL-based graphic displaying FFT data (e.g., a frequency analyzer). The vertex data never changes from refresh to refresh, just the colors. So I build the vertex buffer when the screen size changes. I cache the collection of color vectors and only recalculate the colors for new data when it arrives, memcpy()-ing the old data to its new locations.

Each refresh just consists of a single OpenGL call (glDrawElements()).
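Roughly, assuming the colour data lives in its own VBO (a sketch; the buffer names and primitive type are illustrative, not my actual code):

// Upload only the changed colour data; vertices and indices are untouched.
glBindBuffer(GL_ARRAY_BUFFER, colorVbo);
glBufferSubData(GL_ARRAY_BUFFER, 0, colorByteCount, colorData);

// The single per-refresh draw call mentioned above.
glDrawElements(GL_TRIANGLE_STRIP, indexCount, GL_UNSIGNED_INT, 0);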

For example Qt::WA_OpaquePaintEvent and Qt::WA_StaticContents
I'll look at those.

Cache the background.
Here's the code for the background linear gradient:


CentralWidget::CentralWidget(QWidget *parent) : QWidget(parent)
{
    // backgroundGradient is a class member so the resize handler can reuse it
    backgroundGradient.setStart(/* ... */);
    backgroundGradient.setFinalStop(/* ... */);
    backgroundGradient.setColorAt(/* ... colour stops ... */);

    setBackgroundRole(QPalette::Window);
    QPalette p(palette());
    p.setBrush(QPalette::Window, QBrush(backgroundGradient));
    setPalette(p);
}

// resizeEvent(), not resize(), is the virtual handler Qt calls on resize
void CentralWidget::resizeEvent(QResizeEvent *event)
{
    backgroundGradient.setFinalStop(width() / 2, 0);
    QPalette p(palette());
    p.setBrush(QPalette::Window, QBrush(backgroundGradient));
    setPalette(p);
    QWidget::resizeEvent(event);
}

The QLinearGradient is only recalculated when the screen size changes (which is not very often; typically only when the display is first rendered, since users rarely change the panel size).

SixDegrees
29th January 2011, 22:33
Yeah, that's what I figgered....


Is there a problem? Well, my end application will probably have 10-20 instrument displays, some using basic QPainter primitives and some using OpenGL primitives. I was sort of shocked to see 50% utilization with only 3 simple instruments and no other real processing (my prototype does no DSP work - filtering, FFTs, etc. - just displays raw signal data).

I was concerned that what I am doing will not scale well.

I have seen other applications like the one I'm writing, with 10-20 display instruments and a fairly extensive set of data transform elements (50 or so), take 1-2% CPU. Impressive...

Don't you think there's something wrong if your computer is only using the CPU 2% of the time?

In real life, the CPU is working at 100% capacity very nearly 100% of the time. How that usage is allocated to a particular program, over a particular period of time, varies so widely as to be meaningless.

I'm usually much more concerned that a program is running correctly when it is initially developed; if needed, optimization can be applied later at the points where it is required.

wysota
29th January 2011, 22:36
No. For the oscilloscope-type displays I have a QPainterPath of the range and domain axis lines, which is only recalculated when the display size changes. The actual data lines are an array of QLine objects, filled as data arrives from an upstream flow element. The paintEvent() method just draws the one QPainterPath of axis lines and calls drawLines() on the lines array (and, of course, sets the correct pen color).
Can you show us some code? I can immediately suggest an optimisation: if you append data to the path, then you can cache parts of the path as pixmaps and only append the new chunks to the pixmap(s). Then drawing the plot boils down to rendering one or more pixmaps.
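A sketch of that idea (plotCache and tracePen are assumed member names): when new data arrives, paint only the new chunk onto a persistent pixmap, so the paint event reduces to a blit:

void GraphicPanel::appendData(const QVector<QLine> &newLines)
{
    QPainter p(&plotCache);    // plotCache is a member QPixmap
    p.setPen(tracePen);
    p.drawLines(newLines);     // append just the new chunk
    update();                  // paintEvent() now only blits plotCache
}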


The QLinearGradient is calculated when the screen size changes (which is not very often, typically when the display is first rendered. Users only rarely change the panel size).
This doesn't do any graphics calculations or even colour calculations for the gradient. QLinearGradient is strictly a data carrier, nothing more. Your gradient is recalculated and redrawn upon every paint event. It's much better to render it once to a pixmap and then just blit the pixmap to the widget (with the opaque paint event attribute set). At the end of all optimisations your paint event should be reduced to drawing at most three layers of pixmaps: the background, the plot and the foreground (grid). If you can combine the background and the foreground and draw the plot over the foreground, that's even better. Notice that with a GL paint engine this will be done by blitting three textures directly from the card's memory, so it should be very fast.
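A sketch of that background caching (backgroundCache is an assumed member QPixmap, and the colour stops are illustrative): render the gradient once per resize, then blit it in every paint event:

void CentralWidget::resizeEvent(QResizeEvent *event)
{
    backgroundCache = QPixmap(size());
    QPainter p(&backgroundCache);
    QLinearGradient g(0, 0, width() / 2, 0);
    g.setColorAt(0.0, Qt::darkBlue);   // example colour stops
    g.setColorAt(1.0, Qt::black);
    p.fillRect(rect(), g);             // gradient is rendered once, here
    QWidget::resizeEvent(event);
}

void CentralWidget::paintEvent(QPaintEvent *)
{
    QPainter p(this);
    p.drawPixmap(0, 0, backgroundCache);  // cheap blit every frame
}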

redBeard
30th January 2011, 14:32
I'm usually much more concerned that a program is running correctly when it is initially developed; if needed, optimization can be applied later at the points where it is required.
Obviously correctness is paramount. I only optimize after verifying the correctness of the software.

However, I have a large number of elements to develop; creating a good pattern now, whilst in the prototyping stage, will allow me to develop these elements without having to go back (much) and optimize a large code base.