Why is QtConcurrent so slow?



ArlexBee-871RBO
27th December 2009, 20:58
Greetings.

I'm very new to QtConcurrent, and I have very little experience with threading in general. I went over some examples in the Qt docs today, and I can't figure out why my multi-threaded code runs slower than the single-threaded version.

This code is basically a modified version of what I found in the docs.



#include <QThread>
#include <QApplication>
#include <qtconcurrentmap.h>
#include <iostream>
#include <QtConcurrentRun>
using namespace std;

int Func(const int &n){

    //Some random calculations...
    int a = rand() % 3, b = rand() % 4, c = rand() % 5;
    int x = a * a, y = b * b, z = c * c;
    int u = a + x, v = b + y, w = c + z;
    int d = u - v * w;
    if( d > 0 ){
        d = a / x / u;
    }else{
        d = b / y / v;
    }
    for(int i = 1; i < 10; ++i){
        for(int j = 1; j < 30; ++j){
            d = rand() % 5 + 3;
            d /= 2;
            d *= rand() % 4;
        }
    }
    return d * n;
}

int main(int argc, char *argv[]){

    QApplication app(argc, argv);
    const int SIZE = 500000;
    QVector<int> data;

    for(int i = 0; i < SIZE; ++i){
        data.push_back(i);
    }

    //non-concurrent version
    //for(int i = 0; i < SIZE; ++i){ data[i] = Func( data[i] ); }

    //concurrent version
    QtConcurrent::mapped(data, Func);

    return 0;
}



I'm running it on a dual-core machine. The single-threaded version runs in about 0.7s, but the QtConcurrent version takes over 30.0s. Both cores are utilized, but not at 100%.

Also, using QtConcurrent::map() I get a segfault, and I'm not sure why.

I'm probably doing something wrong, but can anyone tell me why the QtConcurrent::mapped() call slows it down so much?

Tanuki-no Torigava
27th December 2009, 22:56
Look here (http://codejourneys.blogspot.com/2008/06/qt-simple-example-of-use-of.html). Pretty good example.
Also, CPU utilization depends on QThread::idealThreadCount(). Usually it is one thread per core, so as not to overload the OS with your task.

You also have various errors in your app regarding QtConcurrent use. I suggest you study the examples bundled with the Qt distribution; that should help.

numbat
28th December 2009, 06:46
if( d > 0 ){
    d = a / x / u;
}else{
    d = b / y / v;
}

These are divide by zero errors in your code. After commenting out these lines and testing, the mapped function takes about five times as long (as the direct implementation) on my system!
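To spell out why: a = rand() % 3 can be 0, which makes x = a * a and u = a + x zero as well, and the same holds for b, y and v. Integer division by zero is undefined behavior in C++ and commonly ends in SIGFPE on x86 Linux. Below is a minimal sketch of one way to guard the block. The names div_or and guarded_branch are hypothetical helpers made up for illustration, and returning 0 as the fallback is an arbitrary assumption, not something from the original code.

```cpp
// Hypothetical helper (not from the original post): integer division that
// returns `fallback` instead of dividing by zero.
int div_or(int num, int den, int fallback) {
    return den == 0 ? fallback : num / den;
}

// Same arithmetic as the quoted if/else block, with every division guarded.
// a, b and c stand in for the rand() % 3, rand() % 4 and rand() % 5 values.
int guarded_branch(int a, int b, int c) {
    int x = a * a, y = b * b, z = c * c;
    int u = a + x, v = b + y, w = c + z;
    int d = u - v * w;
    if (d > 0) {
        d = div_or(div_or(a, x, 0), u, 0);   // was: a / x / u
    } else {
        d = div_or(div_or(b, y, 0), v, 0);   // was: b / y / v
    }
    return d;
}
```

Since the loop that follows overwrites d anyway, simply deleting the if/else block (as done for the timing test) is just as valid for benchmarking purposes.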

wysota
28th December 2009, 12:10
Are you sure you have been testing with the exact code you pasted? It shouldn't work at all, as QtConcurrent::mapped() is a non-blocking function, so returning 0 immediately should cause your program to quit (and my tests seem to confirm that).

ArlexBee-871RBO
28th December 2009, 18:49
wysota, yes, I've been testing with the exact same code. It does run, but I started getting errors when I raised SIZE. I fixed the errors by replacing the last three lines with this:


QFuture<void> res = QtConcurrent::map(data, Func);
res.waitForFinished();

return 0;

I do monitor it during run-time with top, and both cores are used, but they never rise above 50%. They usually hover around 30-40%. The single-threaded version runs on one core, but at 100%.

numbat, I removed the if-else block just so there are no errors in the code, but it still runs several times slower than the single-threaded version. About ten times slower to be exact.

Tanuki-no, QThread::idealThreadCount() returns 2 for my system, which is correct.

ArlexBee-871RBO
28th December 2009, 19:01
I just tested a simple pthread program that creates two threads. I get similar results with it as I did with Qt. The single-threaded version runs about 10 times faster than the multi-threaded version. Does this mean there is something wrong with my system?

wysota
28th December 2009, 19:46
What does QThread::idealThreadCount() return?

ArlexBee-871RBO
28th December 2009, 19:54
What does QThread::idealThreadCount() return?

It returns 2, and I'm running Linux on a dual-core Intel.

There is some kind of system overhead when running more than one thread, and it makes the multi-threaded version of every example I've tried run drastically slower than the single-threaded version. I just can't figure out what it is.

wysota
28th December 2009, 19:59
Is your system doing anything heavy in the background? Does your system have free memory or is it swapping?

ArlexBee-871RBO
28th December 2009, 21:46
Is your system doing anything heavy in the background? Does your system have free memory or is it swapping?

No, my system isn't doing anything heavy, and I have more than enough RAM on it: 2GB total, and I'm not running anything else that would exhaust the CPU or memory.

Question: Does the multi-threaded version (using QtConcurrent::map or mapped) run faster than the single-threaded version on your system?

numbat
29th December 2009, 11:37
rand() is not thread-safe*. After changing to rand_r and re-running, the multi-threaded version takes slightly less real time but about 50% more user time, so there is a fair amount of overhead in using QtConcurrent for very short tasks.

*Or maybe rand is thread-safe (http://evanjones.ca/random-thread-safe.html) and the extra time is spent waiting for a lock on the state information kept by rand.
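Either way, the fix is the same idea: give each thread its own generator state so nothing is shared. The sketch below shows that idea in plain standard C++ rather than QtConcurrent, so it is an analogy to the rand_r change and not the code that was actually timed. It assumes std::thread and std::minstd_rand as stand-ins; fill_slice, parallel_fill and the seed values are hypothetical names and choices made up for illustration. It also hands each thread one large contiguous slice, which keeps per-item scheduling overhead low when the per-item work is tiny.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <thread>
#include <vector>

// Hypothetical helper: fills out[i] for i in [begin, end) using a generator
// owned by the calling thread only, so there is no shared state and no lock.
void fill_slice(std::vector<int> &out, std::size_t begin, std::size_t end,
                std::uint32_t seed) {
    std::minstd_rand rng(seed);              // private per-thread state
    for (std::size_t i = begin; i < end; ++i) {
        int d = static_cast<int>(rng() % 5) + 3;
        d /= 2;
        d *= static_cast<int>(rng() % 4);
        out[i] = d;                          // each index is written by one thread
    }
}

// Splits [0, n) into one contiguous slice per thread and joins all threads
// before returning, so the result is complete when the caller sees it.
std::vector<int> parallel_fill(std::size_t n, unsigned threads) {
    if (threads == 0) threads = 1;           // hardware_concurrency() may report 0
    std::vector<int> out(n);
    std::vector<std::thread> pool;
    const std::size_t chunk = (n + threads - 1) / threads;
    for (unsigned t = 0; t < threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        // Arbitrary, distinct seed per thread (an assumption for this sketch).
        pool.emplace_back(fill_slice, std::ref(out), begin, end, 1000u + t);
    }
    for (std::thread &th : pool) th.join();
    return out;
}
```

A call such as parallel_fill(500000, std::thread::hardware_concurrency()) would mirror the SIZE used above. Because every thread seeds its own generator, the output is reproducible for a given thread count, which rand() shared across threads cannot promise.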