QtConcurrent::blocking* thread count



ibex
25th January 2011, 16:55
As far as I can see from the source, if you call QtConcurrent::blockingMap it blocks the current thread but does not use it for the computation. If the current thread was allocated by QThreadPool, this means CPU utilisation will be reduced.

Is there a reason for this behaviour or any way to avoid it?

I would like to launch a number of concurrent tasks, which may themselves include sections of code parallelised with QtConcurrent. If there are insufficient threads available, these sections should run serially but ideally all cores would be fully used.
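
Roughly, the structure I have in mind looks like this (the names here are just placeholders):

void computeJob() {
    QVector<double> data = prepareData();            // per-job setup
    QtConcurrent::blockingMap(data, processElement); // parallelised section inside the job
}
// ...
QtConcurrent::run(computeJob); // the job itself also runs on the global thread pool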

Any ideas or recommendations?

thanks.

wysota
25th January 2011, 17:05
Is there a reason for this behaviour
The reason is that if the current thread were counted towards the thread pool's size, on unicofe systems the pool would be completely useless (it would contain 0 threads).

or any way to avoid it?
There is nothing to avoid. Since your "current" thread will be blocked, it will not use up resources.


I would like to launch a number of concurrent tasks, which may themselves include sections of code parallelised with QtConcurrent. If there are insufficient threads available, these sections should run serially but ideally all cores would be fully used.

Any ideas or recommendations?
I'd start by implementing a scheduler for the OS that fully and properly uses all the computing power available. So far no such scheduler is available in any mainstream OS kernel.

And skipping the buzzwords - don't worry: unless you cause a deadlock yourself, the thread pool will use a number of threads equal to the number of available processing units on your system.
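
If you want to convince yourself, a quick check (just a sketch) is:

#include <QThread>
#include <QThreadPool>
#include <QDebug>

int main()
{
    // the default (global) pool is sized to the number of logical processing units
    qDebug() << QThread::idealThreadCount();                     // e.g. 4 on a quad core
    qDebug() << QThreadPool::globalInstance()->maxThreadCount(); // the same value by default
    return 0;
}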

ibex
25th January 2011, 17:18
I was thinking it could use the current thread and any more available from the thread pool. Blocking the current thread will not use up resources but presumably it will be counted by QThreadPool::activeThreadCount? eg. if you had 4 logical cores, only 3 would be used by the map and the other would be idle.

ps. what is a "unicofe system"?

wysota
25th January 2011, 20:31
I was thinking it could use the current thread and any more available from the thread pool.
A thread consists of a stack and code segments. Don't confuse it with a core (or hardware thread of execution). You can't "reuse" a thread while it is blocked. If it is blocked then it is blocked - you can spawn a new thread that will do something else.


Blocking the current thread will not use up resources but presumably it will be counted by QThreadPool::activeThreadCount?
If it wasn't spawned by QtConcurrent then no, it won't be counted. The method counts threads in the thread pool and not all threads in the application. You can have 10 distinct thread pools and activeThreadCount will only count the threads in the pool it belongs to.


eg. if you had 4 logical cores, only 3 would be used by the map and the other would be idle.
If you have 4 cores then the default thread pool can simultaneously perform 4 jobs. It has nothing to do with the number of threads in your application.



ps. what is a "unicofe system"?
That's a typo, should be "unicore".

ibex
26th January 2011, 11:18
You can't "reuse" a thread while it is blocked
I appreciate that. Rather than blocking the calling thread you could use it to perform some of the computation.


If you have 4 cores then the default thread pool can simultaneously perform 4 jobs. It has nothing to do with the number of threads in your application.
The point being that in my example there is one (global) thread pool which had spawned the caller of QtConcurrent::map. I don't think there is any mechanism for map and the pool to determine that one of the pool-spawned threads is blocked, so not all CPU resources will be used. On a unicore machine the situation is worse: maxThreadCount would be 1, but that 1 thread would also be in use, so the map's thread request would be queued indefinitely.

I think a solution for me would be to increment the pool's maxThreadCount immediately before calling map and decrement it immediately after, since I know the origin of the calling thread. Hopefully this will be OK since "All functions in this class are thread-safe".
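
Something along these lines is what I mean (just a sketch; the names are placeholders):

void computeJob() {
    QVector<double> data = prepareData();
    QThreadPool *pool = QThreadPool::globalInstance();
    pool->setMaxThreadCount(pool->maxThreadCount() + 1); // allow one extra thread while this one blocks
    QtConcurrent::blockingMap(data, processElement);
    pool->setMaxThreadCount(pool->maxThreadCount() - 1); // restore the original limit
}
// note: the read-modify-write above is not atomic, even though each individual call is thread-safe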

wysota
26th January 2011, 12:13
Rather than blocking the calling thread you could use it to perform some of the computation.
That's what you should do - you schedule a bunch of jobs and continue with your normal operations. Then at some point when the results are ready you query for them and use them.
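
Roughly like this (again just a sketch, the names are placeholders):

void scheduleWork() {
    QList<QImage> imgs = makeList();
    QFuture<QImage> future = QtConcurrent::mapped(imgs, someFunc); // returns immediately
    // ... continue with your normal operations ...
    QList<QImage> results = future.results(); // waits here only if the work isn't finished yet
}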


The point being in my example there was one (global) thread pool which had spawned the caller of QtConcurrent::map. I don't think there is any mechanism for map and the pool to determine that one of the pool spawned threads is blocked and therefore all cpu resources won't be used.
I think you still fail to get the picture. The pool contains 4 threads. You schedule some job so a thread is spawned reducing the available number of threads to 3. Then you schedule another job and since three threads from the pool are available at the moment, 3 will be used. When the first thread finishes execution, QtConcurrent will use it for the other operation if some work still needs to be done. This all has nothing to do with all other possible threads in your application.


On a unicore machine the situation is worse: maxThreadCount would be 1, but that 1 thread would also be in use, so the map's thread request would be queued indefinitely.
This is only the case if you schedule a blocking job from within a job already performed on the thread pool. Like so:

void func() {
    QList<QImage> imgs = makeList();
    // blocks this pool thread until the mapping finishes; on a unicore
    // machine the pool has no free thread left to actually do the work
    QtConcurrent::blockingMapped(imgs, someFunc);
}
// ...
QtConcurrent::run(func); // func itself already occupies the pool's only thread
This is why you have QThreadPool::releaseThread():

void func() {
    QList<QImage> imgs = makeList();
    QThreadPool::globalInstance()->releaseThread(); // give this thread's slot back to the pool while we block
    QtConcurrent::blockingMapped(imgs, someFunc);
    QThreadPool::globalInstance()->reserveThread(); // reclaim the slot once the blocking call returns
}
// ...
QtConcurrent::run(func);

ibex
26th January 2011, 12:48
I think you still fail to get the picture. The pool contains 4 threads. You schedule some job so a thread is spawned reducing the available number of threads to 3. Then you schedule another job and since three threads from the pool are available at the moment, 3 will be used. When the first thread finishes execution, QtConcurrent will use it for the other operation if some work still needs to be done. This all has nothing to do with all other possible threads in your application.

Yes, we seem to be thinking along different lines. My current design is like this:
a) 1 master thread to handle the event loop and drawing
b) Possibly multiple computation jobs that can be launched concurrently in separate threads, but most commonly only 1.
c) The possibility of improving the performance of the jobs by parallelising critical sections (i.e. QtConcurrent::blockingMap instead of OpenMP)
All threads except for the main one would be managed by a single thread pool and these would be the only intensively used ones.


This is why you have QThreadPool::releaseThread()
Thanks, this is a better alternative than my last suggestion and I think it solves my problem :)
I should have read the documentation more carefully...
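
For the record, combining that with the design above, each computation job will end up looking roughly like this (the names are placeholders):

void computeJob() {
    QVector<double> data = prepareData();
    QThreadPool::globalInstance()->releaseThread();  // hand this job's slot back to the pool while it blocks
    QtConcurrent::blockingMap(data, processElement); // the parallelised critical section
    QThreadPool::globalInstance()->reserveThread();  // take the slot back afterwards
}
// launched from the main (event loop / drawing) thread:
QtConcurrent::run(computeJob);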