PDA

View Full Version : QMutex seems not to lock()



sylvaticus
2nd July 2009, 12:42
Hello.. I am in the progress of parallelise a multi-agent model where "agents" are different instances (thousands) of a class and they have to independently perform a relatively cpu-intensive task (few seconds).
Looks the perfect situation for a reentrant code :) .... it seems... :crying:
My problem is that, even if I mutex.lock() all the should-be-concurrent code (for debugging reasons of course) my application crash, while if I use a single thread it works.
This is the code that I implemented so far:


void
Manager_farmers::landAllocation(){
// .. serial code ..

for (int i=0; i < nThreads; i++){
Manager_farmers_threads* wThread = new Manager_farmers_threads;
wThreads.push_back(wThread);
}

// parallelise the code to z workingThreads..
for (uint y=0;y<myAgents.size();y++){
bool assigned = false;
while(!assigned) {
for (uint z=0;z<wThreads.size();z++){
if (!wThreads[z]->isRunning()){
wThreads[z]->assignJob(myAgents[y], freePlot);
wThreads[z]->start();
assigned = true;
break;
}
}
}
}

// be sure that all threads are ended..
for (int z=0; z < nThreads; z++){
wThreads[z]->wait();
}

// ..continue with serial code...

for (int i=0; i < nThreads; i++){
delete wThreads[i]; // deleting the threads
}
}

void
Manager_farmers_threads::assignJob(Agent_space* agent_h, const Pixel* plot_h){
// .. this code shuld be run serialised by the main thread..
agent = agent_h;
plot = plot_h;
agent->cachedOffer = 0;
}

void
Manager_farmers_threads::run(){
// This code should be run parallelised...
QMutex localMutex;
localMutex.lock();
agent->cachedOffer = agent->offerRentalPrice(plot);
localMutex.unlock();
}

The question is.. why for hell it is crashing even if all the code is locked ? :confused::confused:

wysota
2nd July 2009, 13:09
You are creating a local mutex so each entry to the method creates a private instance of the mutex so each thread works on its own instance of the mutex so there is never a situation when two threads try to lock the same mutex. You have to work on a single mutex for all threads.

sylvaticus
2nd July 2009, 13:38
Hi.. thanks for the answer.
In fact I originally tried to use the QMutex as a member of the thread class, like it is done in the book "C++ GUI programming with Qt4" ("Syncronizing threads").

The thread class looks now like this:


class Manager_farmers_threads : public QThread {
Q_OBJECT

public:
Manager_farmers_threads();
void assignJob(Agent_space* agent_h, const Pixel*plot_h);

protected:
void run();

private:
volatile Agent_space* agent;
const Pixel* plot;
QMutex mutex;
};


However it still doesn't work (and it should't, as each thread would still have it's own mutex istance).

If I understood correctly, you suggest to create the Mutex somewhere in the main thread and then giving pointers to the threads? Or create a function in the main thread to lock/unlock the mutex and call it from the threads?

wysota
2nd July 2009, 13:56
This changes nothing. The mutex has to be static or global - different instances of the thread have to access the same instance of the mutex. The fact that the variable name of two mutexes is the same doesn't mean they are the same mutex.

sylvaticus
2nd July 2009, 16:09
I created the Mutex in the main thread and gived threads a shared pointer, but still crashing (when I set nThreads=1 it's all going fine)

Now the code looks like that:


void
Manager_farmers::landAllocation(){
// .. serial code ..

QMutex* mutex = new QMutex;
for (int i=0; i < nThreads; i++){
Manager_farmers_threads* wThread = new Manager_farmers_threads;
wThreads.push_back(wThread);
}

// parallelise the code to z workingThreads..
for (uint y=0;y<myAgents.size();y++){
bool assigned = false;
while(!assigned) {
for (uint z=0;z<wThreads.size();z++){
if (!wThreads[z]->isRunning()){
wThreads[z]->assignJob(myAgents[y], freePlot, mutex);
wThreads[z]->start();
assigned = true;
break;
}
}
}
}

// be sure that all threads are ended..
for (int z=0; z < nThreads; z++){
wThreads[z]->wait();
}

// ..continue with serial code...

for (int i=0; i < nThreads; i++){
delete wThreads[i]; // deleting the threads
}
delete mutex;
}

void
Manager_farmers_threads::assignJob(Agent_space* agent_h, const Pixel* plot_h, QMutex* mutex_h){
// .. this code shuld be run serialised by the main thread..
agent = agent_h;
plot = plot_h;
mutex = mutex_h;
agent->cachedOffer = 0;
}

void
Manager_farmers_threads::run(){
// This code should be run parallelised...
mutex.lock();
agent->cachedOffer = agent->offerRentalPrice(plot);
mutex.unlock();
}


class Manager_farmers_threads : public QThread {
Q_OBJECT

public:
Manager_farmers_threads();
void assignJob(Agent_space* agent_h, const Pixel*plot_h, QMutex* mutex_h);

protected:
void run();

private:
volatile Agent_space* agent;
const Pixel* plot;
QMutex* mutex;
};

wysota
2nd July 2009, 16:19
What is the Pixel class?

sylvaticus
2nd July 2009, 16:39
Thank you for your answers.
"Pixel" it's a non-gui class that describes the space in my model.
Basically the parallelised code needs just read-only data on information stored on "pixels" and the idea is to eventually protect with mutex this kind of parts only, but the program crash even if I lock/unlock ALL the parallelised code (this is the funy side, becasue according on how I understood mutex looking there should not be any parallelised code when you lock it all ) .
Rendering of the maps on screen is done much later on in the serial part of the model.

Any how, this is the documentation for the Pixel class:
http://www.regmas.org/doc/referenceManual/html/classPixel.html

wysota
2nd July 2009, 18:07
Where exactly does the application crash? What does the debugger say?

sylvaticus
3rd July 2009, 09:38
Sorry for the late response (I moved to an area without internet :-( ).

The application crash when I use a call to a mathematical programming library named "glpk".

"Program received signal SIGABRT (Aborted)"

// solving the problem with the integer variables
stat=lpx_integer(lpglpk);

The re-entrant capability of this library is not assured, but the point is that it is wholly called from agent->cachedOffer = agent->offerRentalPrice(plot); that is protected by a mutex.
By the way all the steps of solving the mathematical problem (initialise the problem, solving, getting the solution and deleting the problem) are done within the "offerRentalPrice()" function.
So my question is: why even if I place a mutex this function is executed concurrently?

wysota
3rd July 2009, 12:14
Please introduce debug statements around locking and unlocking the mutex so that you can check the sequence of the calls. You can use QThread::currentThreadId() to provide information which thread is performing the lock/unlock operation.

sylvaticus
3rd July 2009, 14:26
I put debugging messages around the two lock/unlocks:


void
Manager_farmers_threads::run(){
int threadId = currentThreadId();
cout <<"A "<<threadId<<endl;
mutex->lock();
cout <<"B "<<threadId<<endl;

( (Agent_farmer*) agent)->cachedOffer = ( (Agent_farmer*) agent)->offerRentalPrice(plot);

cout <<"C "<<threadId<<endl;
mutex->unlock();
cout <<"D "<<threadId<<endl;
}


I'm attaching the output log...

This is an other log with only ABCD:


************************************************** *****************
*** !! Welcome to RegMAS - Regional Multi Agent Simulator !! ***
*** For info & doc: http://www.regmas.org/doc ***
*** Compiled on: Jul 3 2009 - 14:10:18 ***
************************************************** *****************

Selected scenario: default
**WARNING: No elements in the XML file. Expected at least one of type reclassRule
**WARNING: No elements in the XML file. Expected at least one of type reclassRule
ABCDABCDABCDABCDABCDABCDABACDBCDABCDABACDBCDABCDAB ACDBCDABCDABACDBCDAB


The external library crash only if I use nThreads > 1.
Furthermore, the second log were generated using the original, non-safe library, while the first log - crashing much later during the execution - were generated using a dev version of that library that is going in the direction of thread-safe.

So my question is: if I have a thread-unsafe library and I try to access it from different threads, even if I access it in a serialised way (using mutexes), it could be any how dangerous.
Is this sentence correct?

wysota
3rd July 2009, 14:35
It seems that locking works fine. If the library uses some internal static data, the data might become corrupted if it is overwritten with another cycle of a function that operates on the data before the data is actually used somewhere. But it is impossible to determine that without knowing the structure of the library. You might try debugging the library itself.

sylvaticus
2nd December 2009, 11:12
I managed to create a test case where I don't use any mutex, just a SINGLE separate thread that call the external library (glpk) solving an example problem.
I use this thread in a loop that assigns new jobs to it when it is not busy.
Glpk crashes when comparing the elapsed time using CPU clocks.

The glpk maintainer told me that he doesn't think it is a glpk bug, so I am last with a problem either in my code (very likely) or in the QT thread library.
You can find attached a small test case, but the core of the issue is in the above code:



#include <glpk.h>
#include <Thread.h>

using namespace std;

int main(int argc, char *argv[]){

cout << "Test on glpk using Qt Threads" << endl<<endl;

Thread wThread; // a simple subclass of QThread
for (uint y=0;y<500;y++){
bool assigned = false;
while(!assigned) {
if (!wThread.isRunning()){
wThread.start(); // job done from the other thread. Just solve a simple linear programming problem (glpk documentation example)
assigned = true;
break;
}
}
}

cout <<"All done without errors (crashing)"<<endl;

return 0;
}


Am I doing something very naif???

sylvaticus
2nd December 2009, 12:14
I managed to retrieve the error also doing a more simple loop:



int main(int argc, char *argv[]){

cout << "Test on a glpk application using Qt Threads" << endl<<endl;
Thread wThread;
for (uint y=0;y<5000;y++){
wThread.start(); // job done from the other thread
wThread.wait();
}
cout <<"All done without errors (crashing)"<<endl;
return 0;
}


The error I have from glpk is:


Assertion failed: xlcmp(env->t_last, t) <= 0
Error detected in file glplib10.c (http://grecof2.econ.univpm.it/~lobianco/public/glplib10.c) at line 109


I really do not understand why this external library crash if I put it on an external thread that I run consecutivly while it works whitout problems if I run it in the main thread.

sylvaticus
3rd December 2009, 12:57
It seems a Qt bug (but of course, I'm not sure..)
http://bugreports.qt.nokia.com/browse/QTBUG-684

wysota
3rd December 2009, 13:17
I don't think the bug report has anything to do with your code. The fact that you do use wait() and you do use threads doesn't mean the situation is the same as the one you described. It all depends what your thread does. Anyway you can make a workaround that the main thread waits on a dedicated wait condition. Another thing is whether it makes any sense to do something in another thread if you just block the main thread until it is finished. A producer-consumer pattern looks totally different than your code, above all it doesn't have any while loops.

sylvaticus
3rd December 2009, 13:57
hello, in fact this code is also not working:


Thread wThread; // a simple subclass of QThread
for (uint y=0;y<500;y++){
bool assigned = false;
while(!assigned) {
if (!wThread.isRunning()){
wThread.start(); // job done from the other thread. Just solve a simple linear programming problem (glpk documentation example)
assigned = true;
break;
}
}
}

The idea is to have a vector of threads doing the jobs simultaneously and for doing this I will need a patched version of the glpk library, but I stopped when I noticed that even when I use a single thread I have problems.
Then I tried to put the example on its minimum terms.

The Thread::run() function just does something like:


for (uint i=0;i<3;i++){
glp_prob *lp;
lp = glp_create_prob();
[...]
glp_load_matrix(lp, 9, ia, ja, ar);
glp_simplex(lp, NULL);
[...]
glp_delete_prob(lp);
}


This code works if I call it directly in the main thread (that really in the real application is itself a working thread separated from the GUI), but crash when called from wThread. I thought that even if the bug report refers to the "wait()" function, the cause underlying it would be the same causing the problem in my situation.

wysota
3rd December 2009, 14:20
From what I see you are trying to start 500 threads using the same thread controller object. This is simply bound to fail.

Try this:

class MyRunnable : public QRunnable {
void run(){
for (uint i=0;i<3;i++){
glp_prob *lp;
lp = glp_create_prob();
[...]
glp_load_matrix(lp, 9, ia, ja, ar);
glp_simplex(lp, NULL);
[...]
glp_delete_prob(lp);
}
}
};
//...
for(int i = 0; i<500; i++){
QThreadPool::globalInstance()->start(new MyRunnable);
}

You can call this before you enter the loop to make sure only one thread at once is ran:

QThreadPool::globalInstance()->setMaxThreadCount(1);

sylvaticus
4th December 2009, 11:39
From what I see you are trying to start 500 threads using the same thread controller object. This is simply bound to fail.

Try this:[...]



Thank you.. this method is working. However, I tried debugging my first code (those with the isRunning() check) with QThread::currentThread() and I noticed it was producing only two threads (one main and one working) as expected, not 500, so I still don't know why it wasn't working...