
isend() also returns a request object that you need to call wait() on.
Matthias
Sent from my iPad
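A minimal sketch of Matthias's point (assuming the payload is a plain int rather than the original TaskPackage class, which would additionally need Boost.Serialization support): every isend() hands back an mpi::request, and that request must eventually be waited on; silently dropping it does not guarantee the send completes.

```cpp
#include <boost/mpi.hpp>
namespace mpi = boost::mpi;

int main(int argc, char* argv[])
{
    mpi::environment env(argc, argv);
    mpi::communicator world;

    if (world.rank() == 0) {
        int task = 42;
        int result = 0;
        // Keep BOTH requests; the send request is as real as the recv one.
        mpi::request sendReq = world.isend(1, /*tag=*/0, task);
        mpi::request recvReq = world.irecv(1, /*tag=*/1, result);
        sendReq.wait();   // don't drop the send request on the floor
        recvReq.wait();
    } else if (world.rank() == 1) {
        int task = 0;
        world.recv(0, /*tag=*/0, task);
        world.send(0, /*tag=*/1, task + 1);
    }
    return 0;
}
```

Run with something like `mpirun -np 2 ./a.out` (link with `-lboost_mpi -lboost_serialization`).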
On Jun 29, 2010, at 12:55 AM, Jack Bryan
wrote:
Thanks for your reply.
I have checked the tags, master and worker tags match.
The deadlock happens in the case of 2 tasks scheduled on one processor.
If there is only one task on one processor, there is no deadlock. It works well.
The master is responsible for scheduling tasks to the workers, which run the assigned tasks and send the results back to the master.
If I assign one task to each worker, it works well.
But when I increase the number of tasks on a worker node to 2, it deadlocks.
To simplify the analysis of the potential deadlock, the master schedules only 2 tasks to a single worker.
The worker can receive the 2 tasks and run them, but the master cannot get the results back from the worker.
The main idea:

master (node 0):

    counter = 0; totalTaskNum = 2;
    while (counter < totalTaskNum) {
        TaskPackage myTaskPackage(world);
        world.isend(node1, downStreamTaskTag, myTaskPackage);
        recvReqs[counter] = world.irecv(node1, upStreamTaskTag,
                                        taskResultPackage[counter]);
        counter++;
    }
    mpi::wait_all(recvReqs, recvReqs + totalTaskNum);
worker (node 1):

    while (1) {
        TaskPackage workerTaskPackage(world);
        world.recv(node0, downStreamTaskTag, workerTaskPackage);
        // do the local work
        world.isend(node0, upStreamTaskTag, workerTaskPackage);
        if (no new task) break;
    }
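Since a minimal program was requested downthread, here is a compilable reduction of the sketch above. It is assumption-laden: the payload is an int instead of TaskPackage, the tags are hypothetical shared constants, and (unlike the sketch) the send buffers and send requests are both kept alive until wait_all(), since an isend() buffer must not be destroyed before its request completes.

```cpp
#include <boost/mpi.hpp>
#include <vector>
namespace mpi = boost::mpi;

// Hypothetical shared constants (names chosen to match the sketch above).
const int downStreamTaskTag = 1;
const int upStreamTaskTag   = 2;
const int totalTaskNum      = 2;

int main(int argc, char* argv[])
{
    mpi::environment env(argc, argv);
    mpi::communicator world;

    if (world.rank() == 0) {                      // master (node 0)
        std::vector<int> tasks(totalTaskNum), results(totalTaskNum);
        std::vector<mpi::request> reqs;
        for (int counter = 0; counter < totalTaskNum; ++counter) {
            tasks[counter] = counter;             // buffer outlives the isend
            reqs.push_back(world.isend(1, downStreamTaskTag, tasks[counter]));
            reqs.push_back(world.irecv(1, upStreamTaskTag, results[counter]));
        }
        // Wait on the send requests as well as the receive requests.
        mpi::wait_all(reqs.begin(), reqs.end());
    } else if (world.rank() == 1) {               // worker (node 1)
        for (int i = 0; i < totalTaskNum; ++i) {
            int task = 0;
            world.recv(0, downStreamTaskTag, task);
            world.send(0, upStreamTaskTag, task * 10);  // stand-in for local work
        }
    }
    return 0;
}
```

Compile with `-lboost_mpi -lboost_serialization` and run with `mpirun -np 2 ./a.out`.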
My code has many classes; I am trying to figure out how to extract the main part from it.
Any help is appreciated.
thanks
Jack
Date: Mon, 28 Jun 2010 21:28:47 +0200
From: riccardo.murri@gmail.com
To: boost-users@lists.boost.org
Subject: Re: [Boost-users] boostMPI asychronous communication
Hello Jack,
On Mon, Jun 28, 2010 at 7:46 PM, Jack Bryan wrote:
This is the main part of my code, which may have a deadlock.
Master:

    for (iRank = 0; iRank < availableRank; iRank++) {
        destRank = iRank + 1;
        for (taski = 1; taski <= TaskNumPerRank; taski++) {
            resultSourceRank = destRank;
            recvReqs[taskCounterT2] = world.irecv(resultSourceRank, upStreamTaskTag,
                                                  resultTaskPackageT2[iRank][taskCounterT3]);
            reqs = world.isend(destRank, taskTag, myTaskPackage);
            ++taskCounterT2;
        }

    // taskTotalNum = availableRank * TaskNumPerRank
    // right now, availableRank = 1, TaskNumPerRank = 2
    mpi::wait_all(recvReqs, recvReqs + taskTotalNum);

-----------------------------------------------

Worker:

    while (1) {
        world.recv(managerRank, downStreamTaskTag, resultTaskPackageW);
        // do its local work on the received task
        destRank = masterRank;
        reqs = world.isend(destRank, taskTag, myTaskPackage);
        if (recv end signal) break;
    }
1. I can't see where the outer for-loop in master is closed; is the wait_all() part of that loop? (I assume it is not.) Can you send a minimal program that I can feed to a compiler and test? That would help.
2. Are you sure there is no tag mismatch between master and worker?
    master: world.isend(destRank, taskTag, myTaskPackage);
                                  ^^^^^^^
    worker: world.recv(managerRank, downStreamTaskTag, resultTaskPackageW);
                                    ^^^^^^^^^^^^^^^^^
unless master::taskTag == worker::downStreamTaskTag, the recv() will wait forever.
Similarly, the following requires that master::upStreamTaskTag == worker::taskTag:
    master: ... = world.irecv(resultSourceRank, upStreamTaskTag, ...);
    worker: world.isend(destRank, taskTag, myTaskPackage);  // destRank == masterRank
3. Do the source/destination ranks match? The master waits for messages from destinations 1..availableRank (inclusive range), and the worker waits for a message from "masterRank" (is this 0?)
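One low-tech way to rule out both kinds of mismatch is to define the ranks and tags once, in a header included by both the master and the worker code, so the two sides cannot silently disagree (the names below are hypothetical, chosen to match the snippets in this thread):

```cpp
// tags.hpp -- shared by master and worker so the constants cannot diverge
const int masterRank        = 0;
const int downStreamTaskTag = 1;  // master -> worker: new task
const int upStreamTaskTag   = 2;  // worker -> master: result
```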
4. Does the master work if you replace the main loop with the following?
Master:

    for (iRank = 0; iRank < availableRank; iRank++) {
        destRank = iRank + 1;
        for (taski = 1; taski <= TaskNumPerRank; taski++) {
            // XXX: the following code does not contain any reference to
            // "taski": it is sending "TaskNumPerRank" copies of the
            // same message ...
            reqs = world.isend(destRank, taskTag, myTaskPackage);
        };
    };
    // I assume the outer loop does *not* include the wait_all()

    // expect a message from each task
    int n = 0;
    while (n < taskTotalNum) {
        mpi::status status = world.probe();
        world.recv(status.source(), status.tag(),
                   resultTaskPackageT2[status.source()][taskCounterT3]);
        ++n;
    };
Best regards,
Riccardo

_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users