bjam hangs on select (in develop branch)

Hi,

I am trying to test Boost.MPI with Intel's implementation, and I am stuck while trying to run simple tests through bjam. bjam hangs on the select (not pselect?) call of the Unix exec_wait. As far as processes are concerned:

  PID   USER   PR NI S %CPU TIME+   PPID  COMMAND
  .......................
  16882 alainm 20 0  S 0.0  0:01.61 6507  bjam
  16899 alainm 20 0  T 0.0  0:00.00 16882 sh
  16900 alainm 20 0  Z 0.0  0:00.00 16899 mpiexec.hydra <defunct>
  .......

bjam calls a generated shell script (below), which calls mpiexec.hydra, and that works perfectly fine outside bjam. The mpiexec.hydra dies, but the shell refuses to let it go.

The shell script, generated by bjam, is:

===============================================
[alainm@gurney engine]$ more /proc/16899/cmdline
/bin/sh

LD_LIBRARY_PATH="/gpfs/scratch/alainm/view/boost/bin.v2/libs/mpi/build/intel-linux/debug:/gpfs/scratch/alainm/view/boost/bin.v2/libs/serialization/build/intel-linux/debug:/softs/intel/composer_xe_2015.0.090/bin/lib:/softs/intel/composer_xe_2015.0.090/lib/intel64:$LD_LIBRARY_PATH"
export LD_LIBRARY_PATH

status=0
if test $status -ne 0 ; then
    echo Skipping test execution due to testing.execute=off
    exit 0
fi
mpiexec.hydra -n 2 "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2" blob > "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output" 2>&1
status=$?
echo >> "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output"
echo EXIT STATUS: $status >> "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output"
if test $status -eq 0 ; then
    cp "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output" "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run"
fi
verbose=0
if test $status -ne 0 ; then
    verbose=1
fi
if test $verbose -eq 1 ; then
    echo ====== BEGIN OUTPUT ======
    cat "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output"
    echo ====== END OUTPUT ======
fi
exit $status
[alainm@gurney engine]$
=================================================

Note that the select only tests for the subprocess output; at the hanging point mpiexec.hydra is done with its output.

Any idea?

Alain

PS: there was a cmake-based project some time ago; is it still active, or is bjam here to stay?
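For readers who have not looked at the build engine: the pattern described here is bjam forking /bin/sh with its stdout and stderr tied to a pipe, and exec_wait() then blocking in select() on the read end until the command produces output or closes the pipe. A minimal sketch of that pattern, with illustrative names only (this is not the actual execunix.c code):

    /* Sketch: fork a shell, capture its output through a pipe, and block in
     * select() until the pipe is readable or closed, i.e. the exec_wait
     * pattern described above.  Illustrative only. */
    #include <stdio.h>
    #include <sys/select.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int out[2];
        if (pipe(out) < 0) { perror("pipe"); return 1; }

        pid_t pid = fork();
        if (pid == 0) {                      /* child: the generated "sh" */
            dup2(out[1], STDOUT_FILENO);
            dup2(out[1], STDERR_FILENO);
            close(out[0]); close(out[1]);
            execl("/bin/sh", "sh", "-c", "echo hello", (char *)NULL);
            _exit(127);
        }
        close(out[1]);

        for (;;) {                           /* parent: exec_wait-style loop */
            fd_set fds;
            FD_ZERO(&fds);
            FD_SET(out[0], &fds);
            if (select(out[0] + 1, &fds, NULL, NULL, NULL) < 0) break;
            char buf[4096];
            ssize_t n = read(out[0], buf, sizeof buf);
            if (n <= 0) break;               /* EOF: every writer closed the pipe */
            fwrite(buf, 1, (size_t)n, stdout);
        }
        int status;
        waitpid(pid, &status, 0);            /* reaped only after the pipe closes */
        return 0;
    }

This also shows why a stopped sh keeps bjam in select() forever: select() only returns when the pipe has data or all writers are gone, and a stopped child neither writes nor exits, so the read end never reaches EOF and the engine never gets to waitpid().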

Hi Alain,
I’ve seen this problem before but it appears to affect very few people so I’ve not needed to fix it. Perhaps the time has come to address it.
Was bjam passed a -j option? If so, what was it?
— Noel
On Oct 20, 2014, at 9:33 AM, Alain Miniussi
Hi,
I am trying to test Boost.MPI with Intel's implementation and I am stuck while trying to run simple tests through bjam. bjam hangs on the select (not pselect?) call of the Unix exec_wait. [...]

Hi Noel,

No, no -j option.

I tried the -p (since bjam is hanging in a select on output streams) with no effect. I don't know if that's relevant, but it seems that most calls to setpgid (including those on the sh process) set errno to 13 (permission problem). The select is waiting (without -p) on the stdout of the 'sh' process (with the redirected stderr). If I replace mpiexec.hydra (a binary) with mpirun (a wrapper around that binary), only mpiexec.hydra will be defunct.

  PID  USER   PR NI S %CPU TIME+   PPID COMMAND
  769  alainm 20 0  S 0.0  0:02.79 768  bjam
  1028 alainm 20 0  T 0.0  0:00.00 769  sh
  1029 alainm 20 0  T 0.0  0:00.00 1028 mpirun
  1034 alainm 20 0  Z 0.0  0:00.00 1029 mpiexec.hydra <defunct>

Alain

On 20/10/2014 19:10, Belcourt, Kenneth wrote:
Hi Alain,
I’ve seen this problem before but it appears to affect very few people so I’ve not needed to fix it. Perhaps the time has come to address it.
Was bjam passed a -j option? If so, what was it?
— Noel
[...]
--
Alain

On Oct 20, 2014, at 4:02 PM, Alain Miniussi
Hi Noel,
No, no -j option.
Interesting. I’ve usually seen bjam miss the subprocess termination signal when -j is around 64 or more. I’ve got an Intel MPI setup I can try to reproduce that zombie child with. This is code I added quite a few years ago, so I’ll have to dust off my bjam hat and track this down. Sorry about the hassle; it might take me a few days before I can debug this.
— Noel

On Oct 20, 2014, at 4:11 PM, Belcourt, Kenneth
On Oct 20, 2014, at 4:02 PM, Alain Miniussi
wrote: No, no -j option.
Interesting. I’ve usually seen bjam miss the subprocess termination signal when -j is around 64 or more. I’ve got an Intel MPI setup I can try to reproduce that zombie child with.
Just pushed a fix to develop: commit 252b5aa019
Can you check if this fixes your issue?
— Noel

Sorry, the problem is still here:

  6817 alainm 20 0 S 0.0 0:01.74 4517 b2
  6870 alainm 20 0 T 0.0 0:00.00 6817 sh
  6871 alainm 20 0 T 0.0 0:00.00 6870 mpirun
  6876 alainm 20 0 Z 0.0 0:00.00 6871 mpiexec.hydra <defunct>

Bottom of b2 strace:

  lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
  rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
  select(5, [4], NULL, NULL, NULL

On 21/10/2014 08:29, Belcourt, Kenneth wrote:
Just pushed a fix to develop:
commit 252b5aa019
Can you check if this fixes your issue?
— Noel
--
Alain

On Oct 21, 2014, at 6:35 AM, Alain Miniussi
Sorry, the problem is still here: [...]
bottom of b2 strace:
lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
select(5, [4], NULL, NULL, NULL
Okay, that’s helpful. Let me try a couple of other things. Thanks Alain.
— Noel
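A side note on the select-versus-pselect question raised in the first message: the strace above shows SIGCHLD being blocked (rt_sigprocmask(SIG_BLOCK, [CHLD], ...)) right before the select() call. With plain select() there is a classic race: if the child changes state between unblocking the signal and entering select(), the wakeup can be lost. pselect() exists to close that window by swapping the signal mask atomically for the duration of the call. A hedged sketch of that pattern, not the code bjam actually uses:

    /* Sketch: wait for output on fd, but let SIGCHLD interrupt the wait
     * without a race, using pselect().  Illustrative only. */
    #include <errno.h>
    #include <signal.h>
    #include <string.h>
    #include <sys/select.h>

    static volatile sig_atomic_t got_sigchld = 0;
    static void on_sigchld(int sig) { (void)sig; got_sigchld = 1; }

    int wait_for_output(int fd)
    {
        sigset_t block, orig;
        sigemptyset(&block);
        sigaddset(&block, SIGCHLD);
        sigprocmask(SIG_BLOCK, &block, &orig);  /* SIGCHLD blocked outside the wait */

        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_sigchld;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGCHLD, &sa, NULL);

        for (;;) {
            fd_set fds;
            FD_ZERO(&fds);
            FD_SET(fd, &fds);
            /* pselect() installs 'orig' (SIGCHLD unblocked) atomically while it
             * sleeps, so a child changing state at any moment interrupts it. */
            int rc = pselect(fd + 1, &fds, NULL, NULL, NULL, &orig);
            if (rc > 0) return 0;                                   /* output ready */
            if (rc < 0 && errno == EINTR && got_sigchld) return 1;  /* go reap */
            if (rc < 0 && errno != EINTR) return -1;                /* real error */
        }
    }

That said, the hang described in this thread turns out to involve a stopped child rather than a missed exit notification, so this is only tangential to the eventual fix.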

Hi,

I don't know if it can help, but I attached a minimal (ok, let's say small) example that reproduces the problem without the MPI test or Boost code. It's basically a minimized ~100-LOC version of bjam. I sent it to Intel so they can investigate, since mpiexec.hydra might be part of the problem, although I think bjam should be able to deal with it, since the MPI test passes on the command line.

Alain

On 21/10/2014 15:56, Belcourt, Kenneth wrote:
[...]
Okay, that’s helpful. Let me try a couple of other things. Thanks Alain.
— Noel
--
Alain

Sorry, I probably screwed something up with the last test; please ignore the minimized code.

On 24/10/2014 12:08, Alain Miniussi wrote:
Hi,
I don't know if it can help, but I attached a minimal (ok, let's say small) example that reproduces the problem without the MPI test or Boost code. [...]
--
Alain

I did a gnu/openmpi 1.8.2 build on Ubuntu which exhibits the same problem.

Can the fact that the setpgid system calls fail be an issue? I notice they are among the few system calls whose return code is not tested (under gdb, I noticed they fail with errno 13, a permission issue).

Alain

On 21/10/2014 15:56, Belcourt, Kenneth wrote:
[...]
Okay, that’s helpful. Let me try a couple of other things. Thanks Alain.
— Noel
--
Alain

On 24/10/2014 15:33, Alain Miniussi wrote:
I did a gnu/openmpi 1.8.2 build on Ubuntu which exhibits the same problem.
It did not; I just forgot to edit a field in project-config.jam. Only the Intel mpiexec/mpirun hangs.
Can the fact that the setpgid system calls fail be an issue? I notice they are among the few system calls whose return code is not tested (under gdb, I noticed they fail with errno 13, a permission issue).
Alain
--
Alain

Hi Alain,
On Oct 24, 2014, at 7:56 AM, Alain Miniussi
On 24/10/2014 15:33, Alain Miniussi wrote:
I did a gnu/openmpi 1.8.2 build on Ubuntu which exhibits the same problem.
It did not; I just forgot to edit a field in project-config.jam. Only the Intel mpiexec/mpirun hangs.
Can the fact that the setpgid system calls fail be an issue?
Perhaps. We make the forked child process its own process group leader so that if it’s an MPI job and it dies, all the MPI ranks are cleaned up as well. We’ve been using this syntax for a number of years on multiple platforms without issues, so I’m a little surprised it fails on Ubuntu with OpenMPI 1.8.2. That said, it’s possible that there’s a race condition that you’re able to tickle.

For example, we fork the child process and, right before we exec the child process, we set the child process group. We also set the child process group in the parent process. Perhaps we should only do this once, not twice (i.e. only in the child or only in the parent, not both). Or perhaps there’s a race if the child and parent calls to setpgid run concurrently.

I’m still looking at this.

— Noel
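For reference, the pattern Noel describes (calling setpgid() for the child in both the parent and the child) is the standard way to close the fork/exec race, and it also explains the errno 13 Alain sees: once the child has exec'd, the parent's setpgid() on it fails with EACCES (13), which is expected and normally just ignored. A hedged sketch of that pattern, with illustrative names, not the actual execunix.c code:

    /* Sketch: put the spawned command into its own process group so the whole
     * MPI job can be signalled (and cleaned up) as a group.  Both sides call
     * setpgid() so it works whichever of parent/child runs first. */
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    pid_t spawn_in_own_group(const char *cmd)
    {
        pid_t pid = fork();
        if (pid == 0) {
            /* child: become leader of a new process group */
            if (setpgid(0, 0) < 0)
                perror("setpgid (child)");
            execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
            _exit(127);
        }
        if (pid > 0) {
            /* parent: same thing; if the child already exec'd, this loses the
             * race and fails with EACCES (errno 13), which is harmless because
             * the child already set its own group. */
            if (setpgid(pid, pid) < 0 && errno != EACCES)
                perror("setpgid (parent)");
        }
        return pid;
    }

So the failing setpgid() calls are, by themselves, very likely benign.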

On Oct 24, 2014, at 4:33 PM, Belcourt, Kenneth
[...]
Just pushed this commit, 7bcbc5ac31ab1, to develop, which adds checks to the setpgid calls and, if they fail, indicates whether it was the parent or the child process that made the call. Can you give this a try and let me know which call is failing?
— Noel

On Oct 24, 2014, at 4:43 PM, Belcourt, Kenneth
[...]
Just pushed this commit, 7bcbc5ac31ab1, to develop, which adds checks to the setpgid calls and, if they fail, indicates whether it was the parent or the child process that made the call. Can you give this a try and let me know which call is failing?
Well I be danged. I was just testing this change on my Mac and found this in the output:

  setpgid (parent): Permission denied

So it seems we’ve been ignoring this problem for some time and didn’t know it. That would be my bad. Let me work on a fix (will probably remove the duplicate call in the parent process).
— Noel

On Oct 24, 2014, at 4:52 PM, Belcourt, Kenneth
[...]
So it seems we’ve been ignoring this problem for some time and didn’t know it. Let me work on a fix (will probably remove the duplicate call in the parent process).
I left both setpgid checks in, but removed the call to exit(), so we’ll see the failed call to setpgid without killing b2.

commit 156bc5c42ec3 in develop.

— Noel

On 25/10/2014 02:14, Belcourt, Kenneth wrote:
[...]
I left both setpgid checks in, but removed the call to exit(), so we’ll see the failed call to setpgid without killing b2.
commit 156bc5c42ec3 in develop.
Thanks.

So the mpiexec.hydra is still defunct, *but* I have something new. Let's say I am in the following situation:

  PID                                    PPID
  20104 alainm 20 0 S 0.0 0:13.33 17184 bjam
  20170 alainm 20 0 T 0.0 0:00.00 20104 sh
  20171 alainm 20 0 Z 0.0 0:00.00 20170 mpiexec.hydra <defunct>

  [alainm@gurney ~]$ pstree 20104
  bjam───sh───mpiexec.hydra
  [alainm@gurney ~]$

So mpiexec is dead, and the calling shell should take notice, but somehow doesn't. It just waits, but with no conviction:

  $ gdb /bin/sh 20170
  ................
  (gdb) bt
  #0  0x0000003bd92ac8ce in __libc_waitpid (pid=-1, stat_loc=0x7fff344888bc, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:32
  #1  0x000000000043ec82 in waitchld (wpid=<value optimized out>, block=1) at jobs.c:3064
  #2  0x000000000043ff1f in wait_for (pid=20171) at jobs.c:2422
  #3  0x00000000004309f9 in execute_command_internal (command=0x18beda0,

The interesting thing is that, if I just enter a <continue> command under gdb, then bjam magically proceeds up to the next mpiexec. Which gave me the idea to just run

  $ kill -CONT <shell id>

to see the next target proceed.

So my current theory is that mpiexec.hydra pauses its calling process by sending it a STOP signal (why would it do that? I have no clue) and then exits without sending a CONTINUE signal.

Maybe signaling the child from exec_cmd just before the select would be a solution, but it looks like a pretty ugly one...

Alain
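The stopped-parent state described above is easy to reproduce in isolation. A small hedged demonstration (illustrative only, unrelated to the Intel tools): a child stops its parent and then exits; the parent stays blocked in waitpid(), the child shows up as <defunct>, and everything resumes the moment someone sends the parent SIGCONT, exactly like the kill -CONT workaround:

    /* Sketch: reproduce <parent stopped in waitpid()> + <child defunct>. */
    #include <signal.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {
            kill(getppid(), SIGSTOP);   /* stop the parent that waits on us... */
            _exit(0);                   /* ...then die; parent cannot reap us yet */
        }
        int status;
        /* While stopped, this process sits here and the child stays a zombie
         * (state Z).  An external `kill -CONT <this pid>` lets waitpid() finish. */
        waitpid(pid, &status, 0);
        printf("reaped child, status %d\n", WEXITSTATUS(status));
        return 0;
    }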
--
Alain

On 27/10/2014 16:32, Alain Miniussi wrote:
[...]
So my current theory is that mpiexec.hydra pauses its calling process by sending it a STOP signal (why would it do that? I have no clue) and then exits without sending a CONTINUE signal.
Maybe signaling the child from exec_cmd just before the select would be a solution, but it looks like a pretty ugly one...
Ok, I might have a fix (nothing to be proud of, though) that basically consists of inserting:

  for ( i = 0; i < globs.jobs; ++i )
  {
      if ( cmdtab[ i ].pid != 0 )
      {
          kill( cmdtab[ i ].pid, SIGCONT );
      }
  }

at the beginning of exec_wait. Maybe killpg would be better (didn't check), since I'm not sure that a simple kill will deal with mpirun (which adds a shell layer between bjam and mpiexec.hydra). I'll try to propose a pull request tonight.

Thanks!

Alain
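On the kill-versus-killpg question: since each command was made its own process-group leader with setpgid() earlier, signalling the group (killpg, or kill with a negative pid) also reaches the extra shell layer that mpirun inserts, not just the direct child. A hedged sketch of that variant, reusing the cmdtab/globs names from the snippet above as stand-ins (the real definitions live in the engine):

    #include <signal.h>
    #include <unistd.h>

    #define MAXJOBS 64
    /* Stand-ins for the engine's own bookkeeping, named after the snippet above. */
    static struct { pid_t pid; } cmdtab[MAXJOBS];
    static struct { int jobs; } globs = { MAXJOBS };

    /* Wake every still-registered child before waiting on it; killpg() sends
     * SIGCONT to the whole process group, so an intermediate mpirun shell and
     * anything it spawned are resumed as well. */
    static void wake_stopped_children(void)
    {
        int i;
        for (i = 0; i < globs.jobs; ++i)
            if (cmdtab[i].pid != 0)
                killpg(cmdtab[i].pid, SIGCONT);
    }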
--
Alain

On 27/10/2014 17:13, Alain Miniussi wrote:
[...]
Ok, I might have a fix (nothing to be proud of, though) that basically consists of inserting a SIGCONT loop at the beginning of exec_wait. Maybe killpg would be better (didn't check), since I'm not sure that a simple kill will deal with mpirun (which adds a shell layer between bjam and mpiexec.hydra). I'll try to propose a pull request tonight.
More complicated than it looked: the pause signal can be sent at any time, so the wake-up call probably needs to be interleaved with the select.
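One way to do that interleaving, sketched under the same assumptions as above (illustrative only, not the code of the actual pull request): replace the indefinitely blocking select() with a short timeout and re-send SIGCONT to the child's process group on every lap, so a STOP delivered at any point is undone shortly afterwards:

    #include <signal.h>
    #include <sys/select.h>
    #include <unistd.h>

    /* Wait until fd is readable (or closed), nudging the child's process
     * group with SIGCONT once per second in case it was stopped meanwhile. */
    static int wait_readable(int fd, pid_t child_pgid)
    {
        for (;;) {
            killpg(child_pgid, SIGCONT);      /* undo any STOP received so far */

            fd_set fds;
            FD_ZERO(&fds);
            FD_SET(fd, &fds);
            struct timeval tv = { 1, 0 };     /* re-check once a second */

            int rc = select(fd + 1, &fds, NULL, NULL, &tv);
            if (rc > 0) return 0;             /* output ready or pipe closed */
            if (rc < 0) return -1;            /* real error, errno is set */
            /* rc == 0: timeout, loop and nudge the child again */
        }
    }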
--
Alain

Pull request 47 (https://github.com/boostorg/build/pull/47) seems to fix the (well, my) problem.

On 25/10/2014 02:14, Belcourt, Kenneth wrote:
[...]
I left both setpgid checks in, but removed the call to exit(), so we’ll see the failed call to setpgid without killing b2.
commit 156bc5c42ec3 in develop.
— Noel
--
Alain
participants (2)
- Alain Miniussi
- Belcourt, Kenneth