
On Oct 20, 2014, at 4:02 PM, Alain Miniussi wrote:
Hi Noel,
No, no -j option.
Interesting. I’ve usually seen bjam miss the subprocess termination signal when -j is around 64 or more. I’ve got an Intel MPI setup I can use to try to reproduce that zombie child. This is code I added quite a few years ago, so I’ll have to dust off my bjam hat and track this down. Sorry about the hassle; it might take me a few days before I can debug this.
-- Noel
I tried the -p option (since bjam is hanging in a select on the output streams), with no effect. I don't know if it's relevant, but it seems that most calls to setpgid (including those on the sh process) set errno to 13 (EACCES, a permission problem). Without -p, the select is waiting on the stdout of the 'sh' process (with the redirected stderr). If I replace mpiexec.hydra (a binary) with mpirun (a wrapper around that binary), only mpiexec.hydra ends up defunct:
PID   USER    PR  NI  S  %CPU  TIME+    PPID  COMMAND
769   alainm  20  0   S  0.0   0:02.79  768   bjam
1028  alainm  20  0   T  0.0   0:00.00  769   sh
1029  alainm  20  0   T  0.0   0:00.00  1028  mpirun
1034  alainm  20  0   Z  0.0   0:00.00  1029  mpiexec.hydra <defunct>
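
For reference, errno 13 is EACCES on Linux, and POSIX specifies that setpgid() fails with EACCES when the target is a child of the caller that has already performed an exec. Whether that is what happens inside bjam is an assumption, not something established in this thread. A minimal Python sketch (the helper name `setpgid_after_exec` is hypothetical) that provokes the same error code on a child that has already exec'ed:

```python
import errno
import os
import subprocess
import sys


def setpgid_after_exec():
    """Return the errno raised by setpgid() on a child that has already exec'ed."""
    # In CPython, Popen does not return until the child's exec has
    # succeeded or failed, so the child is past exec at this point.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        os.setpgid(child.pid, child.pid)
        return 0  # no error: the platform allowed the change
    except OSError as exc:
        return exc.errno
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    code = setpgid_after_exec()
    print(code, errno.errorcode.get(code, "no error"))
```

If this is the cause, the failing calls would be the ones made from the parent side after the child has started its new program; calls made by the child on itself before exec would not be affected.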
Alain
On 20/10/2014 19:10, Belcourt, Kenneth wrote:
Hi Alain,
I’ve seen this problem before but it appears to affect very few people so I’ve not needed to fix it. Perhaps the time has come to address it.
Was bjam passed a -j option? If so, what was its value?
-- Noel
On Oct 20, 2014, at 9:33 AM, Alain Miniussi wrote:
Hi,
I am trying to test Boost.MPI with Intel's implementation, and I am stuck while trying to run simple tests through bjam. bjam hangs on the select (not pselect?) call in the Unix exec_wait. As far as processes are concerned:
PID    USER    PR  NI  S  %CPU  TIME+    PPID   COMMAND
.......................
16882  alainm  20  0   S  0.0   0:01.61  6507   bjam
16899  alainm  20  0   T  0.0   0:00.00  16882  sh
16900  alainm  20  0   Z  0.0   0:00.00  16899  mpiexec.hydra <defunct>
.......
bjam calls a generated shell script (below), which calls mpiexec.hydra, which works perfectly fine outside bjam. mpiexec.hydra dies, but the shell refuses to let it go.
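
As background for the `Z` / `<defunct>` entries in the listing: a child that has exited stays a zombie until its parent collects its exit status with one of the wait calls. A minimal Linux-only Python sketch of that life cycle (it reads `/proc`, and the helper names `proc_state` and `zombie_until_reaped` are hypothetical):

```python
import os


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name is parenthesised and may contain spaces,
    # so take the first field after the last ')'.
    return stat.rsplit(")", 1)[1].split()[0]


def zombie_until_reaped():
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    # Block until the child has exited, but do not reap it:
    # WNOWAIT leaves the child in its zombie state.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state_before = proc_state(pid)
    os.waitpid(pid, 0)  # reap: the zombie entry disappears
    gone = not os.path.exists(f"/proc/{pid}")
    return state_before, gone


if __name__ == "__main__":
    print(zombie_until_reaped())
```

In the listing above the parent of the zombie is the `sh` process, which is itself in state `T` (stopped), so it is in no position to make that wait call.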
The shell script, generated by bjam, is:
===============================================
[alainm@gurney engine]$ more /proc/16899/cmdline
/bin/sh LD_LIBRARY_PATH="/gpfs/scratch/alainm/view/boost/bin.v2/libs/mpi/build/intel-linux/debug:/gpfs/scratch/alainm/view/boost/bin.v2/libs/serialization/build/intel-linux/debug:/softs/intel/composer_xe_2015.0.090/bin/lib:/softs/intel/composer_xe_2015.0.090/lib/intel64:$LD_LIBRARY_PATH"
export LD_LIBRARY_PATH

status=0
if test $status -ne 0 ; then
    echo Skipping test execution due to testing.execute=off
    exit 0
fi
mpiexec.hydra -n 2 "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2" blob > "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output" 2>&1
status=$?
echo >> "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output"
echo EXIT STATUS: $status >> "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output"
if test $status -eq 0 ; then
    cp "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output" "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run"
fi
verbose=0
if test $status -ne 0 ; then
    verbose=1
fi
if test $verbose -eq 1 ; then
    echo ====== BEGIN OUTPUT ======
    cat "../../../bin.v2/libs/mpi/test/broadcast_stl_test-2.test/intel-linux/debug/broadcast_stl_test-2-run.output"
    echo ====== END OUTPUT ======
fi
exit $status
[alainm@gurney engine]$
=================================================
Note that select only tests for the subprocess output; at the point where it hangs, mpiexec.hydra is done with its output.
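
One way to see why a select on the child's output can block indefinitely here: a stopped process keeps its file descriptors open, so the read end of the pipe never reports end-of-file until the writer actually goes away. The Python sketch below illustrates only that mechanism, under the assumption that the reader holds the sole other end of the pipe. It does not explain why `sh` ends up stopped, and the helper name `select_with_stopped_writer` is hypothetical.

```python
import os
import select
import signal
import subprocess
import sys


def select_with_stopped_writer(timeout=0.5):
    # The child inherits the write end of the pipe as its stdout
    # and never writes anything to it.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        stdout=subprocess.PIPE,
    )
    fd = child.stdout.fileno()

    # Stop the child, as in the 'T' state shown by the process listing.
    os.kill(child.pid, signal.SIGSTOP)
    ready_while_stopped, _, _ = select.select([fd], [], [], timeout)

    # SIGKILL is delivered even to a stopped process; once the child is
    # gone, the write end is closed and the pipe reaches end-of-file.
    os.kill(child.pid, signal.SIGKILL)
    child.wait()
    ready_after_exit, _, _ = select.select([fd], [], [], timeout)
    data = os.read(fd, 1) if ready_after_exit else None

    child.stdout.close()
    return bool(ready_while_stopped), bool(ready_after_exit), data


if __name__ == "__main__":
    # Expected on Linux: (False, True, b'')
    print(select_with_stopped_writer())
```

Without a timeout, the first select call would wait for as long as the writer stays stopped, which matches the symptom of bjam sitting in select while `sh` is in state `T`.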
Any ideas?
Alain
PS: There was a CMake-based project some time ago. Is it still active, or is bjam here to stay?
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users