
On Oct 21, 2007, at 4:36 AM, Aydın Buluç wrote:
I was a happy user of Boost.MPI working on top of OpenMPI. However, OpenMPI's MPI_THREAD_MULTIPLE support is quite clumsy and hangs frequently (the OpenMPI developers describe it as "lightly tested"). So, I installed MPICH2, which claims to support thread-safe MPI.
I've heard that MPICH2's support for MPI_THREAD_MULTIPLE is pretty good. Okay, so just a fair warning... at present, Boost.MPI will *not* work correctly in a multi-threaded context if you're sending serialized data over the wire. It's possible that two concurrent serialized receives on the same communicator and with the same tag could get confused, most likely causing a crash. The solution isn't trivial, but we know how to solve the problem as soon as we have the chance.
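To make the warning a bit more concrete, here is a rough sketch of one possible stopgap: funnelling every serialized receive for a given communicator/tag pair through a single lock. It is not the non-trivial fix mentioned above, and it does not call Boost.MPI or MPI at all. It assumes that a serialized value travels as two messages (a size header followed by a payload), which is the kind of two-step receive that concurrent threads could interleave. All names here (`fake_channel`, `fake_send`, `guarded_recv`, `run_two_receivers`) are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical in-memory stand-in for one (communicator, tag) message stream.
struct fake_channel {
    std::deque<std::string> messages;
    std::mutex queue_lock;  // models a thread-safe MPI: each single pop is safe

    std::string pop() {
        std::lock_guard<std::mutex> guard(queue_lock);
        assert(!messages.empty());
        std::string front = messages.front();
        messages.pop_front();
        return front;
    }
};

// Assumption: a serialized send is modelled as a size header, then the payload.
void fake_send(fake_channel& ch, const std::string& payload) {
    std::lock_guard<std::mutex> guard(ch.queue_lock);
    ch.messages.push_back(std::to_string(payload.size()));
    ch.messages.push_back(payload);
}

// One lock for the whole header+payload step, shared by all receiving threads.
std::mutex serialized_recv_lock;

std::string guarded_recv(fake_channel& ch) {
    std::lock_guard<std::mutex> guard(serialized_recv_lock);
    const std::size_t expected = std::stoul(ch.pop());  // size header
    std::string payload = ch.pop();                     // matching payload
    return payload.size() == expected ? payload : std::string("<mismatch>");
}

// Two threads drain the same channel; returns true if every header matched
// the payload that followed it.
bool run_two_receivers(int per_thread) {
    fake_channel ch;
    for (int i = 0; i < 2 * per_thread; ++i)
        fake_send(ch, std::string(static_cast<std::size_t>(i) + 1, 'x'));

    std::vector<std::string> a, b;
    auto receive_into = [&ch, per_thread](std::vector<std::string>& out) {
        for (int i = 0; i < per_thread; ++i) out.push_back(guarded_recv(ch));
    };
    std::thread t1(receive_into, std::ref(a));
    std::thread t2(receive_into, std::ref(b));
    t1.join();
    t2.join();

    for (const std::string& s : a) if (s == "<mismatch>") return false;
    for (const std::string& s : b) if (s == "<mismatch>") return false;
    return a.size() + b.size() == static_cast<std::size_t>(2 * per_thread);
}
```

Without `serialized_recv_lock`, one thread could take a header and the other thread the next header, leaving both with the wrong follow-up message. Holding a lock across a blocking receive has its own costs (a thread waiting for a message blocks every other receiver on that lock), so treat this as an illustration of the hazard rather than a recommendation.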
/tmp/ccbyPLBK.o: In function `void boost::mpi::detail::all_gather_impl<int>(boost::mpi::communicator const&, int const*, int, int*, mpl_::bool_<true>)':
all_gather_test.cpp: (.gnu.linkonce.t._ZN5boost3mpi6detail15all_gather_implIiEEvRKNS0_12communicatorEPKT_iPS6_N4mpl_5bool_ILb1EEE[void boost::mpi::detail::all_gather_impl<int>(boost::mpi::communicator const&, int const*, int, int*, mpl_::bool_<true>)]+0x1c): undefined reference to `boost::mpi::communicator::operator int() const'
/tmp/ccbyPLBK.o: In function `void boost::mpi::detail::broadcast_impl<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > (boost::mpi::communicator const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, int, mpl_::bool_<false>)':
all_gather_test.cpp: (.gnu.linkonce.t._ZN5boost3mpi6detail14broadcast_implISsEEvRKNS0_12communicatorEPT_iiN4mpl_5bool_ILb0EEE[void boost::mpi::detail::broadcast_impl<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > (boost::mpi::communicator const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, int, mpl_::bool_<false>)]+0x27): undefined reference to `boost::mpi::communicator::operator int() const'
I "quadruple-checked" that mpicxx, the libraries, and the include directories all belong to the new MPICH2 installation and are not remnants of the old OpenMPI installation. Thus, I don't think this is a binary incompatibility issue.
I hate to say it, but we might need a quintuple check :) The implicit conversion from "communicator" to "int" shows that we are certainly getting the MPICH2 headers, because MPI_Comm is just a typedef of int in MPICH2 (in OpenMPI, MPI_Comm is a pointer). However, I'm guessing that the archive that we're linking against was compiled against OpenMPI. When you rebuilt Boost.MPI, did you remove "bin.v2/libs/mpi"? If not, I suggest that you do so and then rebuild Boost.MPI.

- Doug
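As a rough illustration of why the headers matter for that undefined reference: a class with a conversion operator to MPI_Comm gets a different member function depending on what MPI_Comm is. The sketch below uses hypothetical stand-ins (`mpich2_like`, `openmpi_like`, `communicator_model`, `conversion_target`, `check_handles`), not the real MPI headers or the real boost::mpi::communicator. It only assumes what is stated above: an int handle in one case, a pointer handle in the other.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-ins for the two handle styles; NOT the real headers.
namespace mpich2_like { typedef int MPI_Comm; }
namespace openmpi_like {
    struct opaque_comm;             // deliberately left incomplete
    typedef opaque_comm* MPI_Comm;  // handle is a pointer
}

// Simplified model of a wrapper that converts implicitly to its raw handle.
template <typename Handle>
class communicator_model {
public:
    explicit communicator_model(Handle h) : handle_(h) {}
    operator Handle() const { return handle_; }  // "operator int() const" when Handle is int
private:
    Handle handle_;
};

// Reports which kind of conversion operator the wrapper ends up with.
template <typename Handle>
const char* conversion_target() {
    if (std::is_same<Handle, int>::value) return "int";
    if (std::is_pointer<Handle>::value) return "pointer";
    return "other";
}

void check_handles() {
    communicator_model<mpich2_like::MPI_Comm> a(7);
    int raw = a;  // compiled against int-handle headers: needs operator int() const
    assert(raw == 7);

    communicator_model<openmpi_like::MPI_Comm> b(nullptr);
    openmpi_like::MPI_Comm ptr = b;  // pointer-handle headers: a different operator
    assert(ptr == nullptr);

    assert(std::strcmp(conversion_target<mpich2_like::MPI_Comm>(), "int") == 0);
}
```

In the real, non-template class the same thing happens through the typedef: code compiled against int-handle headers asks the linker for `operator int() const`, while a library built against pointer-handle headers only defines the pointer version, so the reference stays undefined.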