Boost.MPI compilation problems with MPICH2

Hi all,

I was a happy user of Boost.MPI running on top of OpenMPI. However, OpenMPI's MPI_THREAD_MULTIPLE support is quite clumsy and hangs frequently (the OpenMPI developers describe it as "lightly tested"). So I installed MPICH2, which claims to support thread-safe MPI.

At the same time, I didn't want to remove my working OpenMPI installation or the old Boost libraries built on top of it. Therefore, I rebuilt Boost (--with-mpi) in a different location, using the new mpicxx from MPICH2 (so that it finds the correct MPICH2 directories, etc.). Everything went OK during compilation and installation.

However, I get errors when trying to build any MPI program. For example, with the test suite provided by Boost.MPI itself, I get errors like the following (this is for all_gather_test.cpp specifically, but I get similar ones with every other test):

-------------------------------------
/tmp/ccbyPLBK.o: In function `void boost::mpi::detail::all_gather_impl<int>(boost::mpi::communicator const&, int const*, int, int*, mpl_::bool_<true>)':all_gather_test.cpp:(.gnu.linkonce.t._ZN5boost3mpi6detail15all_gather_implIiEEvRKNS0_12communicatorEPKT_iPS6_N4mpl_5bool_ILb1EEE[void boost::mpi::detail::all_gather_impl<int>(boost::mpi::communicator const&, int const*, int, int*, mpl_::bool_<true>)]+0x1c): undefined reference to `boost::mpi::communicator::operator int() const'
/tmp/ccbyPLBK.o: In function `void boost::mpi::detail::broadcast_impl<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >(boost::mpi::communicator const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, int, mpl_::bool_<false>)':all_gather_test.cpp:(.gnu.linkonce.t._ZN5boost3mpi6detail14broadcast_implISsEEvRKNS0_12communicatorEPT_iiN4mpl_5bool_ILb0EEE[void boost::mpi::detail::broadcast_impl<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >(boost::mpi::communicator const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, int, mpl_::bool_<false>)]+0x27): undefined reference to `boost::mpi::communicator::operator int() const'
-------------------------------------

I "quadruple-checked" that the mpicxx, library, and include directories all belong to the new MPICH2 installation and are not remnants of the old OpenMPI installation. Thus, I don't think this is a binary incompatibility issue. MPICH2 works OK on its own (without Boost), and the old Boost+OpenMPI libraries also still work. But the new Boost+MPICH2 libraries... ihhh :(

Any suggestions and pointers will be appreciated.

Thanks,
-- Aydin

On Oct 21, 2007, at 4:36 AM, Aydın Buluç wrote:
I was a happy user of Boost.MPI running on top of OpenMPI. However, OpenMPI's MPI_THREAD_MULTIPLE support is quite clumsy and hangs frequently (the OpenMPI developers describe it as "lightly tested"). So I installed MPICH2, which claims to support thread-safe MPI.
I've heard that MPICH2's support for MPI_THREAD_MULTIPLE is pretty good. Okay, so just a fair warning... at present, Boost.MPI will *not* work correctly in a multi-threaded context if you're sending serialized data over the wire. It's possible that two concurrent serialized receives on the same communicator and with the same tag could get confused, most likely causing a crash. The solution isn't trivial, but we know how to solve the problem as soon as we have the chance.
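To make the warning above more concrete, here is a small, single-threaded model of how two receives that share a communicator and tag could get confused. It is only a sketch under a stated assumption: that a serialized value travels as two separate messages, a size header followed by the payload. The thread does not describe the wire format, and `Message`, `Mailbox`, `send_serialized`, `take`, and `consistent` are all hypothetical names invented for this illustration; none of them is Boost.MPI API.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical model, not Boost.MPI code: one FIFO mailbox stands for all
// messages that match a single (communicator, tag) pair.
struct Message {
    bool is_header;    // true: carries the payload size; false: carries the payload
    std::string body;
};
using Mailbox = std::deque<Message>;

// Assumed wire format: a size header followed by the payload itself.
void send_serialized(Mailbox& box, const std::string& value) {
    box.push_back(Message{true, std::to_string(value.size())});
    box.push_back(Message{false, value});
}

// Remove and return the oldest matching message.
Message take(Mailbox& box) {
    Message m = box.front();
    box.pop_front();
    return m;
}

// A pair is consistent if the first part is a header, the second part is a
// payload, and the size recorded in the header matches that payload.
bool consistent(const Message& first, const Message& second) {
    return first.is_header && !second.is_header &&
           first.body == std::to_string(second.body.size());
}

// Receivers A and B interleave: A takes one message, then B, then A, then B.
bool interleaved_receives_are_consistent() {
    Mailbox box;
    send_serialized(box, "alpha");
    send_serialized(box, "be");
    Message a1 = take(box);  // A: header for "alpha"
    Message b1 = take(box);  // B: payload "alpha", where B expected a header
    Message a2 = take(box);  // A: header for "be", where A expected a payload
    Message b2 = take(box);  // B: payload "be"
    return consistent(a1, a2) && consistent(b1, b2);
}

// Each receiver takes both parts before the other one starts.
bool exclusive_receives_are_consistent() {
    Mailbox box;
    send_serialized(box, "alpha");
    send_serialized(box, "be");
    Message a1 = take(box);
    Message a2 = take(box);
    Message b1 = take(box);
    Message b2 = take(box);
    return consistent(a1, a2) && consistent(b1, b2);
}
```

In this model, `interleaved_receives_are_consistent()` returns false and `exclusive_receives_are_consistent()` returns true. That suggests, within the model only, that keeping each two-part receive exclusive (for example, one receiving thread per communicator and tag) would sidestep the confusion; whether that is sufficient for real Boost.MPI code is not something the thread confirms.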
-------------------------------------
/tmp/ccbyPLBK.o: In function `void boost::mpi::detail::all_gather_impl<int>(boost::mpi::communicator const&, int const*, int, int*, mpl_::bool_<true>)':all_gather_test.cpp:(.gnu.linkonce.t._ZN5boost3mpi6detail15all_gather_implIiEEvRKNS0_12communicatorEPKT_iPS6_N4mpl_5bool_ILb1EEE[void boost::mpi::detail::all_gather_impl<int>(boost::mpi::communicator const&, int const*, int, int*, mpl_::bool_<true>)]+0x1c): undefined reference to `boost::mpi::communicator::operator int() const'
/tmp/ccbyPLBK.o: In function `void boost::mpi::detail::broadcast_impl<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >(boost::mpi::communicator const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, int, mpl_::bool_<false>)':all_gather_test.cpp:(.gnu.linkonce.t._ZN5boost3mpi6detail14broadcast_implISsEEvRKNS0_12communicatorEPT_iiN4mpl_5bool_ILb0EEE[void boost::mpi::detail::broadcast_impl<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >(boost::mpi::communicator const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, int, mpl_::bool_<false>)]+0x27): undefined reference to `boost::mpi::communicator::operator int() const'
-------------------------------------
I "quadruple-checked" that the mpicxx, library, and include directories all belong to the new MPICH2 installation and are not remnants of the old OpenMPI installation. Thus, I don't think this is a binary incompatibility issue.
I hate to say it, but we might need a quintuple check :)

The implicit conversion from "communicator" to "int" shows that we are certainly getting the MPICH2 headers, because MPI_Comm is just a typedef of int in MPICH2 (in OpenMPI, MPI_Comm is a pointer). However, I'm guessing that the archive that we're linking against was compiled against OpenMPI.

When you rebuilt Boost.MPI, did you remove "bin.v2/libs/mpi"? If not, I suggest that you do so and then rebuild Boost.MPI.

- Doug
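The diagnosis above can be illustrated with a small self-contained sketch. It uses no MPI headers; the two namespaces are hypothetical stand-ins for "headers where the communicator handle is an int" and "headers where the handle is a pointer", and `comm_handle`, `opaque_comm`, and `converts_to_int` are names invented for this example. The point it tries to show is that a conversion operator written in terms of a typedef becomes a different function when the typedef changes, so a library built with the pointer handle would have no `operator int() const` for the linker to find.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for headers where the communicator handle is an int.
namespace int_handle_headers {
    typedef int comm_handle;
    struct communicator {
        comm_handle handle;
        // Plays the role of "communicator::operator int() const".
        operator comm_handle() const { return handle; }
    };
}

// Hypothetical stand-in for headers where the handle is a pointer.
namespace pointer_handle_headers {
    struct opaque_comm;                  // never defined; only pointers are used
    typedef opaque_comm* comm_handle;
    struct communicator {
        comm_handle handle;
        // Same source text as above, but a different function:
        // "communicator::operator opaque_comm*() const".
        operator comm_handle() const { return handle; }
    };
}

// True if the given communicator model offers an implicit conversion to int.
template <typename Communicator>
constexpr bool converts_to_int() {
    return std::is_convertible<Communicator, int>::value;
}

// Code compiled against the int-handle model may rely on operator int().
static_assert(converts_to_int<int_handle_headers::communicator>(),
              "int-handle model should convert to int");

// The pointer-handle model provides no such operator at all.
static_assert(!converts_to_int<pointer_handle_headers::communicator>(),
              "pointer-handle model should not convert to int");
```

Here both variants live in one translation unit, so the mismatch shows up as a compile-time property. In the situation described in the thread, the two variants would sit on opposite sides of the link step (test object file versus previously built library), which is why the same mismatch would surface as an undefined reference instead.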
participants (2)
- Aydın Buluç
- Douglas Gregor