[Serialization] Segfault while serializing derived pointers using multi DLLs
Hi all (and particularly Boost/(De-)Serializer),

I use Boost 1.44, gcc 4.4.1, Linux.

The problem: I have two home-made libraries compiled as DLLs under Linux:
- 'datatools' provides the 'libdatatools.so' DLL
- 'brio' provides the 'libbrio.so' DLL

I use Boost serialization features for derived pointers. Below are the details:

STEP 1:

'datatools' is the base library. It defines:
- its own namespace: 'datatools'
- a virtual class (interface) named 'i_serializable' from which all other serializable classes should inherit in order to benefit from the (de)serialization mechanism through a pointer to this base class. [see http://www.boost.org/doc/libs/1_46_1/libs/serialization/doc/serialization.ht...]
- some concrete classes (A, B, C) that inherit from the 'i_serializable' interface and register themselves using the export key features described in http://www.boost.org/doc/libs/1_46_1/libs/serialization/doc/special.html#exp...

The typical inheritance diagram looks like:
<pre>
           datatools::i_serializable
                       |
        +--------------+--------------+
        |              |              |
  datatools::A   datatools::B   datatools::C
        |
  datatools::A'
</pre>

Here is the typical model of the 'A.hpp' header file for the A class:
<pre>
...
#include <datatools/serialization/archives_list.hpp> // include Boost/Serialization text/XML/binary archives
#include <datatools/serialization/i_serializable.hpp> // include the abstract mother interface class
...
namespace datatools {
  class A : public i_serializable
  {
    // blah-blah..
    // no inline code (from http://www.boost.org/doc/libs/1_46_1/libs/serialization/doc/special.html#dll...)
    template<class Archive>
    void serialize (Archive & ar, const unsigned int version);
  };
}
// register the class with a specific GUID:
BOOST_CLASS_EXPORT_KEY2 (datatools::A, "datatools::A");
</pre>

Here is the model of the 'A.cpp' implementation file:
<pre>
namespace datatools {
...
template<class Archive>
void A::serialize (Archive & ar, const unsigned int version)
{
  ar & boost::serialization::make_nvp(
         "datatools__serialization__i_serializable",
         boost::serialization::base_object<datatools::serialization::i_serializable>(*this));
  // ar & more data (with NVP stuff)...
}
} // end of namespace datatools

BOOST_CLASS_EXPORT_IMPLEMENT(datatools::A)

// explicit instantiation for all kinds of known archives:
#include <datatools/serialization/archives_list.hpp> // include the known text/XML/binary archives
template void datatools::A::serialize(boost::archive::text_oarchive & ar, const unsigned int version);
template void datatools::A::serialize(boost::archive::text_iarchive & ar, const unsigned int version);
template void datatools::A::serialize(boost::archive::xml_oarchive & ar, const unsigned int version);
// ... more...
</pre>

Finally, I can compile all this stuff using gcc and build the 'libdatatools.so' library, which I prepend to my LD_LIBRARY_PATH. Everything looks fine. A test program 'prg1.cpp' that links against only 'libdatatools.so' and 'libboost_serialization.so' works perfectly, serializing and deserializing any collection of pointers to A, B, or C classes without problem. A must! Thanks to Robert for that magic!

At this point, everything looks (is?) fine. Note that I have followed (in principle) all the guidelines provided by Robert.

STEP 2:

Now let's consider the actual problem! As said before, my 'datatools' library is the base of a modular project with some other libraries that depend on 'datatools' (and Boost/Serialization). The 'brio' library is such a beast:
<pre>
Boost/Serialization
        |
    datatools
        |
      brio
</pre>

It has its own namespace: 'brio'. It provides a few other dedicated classes, inherited from the 'datatools::i_serializable' abstract class, which are serializable via Boost.

Let's consider the serializable 'brio::D' class, designed on the model of 'datatools::A' and using the same implementation recommendations. I have followed the guidelines used for the 'datatools::A' class to write both the 'D.hpp' and 'D.cpp' files.
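The header/source split used for 'A' (declare the 'serialize' template in the header, then define it and explicitly instantiate it in the .cpp file) can be illustrated without Boost. What follows is only a stdlib-only sketch under stated assumptions: 'text_out' and 'xml_out' are hypothetical stand-ins for archive classes, not Boost types, and this 'A' has a single made-up member 'x'.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-ins for archive classes (NOT Boost archives).
struct text_out {
    std::ostringstream os;
    template <class T>
    text_out& operator&(const T& value) { os << value << ' '; return *this; }
};

struct xml_out {
    std::ostringstream os;
    template <class T>
    xml_out& operator&(const T& value) { os << "<v>" << value << "</v>"; return *this; }
};

// --- what would live in the header: declaration only, no inline body ---
class A {
public:
    int x = 42;  // made-up payload for the sketch
    template <class Archive>
    void serialize(Archive& ar, const unsigned int version);
};

// --- what would live in the .cpp file: the definition ... ---
template <class Archive>
void A::serialize(Archive& ar, const unsigned int /*version*/) {
    ar & x;
}

// --- ... plus one explicit instantiation per supported archive type ---
template void A::serialize(text_out& ar, const unsigned int version);
template void A::serialize(xml_out& ar, const unsigned int version);

// Small helpers so the result can be inspected.
std::string save_as_text(A& a) { text_out ar; a.serialize(ar, 0); return ar.os.str(); }
std::string save_as_xml(A& a) { xml_out ar; a.serialize(ar, 0); return ar.os.str(); }
```

With this layout, code that only sees the header can call 'serialize' solely for the archive types that were explicitly instantiated; any other archive type would fail at link time, which is why every known archive has to be listed in the .cpp file.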
Now the inheritance scheme is:
<pre>
           datatools::i_serializable               :
                       |                           :
        +--------------+--------------+------------+-----+- - - -
        |              |              |            :     |
  datatools::A   datatools::B   datatools::C       :  brio::D
        |                                          :
  datatools::A'                                    :
            libdatatools.so scope                  :  libbrio.so scope
                                                   :
</pre>

I can compile the 'libbrio.so' DLL without any problem.

Now I want to run a sample program 'prg2.cpp' that performs some (de)serialization operations on a collection of pointers to 'datatools::A', 'datatools::B', 'datatools::C' AND 'brio::D' instances. This program is linked against the following libraries (among others):
- libbrio.so
- libdatatools.so
- libboost_serialization.so

Well, it compiles perfectly. Note that this program links against third-party libraries too; some of them explicitly use 'dlopen' and 'dlclose' to satisfy internal and critical features that are out of my scope. I have no idea whether this can have side effects.

However, when I run it, I observe the following behaviour:
- all (de)serialization operations are done properly and I get files with embedded (text/XML...) portable archives that can be reloaded without problem.
- at the END of the program, while some cleanup code is invoked (some kind of deeply buried code beyond my skills and understanding), I get a segmentation fault.

Here is a dump of the GDB backtrace:
<pre>
Program received signal SIGSEGV, Segmentation fault.
0x02647c78 in boost::serialization::typeid_system::extended_type_info_typeid_0::is_less_than(boost::serialization::extended_type_info const&) const () from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
(gdb) bt
#0  0x02647c78 in boost::serialization::typeid_system::extended_type_info_typeid_0::is_less_than(boost::serialization::extended_type_info const&) const () from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#1  0x0264737b in boost::serialization::extended_type_info::operator<(boost::serialization::extended_type_info const&) const () from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#2  0x0264dcac in boost::serialization::void_cast_detail::void_caster::operator<(boost::serialization::void_cast_detail::void_caster const&) const () from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#3  0x0264e56d in boost::serialization::void_cast_detail::void_caster::recursive_unregister() const () from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#4  0x0264ed8d in boost::serialization::void_cast_detail::void_caster_shortcut::~void_caster_shortcut() () from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#5  0x0264e5ee in boost::serialization::void_cast_detail::void_caster::recursive_unregister() const () from /scratch/sw/boost/install-1_44_0-Linux-i686-gcc44/lib/libboost_serialization.so.1.44.0
#6  0x024a6da7 in boost::serialization::void_cast_detail::void_caster_primitive<datatools::test::more_data_t, datatools::test::data_t>::~void_caster_primitive() () from /home/mauger/Private/Work/lpc_nemo_svn/sw/datatools/datatools_trunk/Linux-i686/lib/libdatatools.so
#7  0x024a6f04 in boost::serialization::detail::singleton_wrapper<boost::serialization::void_cast_detail::void_caster_primitive<datatools::test::more_data_t, datatools::test::data_t> >::~singleton_wrapper() () from /home/mauger/Private/Work/lpc_nemo_svn/sw/datatools/datatools_trunk/Linux-i686/lib/libdatatools.so
#8  0x0298c428 in __cxa_finalize (d=0x2609830) at cxa_finalize.c:56
#9  0x02426f04 in __do_global_dtors_aux () from /home/mauger/Private/Work/lpc_nemo_svn/sw/datatools/datatools_trunk/Linux-i686/lib/libdatatools.so
#10 0x02534100 in _fini () from /home/mauger/Private/Work/lpc_nemo_svn/sw/datatools/datatools_trunk/Linux-i686/lib/libdatatools.so
#11 0x0011dee6 in _dl_fini () at dl-fini.c:248
#12 0x0298c05f in __run_exit_handlers (status=0, listp=0x2a9e304, run_list_atexit=true) at exit.c:78
#13 0x0298c0cf in *__GI_exit (status=0) at exit.c:100
#14 0x02973b5e in __libc_start_main (main=0x8059495 <main>, argc=1, ubp_av=0xbfffcfd4, init=0x8061c40 <__libc_csu_init>, fini=0x8061c30 <__libc_csu_fini>, rtld_fini=0x11dcc0 <_dl_fini>, stack_end=0xbfffcfcc) at libc-start.c:252
#15 0x080592c1 in _start () at ../sysdeps/i386/elf/start.S:119
</pre>

If one ignores the nasty details of this stack (local paths and names), one observes that the problem seems to be related to the unregistration of some Boost/Serialization material. It occurs while the executable is trying to destroy a singleton_wrapper template class instance that manages some serializable classes from the 'datatools' library:
- class 'datatools::test::more_data_t' (call it A')
- and its mother class 'datatools::test::data_t' (call it A), inherited from 'datatools::i_serializable'.

I expect such a singleton is a static instance attached to some DLL. Am I wrong? If not, which DLL is concerned: 'libdatatools.so' or 'libbrio.so'?

My feeling is that I have a problem with some arbitrary order of library unloading and the messy unregistration that comes with it. Unless there is a specific order in which to aggregate modules within a DLL (A.o B.o A'.o...). Unfortunately, my skills are too limited to form a better idea and find a solution.
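To make the suspected failure mode concrete, here is a minimal stdlib-only sketch of a registry singleton whose registrants unregister themselves in their destructors. It is not Boost code and does not claim to reproduce Boost's internals: 'registry', 'registrant' and the 'is_destroyed' guard are hypothetical names chosen for illustration. The point is only that an unregistration running after the registry has been destroyed touches a dead object unless it is guarded.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Hypothetical registry singleton (a model, not Boost's implementation).
class registry {
public:
    static registry& instance() {
        static registry r;  // destroyed during static destruction at exit
        return r;
    }
    // Safe to query even after the registry object itself is gone,
    // because the flag is a trivially destructible static bool.
    static bool is_destroyed() { return destroyed_flag(); }

    void add(const std::string* entry) { entries_.insert(entry); }
    void remove(const std::string* entry) { entries_.erase(entry); }
    std::size_t size() const { return entries_.size(); }

private:
    registry() { destroyed_flag() = false; }
    ~registry() { destroyed_flag() = true; }
    static bool& destroyed_flag() {
        static bool flag = false;
        return flag;
    }
    std::set<const std::string*> entries_;
};

// Hypothetical registrant: registers on construction, unregisters on
// destruction, but only if the registry is still alive.
class registrant {
public:
    explicit registrant(std::string guid) : guid_(std::move(guid)) {
        registry::instance().add(&guid_);
    }
    ~registrant() {
        // Without this guard, a registrant destroyed after the registry
        // (for example because its library is finalized later) would
        // access a destroyed std::set.
        if (!registry::is_destroyed()) {
            registry::instance().remove(&guid_);
        }
    }
    registrant(const registrant&) = delete;
    registrant& operator=(const registrant&) = delete;

private:
    std::string guid_;
};
```

In this model, a 'registrant' living in one shared library and a 'registry' living in another can be finalized in either order, so the guard (or a guaranteed unloading order) is what keeps the destructor from dereferencing freed state.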
There are some comments by Robert concerning such possible problems, but I'm not sure they apply in my case.

So I would really appreciate it if someone could advise me and possibly give me some hints.

Thanks a lot for your attention and help. Apologies for this rather long and technical issue.

Regards
frc
--
François Mauger
Groupe "Interactions Fondamentales et Nature du Neutrino"
NEMO-3/SuperNEMO Collaboration
LPC Caen-CNRS/IN2P3-UCBN-ENSICAEN
Département de Physique -- Université de Caen Basse-Normandie
Adresse/address:
Laboratoire de Physique Corpusculaire de Caen (UMR 6534)
ENSICAEN
6, Boulevard du Marechal Juin
14050 CAEN Cedex
FRANCE
Courriel/e-mail: mauger@lpccaen.in2p3.fr
Tél./phone: 02 31 45 25 12 / (+33) 2 31 45 25 12
Fax: 02 31 45 25 49 / (+33) 2 31 45 25 49