[Serialization] (Commented) assertion when using types in multiple DLLs on Windows (Ticket #3934 and #4394)

## Problem:

We were using boost serialization in a DLL that is "statically" loaded into our main application. We are using Boost 1.44 and the serialization library as a DLL (boost_serialization-vc80-mt-gd-1_44.dll).

Now, the serialized types that we are using in this DLL have also been pulled into the main application -- that is, the .h+.cpp files of these types are included and linked in the exe as well as in the DLL project. (But the types are used independently in the exe and the dll module.)

In Boost 1.44 this leads to a runtime assertion in void_cast.cpp [1_44_0, line 230].

## Investigation:

Since this code is in the function called void_caster::recursive_register I *assume* that this is the point where the types registered for serialization are registered with the library.

I have found that this runtime assert was commented out in release 1.45 and all subsequent versions:

    std::pair<void_cast_detail::set_type::const_iterator, bool> result;
    // comment this out for now.
    result = s.insert(this);
    //assert(result.second);

*Apparently* this was done in response to trac ticket #4394[1], at least judging from the comment Robert Ramey added to the ticket:
For now I'm going to comment out the assert. When I have time, I'll likely re-enable it with a way to override it.
I can only assume that ticket #3934[2] refers to the same or a similar problem. ## Issue with the conclusion in #4394 Robert writes in his comment:
The problem is that the way the code is structured, you'll have multiple instances of some functions. The best way would be eliminate the source of the problem by structuring code like this: ...
and he proposes to move the `template<class Archive> void serialize` function of the classes from the header file into the cpp file.

However, *our* code already does this, and we get this assertion regardless. The reason, I suspect, is that both the main module code and the DLL module code call into the recursive_register function to register the type, and since the code for the class is (identically) duplicated in the DLL as well as in the executable, obviously it is registered twice.

## Question: Am I correct?

So, the first question obviously is whether I'm correct in my analysis. I.e., the `assert(result.second);` that is currently commented out can *never* work properly for types that live in two modules (DLLs) in the same process when Boost.Serialization is used as a DLL.

## Question: Any harm done?

Secondly, is there any adverse effect of a type being registered twice with the serialization DLL? What will happen on unload? Is it possible the type is unloaded too early because of this setup?

Note: What about accidentally identical type keys for actually different types in different DLLs?

## Question: Solutions?

What is a proper solution? Should we use dllexport types, so that the type is only registered once per process? What else?

Robert, I hope you can chime in on this. I'll try to add a link to this mail to the trac issue #4394.

cheers,
Martin

[1] : https://svn.boost.org/trac/boost/ticket/4394
[2] : https://svn.boost.org/trac/boost/ticket/3934
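To make the failing check concrete: `std::set::insert` returns a pair whose `bool` is `false` when an equivalent element is already present, and that is the value `assert(result.second)` tested. The sketch below is a simplified stand-in, not Boost's code. `Caster`, `ByTypePair`, `Registry` and `register_caster` are hypothetical names, and ordering the entries by a pair of type keys is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a registry entry; NOT Boost's real type.
struct Caster {
    std::string derived_key;  // e.g. "Derived"
    std::string base_key;     // e.g. "Base"
    std::string module;       // which binary created this object
};

// Assumption: entries are ordered by the type pair only, so two objects
// created in different modules for the same cast compare as equivalent.
struct ByTypePair {
    bool operator()(const Caster* a, const Caster* b) const {
        if (a->derived_key != b->derived_key)
            return a->derived_key < b->derived_key;
        return a->base_key < b->base_key;
    }
};

typedef std::set<const Caster*, ByTypePair> Registry;

// Returns true when the entry was newly inserted. This is the value the
// original assert checked; a second, equivalent entry yields false.
bool register_caster(Registry& registry, const Caster* c) {
    assert(c != nullptr);
    std::pair<Registry::const_iterator, bool> result = registry.insert(c);
    return result.second;
}
```

Registering one `Caster` from the "dll" and an equivalent one from the "exe" makes the second call return `false`, which is the situation in which an unconditional assert would fire.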

Martin B. wrote:
[Serialization] (Commented) assertion when using types in multiple
Robert writes in his comment:
The problem is that the way the code is structured, you'll have multiple instances of some functions. The best way would be eliminate the source of the problem by structuring code like this: ...
and he proposes to move the `template<class Archive> void serialize` function of the classes from the header file into the cpp file.
However, *our* code already does this, and we get this assertion regardless. The reason I suspect is that both the main module code as well as the DLL module code are calling into the recursive_register function to register the type, and since the code for the class is (identically) duplicated in the DLL as well as in the executable, obviously it is registered twice.
You might want to investigate this a little more. The "registration" occurs before main(..) is called. The static object which does this is created as a side-effect of invoking serialization. So if registrations are occurring in multiple modules, I would say it's because invocations are occurring in multiple modules. It has been my belief that structuring the code as above would address this issue.

As I said, that is my belief - I feel that I could be proved wrong and perhaps you've done just that. Or perhaps not. So I would be interested in seeing what you find when you get to the bottom of it. Here are a couple of ideas you might try.

Using the debugger, trap the code at the registration point and verify where each module is invoking the registration from. Also, you can track the registration lookup to verify that the code isn't being called from some unexpected place.

In a large program, it's my custom to create a local library of all my modules and link this code into my executable. This speeds up application compilation/link time and helps keep me from coupling modules in unintended ways. I suspect that many do this for the same reason. The problem is that when one does this, you might be linking the code from the library into both the DLL and the main module without realizing it. If your application is gigantic - which it might well be - this would be hard to find. So you might consider putting the serialization modules in a separate library so that this library is ONLY linked when creating the DLL.

As I said, I believe this assertion can be trapped only if the serialization code is included in multiple modules.
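The "registration before main" mechanism described above can be sketched in plain C++. This is only an illustration of the general pattern (a namespace-scope object whose constructor has a side effect), not the library's implementation; `registry()`, `Registrar` and `derived_registrar` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A function-local static avoids depending on the initialization order
// of namespace-scope objects in other translation units.
std::vector<std::string>& registry() {
    static std::vector<std::string> names;
    return names;
}

// Hypothetical helper: constructing one of these records a type name.
struct Registrar {
    explicit Registrar(const std::string& name) {
        assert(!name.empty());
        registry().push_back(name);
    }
};

// Namespace-scope object: its constructor runs during static
// initialization, i.e. before main() in an executable, or while the
// loader initializes a DLL.
static Registrar derived_registrar("Derived");
```

If a source file containing such an object is compiled into both the executable and a DLL, each module gets its own copy of the object, so the constructor runs once per module. That matches the two call stacks one would expect to see when trapping the registration point in a debugger.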
## Question: Am I correct?
lol - as far as anyone else is. We're sort of in uncharted territory.
So, the first question obviously is whether I'm correct in my analysis. I.e., the `assert(result.second);` that is currently commented out can *never* work properly for types that live in two modules (DLLs) in the same process when Boost.Serialization is used as a DLL.
I'm not sure what "never work" means. I would phrase this as: the assert will trap whenever serialization for the same type is included in more than one module.
## Question: Any harm done?
Secondly, is there any adverse effect of a type being registered twice with the serialization DLL? What will happen on unload? Is it possible the type is unloaded too early because of this setup?
Care has been taken to be sure that this will work as one would expect. That is, when a module is unloaded, the registry entry dropped is the same one that was created when the module was loaded. That is the intention at least. I don't know that it's ever been explicitly verified or tested.
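The difference between "drop whatever is registered for this type" and "drop only the entry this module created" can be shown with a toy registry. This is a hedged sketch and makes no claim about how the library actually stores or removes its entries; `Entry`, `ToyRegistry`, `remove_by_key` and `remove_if_owner` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical record of which module created a registration.
struct Entry {
    std::string module;
};

class ToyRegistry {
public:
    // First registration wins; an equivalent later one is ignored.
    bool add(const std::string& key, const Entry* e) {
        assert(e != nullptr);
        return entries_.insert(std::make_pair(key, e)).second;
    }
    // Naive removal: erases whatever is stored under the key, even if
    // another module created it.
    void remove_by_key(const std::string& key) { entries_.erase(key); }
    // Identity-based removal: erases only if the stored entry is the
    // caller's own object.
    void remove_if_owner(const std::string& key, const Entry* e) {
        auto it = entries_.find(key);
        if (it != entries_.end() && it->second == e)
            entries_.erase(it);
    }
    bool contains(const std::string& key) const {
        return entries_.count(key) != 0;
    }
private:
    std::map<std::string, const Entry*> entries_;
};
```

With `remove_by_key`, a module whose duplicate registration was ignored would still wipe the surviving entry when it unloads. With `remove_if_owner`, its unload is a no-op and the other module's entry stays in place, which corresponds to the intention stated above.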
Note: What about accidentally identical type keys for actually different types in different DLLs?
lol - I would expect that that would be disastrous. The exported type name string has to uniquely identify the type. The whole house of cards depends upon this.

A more interesting question is: What if serialization code for the same type is instantiated in multiple modules? I have taken care that that should create no problem. Again, I haven't explicitly tested this. On the other hand, it doesn't seem that anyone has reported a problem since I commented out the assertion, for whatever that's worth.

My motivation for including the assertion was the possibility that one might have different versions of the serialization code in different DLLs. That seemed to me to be a situation which would/could produce errors which would be almost impossible to debug. Hence the assertion and my advice to structure the code to avoid this kind of problem.

Unfortunately, structuring the code in this way proved to be very burdensome to many developers. I would argue that not being able to structure one's code in this way is an indication that the design has some issue such as circular dependency or undesirable coupling between modules. But I'm not here to argue - just to get the work done, so here we are.
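Why identical keys for different types would be disastrous can be seen in a toy key-to-factory table. This is only an illustration under assumed names (`Item`, `ExeWidget`, `DllWidget`, `KeyRegistry`), not the library's export mechanism: whichever type registers the key first is the one that gets created for every later lookup of that key.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Item {
    virtual ~Item() {}
    virtual std::string real_type() const = 0;
};
// Two unrelated types that (by accident) share the same exported key.
struct ExeWidget : Item {
    std::string real_type() const { return "ExeWidget"; }
};
struct DllWidget : Item {
    std::string real_type() const { return "DllWidget"; }
};

typedef std::unique_ptr<Item> (*Factory)();
std::unique_ptr<Item> make_exe_widget() { return std::unique_ptr<Item>(new ExeWidget); }
std::unique_ptr<Item> make_dll_widget() { return std::unique_ptr<Item>(new DllWidget); }

class KeyRegistry {
public:
    // Returns false if the key is already taken; the old factory stays.
    bool add(const std::string& key, Factory f) {
        assert(f != nullptr);
        return factories_.insert(std::make_pair(key, f)).second;
    }
    // Creates an object for the key, or returns an empty pointer.
    std::unique_ptr<Item> create(const std::string& key) const {
        std::map<std::string, Factory>::const_iterator it = factories_.find(key);
        if (it == factories_.end())
            return std::unique_ptr<Item>();
        return it->second();
    }
private:
    std::map<std::string, Factory> factories_;
};
```

After both types are registered under the key "Widget", a lookup on behalf of `DllWidget` silently produces an `ExeWidget`, which is the kind of hard-to-debug failure the uniqueness requirement is meant to prevent.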
## Question: Solutions?
What is a proper solution? Should we use dllexport types, so that the type is only registered once per process? What else?
I would like to see you invest some more effort to see exactly where the extra registration comes from. I realize that this seems like "extra work" when all you (or your management) want is a fix. But I really think this effort will be rewarded in tracking down an unanticipated dependency or something like that. If that doesn't turn out to be the case, then we (I) will have a counterexample to my belief that structuring can eliminate this and other potential problems.

You've made the effort to structure the code this way. You're 99% of the way to perfection. Don't give up now!

Robert Ramey
Robert, I hope you can chime in on this. I'll try to add a link to this mail to the trac issue #4394.
cheers, Martin
[1] : https://svn.boost.org/trac/boost/ticket/4394 [2] : https://svn.boost.org/trac/boost/ticket/3934

On 13.12.2011 18:22, Robert Ramey wrote:
> Martin B. wrote:
>> [Serialization] (Commented) assertion when using types in multiple
>
>> Robert writes in his comment:
>>
>>> The problem is that the way the code is structured,
>>> you'll have multiple instances of some functions.
>>> The best way would be eliminate the source of the
>>> problem by structuring code like this:
>>> ...
>>
>> and he proposes to move the `template<class Archive> void serialize`
>> function of the classes from the header file into the cpp file.
>>
>> However, *our* code already does this, and we get this assertion
>> regardless. The reason I suspect is that both the main module code as
>> well as the DLL module code are calling into the recursive_register
>> function to register the type, and since the code for the class is
>> (identically) duplicated in the DLL as well as in the executable,
>> obviously it is registered twice.
>
> You might want to investigate this a little more. The "registration"
> occurs before main(..) is called. The static object which does this
> is created as a side-effect of invoking serialization. (...)

The code that does this is invoked as a side effect of BOOST_CLASS_EXPORT_IMPLEMENT. Obviously, I must call this, as otherwise my code wouldn't work anymore when the serialization is actually done :-)

> As I said that is my belief - I feel that I could be proved wrong
> and perhaps you've done just that. or perhaps not. So I would
> be interested in seeing what you find when you get to the bottom
> of it. Here a couple of ideas you might try.
>
> Using the debugger, trap the code at the registration point and
> verify where each module is invoking the registration from. (...)

Done this.
Here are the call stacks from my test project (sorry for the long lines - hope it's not too messed up):

(1) This is from the DLL init
-----------------------------
> boost_serialization-vc80-mt-gd-1_44.dll!boost::serialization::void_cast_detail::void_caster::recursive_register(bool includes_virtual_base=true) Line 229
  BoostSerializeMainDll.dll!boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base>::void_caster_virtual_base<Derived,Base>() Line 238 + 0xd bytes
  BoostSerializeMainDll.dll!boost::serialization::detail::singleton_wrapper<boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base> >::singleton_wrapper<boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base> >() + 0x4a bytes
  BoostSerializeMainDll.dll!boost::serialization::singleton<boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base> >::get_instance() Line 128 + 0x28 bytes
  BoostSerializeMainDll.dll!`dynamic initializer for 'boost::serialization::singleton<boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base> >::instance''() Line 149 + 0x23 bytes
  msvcr80d.dll!_initterm
  BoostSerializeMainDll.dll!_CRT_INIT
  BoostSerializeMainDll.dll!__DllMainCRTStartup
  BoostSerializeMainDll.dll!_DllMainCRTStartup

(2) This is the second place (with the assert), from the exe init
-------------------------------------
> msvcr80d.dll!_wassert(const wchar_t * expr=0x003c3854, const wchar_t * filename=0x003c3808, unsigned int lineno=230) Line 370  C
  boost_serialization-vc80-mt-gd-1_44.dll!boost::serialization::void_cast_detail::void_caster::recursive_register(bool includes_virtual_base=true) Line 230 + 0x1d bytes  C++
  BoostSerializeMainExe.exe!boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base>::void_caster_virtual_base<Derived,Base>() Line 238 + 0xd bytes  C++
  BoostSerializeMainExe.exe!boost::serialization::detail::singleton_wrapper<boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base> >::singleton_wrapper<boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base> >() + 0x4a bytes  C++
  BoostSerializeMainExe.exe!boost::serialization::singleton<boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base> >::get_instance() Line 128 + 0x28 bytes  C++
  BoostSerializeMainExe.exe!`dynamic initializer for 'boost::serialization::singleton<boost::serialization::void_cast_detail::void_caster_virtual_base<Derived,Base> >::instance''() Line 149 + 0x23 bytes  C++
  msvcr80d.dll!_initterm
  BoostSerializeMainExe.exe!__tmainCRTStartup()
  BoostSerializeMainExe.exe!mainCRTStartup()

> In a large program, (...)
> As I said, I believe this assertion can be trapped only if
> the serialization code is included in multiple modules.

The types declared to be serialized *are* included in several modules. But they are used independently.

>> ## Question: Am I correct?
>
> lol - as far as anyone else is. We're sort of in uncharted
> territory.
>
>> So, the first question obviously is whether I'm correct in my
>> analysis.
>> I.e., the `assert(result.second);`that is currently commented out can
>> *never* work properly for types that live in two modules (DLLs) in the
>> same process and Boost:serialization is used as DLL.
>
> I'm not sure what "never work" means. I would phrase this as:
> the assert will trap whenever serialization for the same type is included
> in more than one module.

Yes, this captures it. So what would the solution be? Make sure BOOST_CLASS_EXPORT_IMPLEMENT is only used in one module?

>> ## Question: Any harm done?
>> (...)
>
> Care has been taken to be sure that this will work as one would expect.
> That is, when a module is unloaded, the registry entry dropped is the
> same one that was created when the module was loaded. (...)
OK, that would mean that when the DLL is unloaded, it *also* drops the registration for the type in the executable module, meaning no serialization for the type will work after the DLL has been unloaded, although the type itself is still there.

> (...)
> A more interesting question is: What if serialization code for the same
> is instantiated in multiple modules? I have taken care that that
> should create no problem. Again, I haven't explicitly tested this.
> On the other hand, it doesn't seem that anyone has reported a problem
> since I commented out the assertion. for whatever that's worth.
> My motivation for including the assertion was the possibility
> that one might have different versions of the serialization
> code in different DLLS. That seemed to me to be a situation
> which would/could errors which would be almost impossible
> to debug. Hence the assertion and my advice to structure
> the code to avoid this kind of problem. (...)

As far as I can see, the only way to avoid this problem would be for the types to be serialized (or at least their serialization code) to be put in a separate, "third" DLL that could be used by the DLLs and the exe module. This is overhead, but might be well worth a try. (Though I'm not sure how well this would work together with the helper macros and all the templates in Boost Serialization -- I haven't tried.)

>> ## Question: Solutions?
>> (...)
> I would like to see you invest some more effort to see
> exactly where the extra registration comes from.

Well, see my call stack above and I'll dump a source code sketch below. This does reproduce the thing for me, but other than ignoring the assert (upgrading to boost 1.48) I haven't tried any approaches yet.

cheers,
Martin

ps: Sourcecode (files separated by ++++++++)

a) 2 Projects, one executable and one DLL that is "statically" used by the exe.
   EXE: Types.cpp main.cpp
   DLL: Types.cpp dll_main.cpp

b) An abstract base class and a derived class that are used in the EXE as well as in the DLL, but are compiled and linked for both.

c) The exe uses these types in serialization and the DLL as well.

++++++++
// Types.h
#pragma warning( disable : 4250 )
#define BOOST_SERIALIZATION_DYN_LINK
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/scoped_ptr.hpp>
#include <boost/serialization/export.hpp>

class Base {
public:
    virtual ~Base() {}

    virtual void DoStuff() = 0;

private:
    friend class boost::serialization::access;

    template<class Archive>
    void serialize(Archive & ar, const unsigned int file_version);
};
BOOST_SERIALIZATION_ASSUME_ABSTRACT(Base);

struct SData {
    boost::scoped_ptr<Base> obj;
};
BOOST_CLASS_TRACKING(SData, boost::serialization::track_never);

namespace boost {
namespace serialization {

template<class Archive>
void serialize(Archive & ar, SData & data, const unsigned int version)
{
    assert(version == 0);
    ar & data.obj;
}

} // namespace serialization
} // namespace boost

class Derived : virtual public Base {
public:
    Derived();

    virtual void DoStuff();

private:
    int m_number;

private:
    friend class boost::serialization::access;

    template<class Archive>
    void serialize(Archive & ar, const unsigned int file_version);
};

BOOST_CLASS_EXPORT_KEY(Derived);

++++++++
// Types.cpp
#include "Types.h"

template<class Archive>
void Base::serialize(Archive & ar, const unsigned int file_version)
{
    /*empty*/
}

BOOST_CLASS_EXPORT_IMPLEMENT(Derived);

Derived::Derived()
: m_number(42)
{ }

void Derived::DoStuff() {
    std::cout << static_cast<void*>(this) << " - Derived::DoStuff:\n"
              << " current value = " << m_number << "\n";
    ++m_number;
    std::cout << " new value = " << m_number << "\n";
}

template<class Archive>
void Derived::serialize(Archive & ar, const unsigned int version)
{
    ar & boost::serialization::base_object<Base>(*this);

    assert(version == 0);
    ar & m_number;
}

++++++++
// main.cpp (executable)
#include ...
#include "Types.h"

int main()
{
    SData d1;
    d1.obj.reset(new Derived());
    d1.obj->DoStuff();

    std::ostringstream out_buf;
    boost::archive::text_oarchive out_archive(out_buf);
    out_archive << d1;
    const std::string data = out_buf.str();

    std::istringstream in_buf(data);
    boost::archive::text_iarchive in_archive(in_buf);
    SData d2;
    in_archive >> d2;

    d2.obj->DoStuff();
    d1.obj->DoStuff();
}

++++++++
// dll_main.cpp
#include ...
#include "Types.h"

void dll_fn_that_uses_serialization()
{
    SData d1;
    d1.obj.reset(new Derived());
    d1.obj->DoStuff();

    std::ostringstream out_buf;
    boost::archive::text_oarchive out_archive(out_buf);
    out_archive << d1;
    const std::string data = out_buf.str();

    std::istringstream in_buf(data);
    boost::archive::text_iarchive in_archive(in_buf);
    SData d2;
    in_archive >> d2;

    d2.obj->DoStuff();
    d1.obj->DoStuff();
}

On 14.12.2011 09:03, Martin B. wrote:
On 13.12.2011 18:22, Robert Ramey wrote:
Martin B. wrote:
[Serialization] (Commented) assertion when using types in multiple (...) (...) ## Question: Solutions? (...) I would like to see you invest some more effort to see exactly where the extra registration comes from.
Well, see my call stack above and I'll dump a source code sketch below. This does reproduce the thing for me, but other than ignoring the assert (upgrading to boost 1.48) I haven't tried any approaches yet.
Okay. Now, for this test project of mine, I have been able to make it "just work" without the assertion by:

1. Removing the Types.cpp file from the executable (this will obviously initially give an unresolved external warning for Derived::* members).

2. Making Derived a DLL export class, that is:

       class DLL_API Derived : ...

   with the usual

       #ifdef DLL_EXPORTS
       #define DLL_API __declspec(dllexport)
       #else
       #define DLL_API __declspec(dllimport)
       #endif

To my surprise it compiles, links, and runs correctly. (And the type is only registered once, namely from the DLL.)

Now I'll just have to check whether this solution is applicable to our real projects, where we currently use BOOST_CLASS_EXPORT() instead of the split versions and we have a multitude of derived classes.

So maybe the assertion made sense after all. (Though I think this should be configurable on a type-by-type basis should it ever be re-enabled.)

cheers,
Martin
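The two-step fix described in this message (one definition of the type, living in the DLL and exported from it) could look roughly like the single-file sketch below. It leaves out the Boost.Serialization parts so that it compiles on its own; in the real project the `BOOST_CLASS_EXPORT_KEY` / `BOOST_CLASS_EXPORT_IMPLEMENT` lines would stay in Types.h / Types.cpp as in the posted source. The non-Windows fallback, the `#define DLL_EXPORTS` line (normally a DLL project setting) and the `int` return value of `DoStuff` are assumptions added only for this illustration.

```cpp
#include <cassert>

// Assumption for this one-file sketch: normally DLL_EXPORTS is defined by
// the DLL project's build settings and is absent when building the exe.
#define DLL_EXPORTS

#if defined(_WIN32)
#  ifdef DLL_EXPORTS
#    define DLL_API __declspec(dllexport)  // building the DLL
#  else
#    define DLL_API __declspec(dllimport)  // building a client (the exe)
#  endif
#else
#  define DLL_API  // assumption: no decoration needed off Windows
#endif

// --- Types.h (included by both the DLL and the exe) ---
class DLL_API Derived {
public:
    Derived();
    int DoStuff();  // simplified: returns the new value instead of printing
private:
    int m_number;
};

// --- Types.cpp (compiled into the DLL ONLY) ---
Derived::Derived() : m_number(42) {
    assert(m_number == 42);
}

int Derived::DoStuff() {
    ++m_number;
    return m_number;
}
```

Because the executable no longer compiles Types.cpp, it contains no second copy of the type's static registration objects; it only imports the members that the DLL exports.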

On 14.12.2011 09:19, Martin B. wrote:
On 14.12.2011 09:03, Martin B. wrote:
On 13.12.2011 18:22, Robert Ramey wrote:
Martin B. wrote:
[Serialization] (Commented) assertion when using types in multiple (...) (...) ## Question: Solutions? (...) I would like to see you invest some more effort to see exactly where the extra registration comes from.
Well, see my call stack above and I'll dump a source code sketch below. This does reproduce the thing for me, but other than ignoring the assert (upgrading to boost 1.48) I haven't tried any approaches yet.
Okay. Now, for this my testproject, I have been able to make it "just work" without the assertion by:
1. Removing the Types.cpp file from the executable (this will obviously initially give an unresolved external warning for Derived::* members)
2.) Make Derived a DLL export class, that is: class DLL_API Derived : ... with the usual (...)
To my surprise it compiles, links, and runs correctly. (And the type is only registered once, namely from the DLL.)
Now I'll just have to check if this solution can be applicable for our real projects, where we currently use BOOST_CLASS_EXPORT() instead of the split versions and we have a multitude of derived classes.
So maybe the assertion made sense after all. (Though I think this should be configurable on a type by type basis should it ever be reenabled.)
OK, we have now adapted our real project types and it seems it's working out well.

So, to summarize:

* The notification *does* make a kind of sense, although it really isn't an error[1] but a nice warning, and so the *unconditional* assert is rather inappropriate, I think.

* I guess the situation as it stands with the disabled assert is better than the situation before with the unconditionally enabled assert. For future improvements, I think that the BOOST_CLASS_EXPORT_* macros need to provide a way to specify the desired checking level on a case-by-case basis. (And the appropriate documentation for the firing assertion.)

* Making these types DLL-exported and defining their implementation only in the DLL worked for us and, in general, should probably work for most types where this problem occurs, because in this context you already have the situation of multiple DLLs+EXE, and you're just moving the implementation of the type to a single place.

* [Speculation on my part:] If the types live in a static lib that is used by different modules of the same process, things do get more complicated, as it then should be more difficult to DLL-export the types. But maybe one could work around this by not using the serialization lib as a DLL but instead as a static lib. (Obviously this would lead to code bloat, but you're already living with duplicated implementation code for the types in each module in this case anyway.)

[1]: Wrt. "not an error" - The assertion was harmless for our code, but the situation could (I have not tested it) be potentially disastrous for dynamically loaded and unloaded DLLs that use the same types as another module in the process. If I understood correctly, if the serialization registration is done twice for a type, the first DLL that uses it and is then unloaded will unregister the type, thus breaking serialization for this type for all other modules. I haven't reproduced this, so it's largely hypothetical at this point (are there any open tickets wrt. this?).
I'm also still unsure if only those types that use BOOST_CLASS_EXPORT are affected by this problem. cheers, Martin
participants (2): Martin B., Robert Ramey