[shared_ptr] A smarter smart pointer proposal for dynamic libraries

I have a suggestion for the shared pointer implementation to overcome a severe limitation it currently has regarding dynamically unloaded libraries. I posted a little while back regarding this problem, which is illustrated by the following example from a Microsoft Windows environment. I have a simple factory interface like so:

    class Factory
    {
    public:
        virtual shared_ptr<int> GetIntSharedPtr() = 0;
        virtual int * GetIntRawPtr() = 0;
    };

I implement a factory DLL that provides one of these factories through an exported function GetFactory(), like so:

    /// Implementation of the Factory interface
    class TestFactory : public Factory
    {
    public:
        virtual shared_ptr<int> GetIntSharedPtr() { return shared_ptr<int>( new int(5) ); }
        virtual int * GetIntRawPtr() { return new int(0); }
    };

    /// Library instance of the factory implementation
    TestFactory g_Factory;

    /// Exported function to return the factory for this library
    extern "C" _declspec(dllexport) Factory & GetFactory() { return g_Factory; }

This is compiled into a library named "TestFactory.dll". Now I create a simple command line application that does the following:

    /// Type definition for the GetFactory exported function
    typedef Factory & (_cdecl * GetFactory)();

    int main()
    {
        // dynamically load the TestFactory library
        HMODULE hFactoryDll = ::LoadLibrary( "TestFactory.dll" );

        // get the GetFactory() interface from the loaded library
        GetFactory pfnGetFactory =
            reinterpret_cast<GetFactory>( ::GetProcAddress( hFactoryDll, "GetFactory" ) );

        // Acquire the factory object
        Factory & factory = pfnGetFactory();

        // Call the factory interfaces to get dynamically allocated integers
        int * pRawInt = factory.GetIntRawPtr();
        shared_ptr<int> spInt = factory.GetIntSharedPtr();

        // everything is fine so far, now unload the factory library
        ::FreeLibrary( hFactoryDll );

        // deallocating the raw pointer received from the factory works just fine
        delete pRawInt; // THIS WORKS!

        // However, due to the reasons outlined below, releasing the shared_ptr
        // causes an access violation
        spInt.reset(); // THIS CRASHES!
    }

The basic problem is that the shared_ptr implementation uses virtual functions in the sp_counted_base class, which gets generated in the factory DLL since it is a template instantiation. Once that factory DLL is unloaded, the virtual function tables for all shared_ptrs it generated are garbage. In my opinion the virtual function implementation in the template is an unnecessary design flaw that creates this limitation. I also think the reference counter simply does too much. I would propose an alternative design that eliminates the use of virtual functions in template-generated base classes, lets the outer smart pointer class handle the deallocation chores, and makes the reference counter just a simple reference counter. This could eliminate the limitation illustrated in this simple and very common use case.

Ignoring a bunch of stuff right now, like custom deallocators, weak references, and some basic members like assignment, here is a simple reference-counted smart pointer that is able to handle this library unloading example without a problem, just as you could with a raw pointer:

    /// Reference counter THAT ONLY REFERENCE COUNTS AND HAS NO VTABLE
    class SimpleRefCounter
    {
    public:
        /// Constructor initializes reference count (0 by default)
        SimpleRefCounter( long count = 0 ) : m_Count(count) {}

        /// Increments the reference count and returns it
        long Increment() { return ::InterlockedIncrement( &m_Count ); }

        /// Decrements the reference count and returns it
        long Decrement() { return ::InterlockedDecrement( &m_Count ); }

        /// Returns the current reference count
        long Count() const { return m_Count; }

    private:
        long m_Count;
    };

    /// Simple smart pointer template that handles the deallocation chores and
    /// therefore can be used safely across dynamic DLL boundaries
    template < typename _type >
    class SmarterPtr
    {
    public:
        /// Default constructor creates a null pointer
        SmarterPtr() : m_pValue( NULL ), m_pRefCounter( NULL ) {}

        /// Constructor taking an allocated value to manage
        SmarterPtr( _type * p_pValue ) : m_pValue( p_pValue )
        {
            m_pRefCounter = new SimpleRefCounter( 1 );
        }

        /// Copy constructor
        SmarterPtr( const SmarterPtr & p_Original )
            : m_pValue( p_Original.m_pValue ), m_pRefCounter( p_Original.m_pRefCounter )
        {
            // a null SmarterPtr has no counter to increment
            if ( m_pRefCounter )
                m_pRefCounter->Increment();
        }

        /// Destructor releases
        ~SmarterPtr() { reset(); }

        /// Resets the managed pointer to the passed in one, releasing the
        /// currently managed pointer and deallocating if necessary
        void reset( _type * pNewValue = NULL )
        {
            if ( pNewValue != m_pValue )
            {
                if ( m_pValue && 0 == m_pRefCounter->Decrement() )
                {
                    // need to deallocate current value
                    delete m_pValue;
                    delete m_pRefCounter; // ignore weak ref for now
                }

                // detach from the old value whether or not it was deallocated
                m_pValue = NULL;
                m_pRefCounter = NULL;

                if ( pNewValue )
                {
                    m_pRefCounter = new SimpleRefCounter( 1 );
                    m_pValue = pNewValue;
                }
            }
        }

    private:
        _type * m_pValue;
        SimpleRefCounter * m_pRefCounter;
    };

Again, the key is no VTABLE and deallocation handled by the exposed templates. Why shouldn't shared_ptr move to an equivalent model so it can be used safely across dynamic DLL boundaries?

J.D.
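A minimal, portable sketch of the mechanism behind the crash, assuming std::shared_ptr as a stand-in for boost::shared_ptr (both capture the destruction logic when the pointer is created). Tracker and make_erased are hypothetical names used only for illustration. The sketch shows that the code which destroys the pointee is chosen at the creation site, not at the release site, so whichever module created the shared_ptr has to supply code at release time.

```cpp
#include <cassert>
#include <memory>

// Hypothetical pointee that records its own destruction.
struct Tracker {
    explicit Tracker(bool& destroyed) : destroyed_(destroyed) {}
    ~Tracker() { destroyed_ = true; }
    bool& destroyed_;
};

// The creator hands back shared_ptr<void>: the caller no longer knows the
// pointee's type, yet the last release still runs ~Tracker(), because the
// control block built here remembers how to destroy the object.
inline std::shared_ptr<void> make_erased(bool& destroyed) {
    return std::shared_ptr<Tracker>(new Tracker(destroyed));
}
```

If make_erased lived in a library that had since been unloaded, the final release would have to jump into code that is no longer mapped, which matches the access violation described above.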

As far as I can tell, the only reason your reset() appears to work is that it is inlined in the executable. The compiler doesn't guarantee inlining, so in general it will crash anyway.

The solution to this problem is to inject the DLL's lifetime into any shared_ptrs obtained from that DLL; that is, make the shared_ptr keep the DLL afloat. This can be done easily (and non-intrusively) using shared_ptr's aliasing constructor.

Emil Dotchevski
Reverge Studios, Inc.
http://www.revergestudios.com/reblog/index.php?n=ReCode
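One way the aliasing-constructor idea could be sketched, in portable C++ with std::shared_ptr standing in for boost::shared_ptr. FakeModule, Pinned and pin_to_module are hypothetical names, and FakeModule only simulates a library handle; on Windows its destructor would presumably call ::FreeLibrary. This is an illustration under those assumptions, not necessarily the exact arrangement suggested above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a loaded library. The destructor plays the
// role of unloading the module.
struct FakeModule {
    explicit FakeModule(bool& unloaded) : unloaded_(unloaded) {}
    ~FakeModule() { unloaded_ = true; }
    bool& unloaded_;
};

// Holds the module first and the object second. Members are destroyed in
// reverse declaration order, so the object is released while the module is
// still loaded, and the module is released afterwards.
template <typename T>
struct Pinned {
    std::shared_ptr<FakeModule> module;
    std::shared_ptr<T> object;
};

template <typename T>
std::shared_ptr<T> pin_to_module(std::shared_ptr<FakeModule> module,
                                 std::shared_ptr<T> object) {
    T* raw = object.get();
    auto holder = std::make_shared<Pinned<T>>(
        Pinned<T>{std::move(module), std::move(object)});
    // Aliasing constructor: shares ownership with holder but points at raw.
    return std::shared_ptr<T>(holder, raw);
}
```

The caller still sees an ordinary shared_ptr<T>; dropping its own module handle no longer unloads anything until the last pinned pointer is gone.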

Yes, that's the point: it's inlined wherever the shared pointer template is instantiated for the type. Being a template, it has to work that way. Since that code is the same whether it's instantiated in the unloaded DLL or the executable, everything works.

Your solution is fine except that it places an extra burden on the architecture that in my opinion is unnecessary. The shared_ptr should behave as much like a raw pointer as possible, and this would be another step closer to that. Also, we have use cases that need to unload factories while what they produced could still be floating around the system.

J.D.

On Tue, Dec 23, 2008 at 9:37 PM, Emil Dotchevski <emildotchevski@gmail.com> wrote:
> As far as I can tell, the only reason your reset() appears to work is that it is inlined in the executable. The compiler doesn't guarantee inlining, so in general it will crash anyway.
>
> The solution to this problem is to inject the DLL's lifetime into any shared_ptrs obtained from that DLL; that is, make the shared_ptr keep the DLL afloat. This can be done easily (and non-intrusively) using shared_ptr's aliasing constructor.

On Tue, Dec 23, 2008 at 11:27 PM, J.D. Herron <jotadehace@gmail.com> wrote:
> On Tue, Dec 23, 2008 at 9:37 PM, Emil Dotchevski <emildotchevski@gmail.com> wrote:
>> As far as I can tell, the only reason your reset() appears to work is that it is inlined in the executable. The compiler doesn't guarantee inlining, so in general it will crash anyway.
>>
>> The solution to this problem is to inject the DLL's lifetime into any shared_ptrs obtained from that DLL; that is, make the shared_ptr keep the DLL afloat. This can be done easily (and non-intrusively) using shared_ptr's aliasing constructor.
>
> Yes, that's the point: it's inlined wherever the shared pointer template is instantiated for the type. Being a template, it has to work that way. Since that code is the same whether it's instantiated in the unloaded DLL or the executable, everything works.

The problem is that the code that maintains the refcount and releases the object isn't a template, and so it may or may not be inlined -- that's entirely up to the compiler.

> Your solution is fine except that it places an extra burden on the architecture

It places the burden of returning a shared_ptr with the correct lifetime on the factory itself, which is another way of saying that the factory needs to know that it deals with an unloadable module. Can this be avoided? The user code only deals with shared_ptr<foo> and is insulated from knowing anything about DLLs.

Emil Dotchevski
Reverge Studios, Inc.
http://www.revergestudios.com/reblog/index.php?n=ReCode

on Tue Dec 23 2008, "J.D. Herron" <jotadehace-AT-gmail.com> wrote:
> Yes, that's the point: it's inlined wherever the shared pointer template is instantiated for the type. Being a template, it has to work that way.

Actually not. Some compilers do link-time instantiation.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

>> Yes, that's the point: it's inlined wherever the shared pointer template is instantiated for the type. Being a template, it has to work that way.
>
> Actually not. Some compilers do link-time instantiation.

My reply was poorly worded. Of course the compiler may choose to generate function calls for an instantiated type, and could do so for both the executable and the dynamically loaded DLL in my simple example. Fine. My point is that even if that were the case, it doesn't hurt my simple smart pointer alternative, because the function pointers that might exist in the factory are not carried over into the executable and therefore would pose no problem. The only way function pointers could be carried over would be via a VTABLE, which does not exist here. So, if the compiler chose to create a function call rather than straight inlining, it still works, because the exe calls the function generated in the exe, which does the right thing. There is no call back into the dynamic DLL function, as would be the case for a generated VTABLE.

I will have to think some more about the claims that there is no way to support custom allocators and deleters without a VTABLE. I agree completely that these are essential features. I need to learn more about how exactly these are implemented and why a VTABLE would be essential.

On Thu, Dec 25, 2008 at 3:06 AM, J.D. Herron <jotadehace@gmail.com> wrote:
> My reply was poorly worded. Of course the compiler may choose to generate function calls for an instantiated type, and could do so for both the executable and the dynamically loaded DLL in my simple example. Fine. My point is that even if that were the case, it doesn't hurt my simple smart pointer alternative, because the function pointers that might exist in the factory are not carried over into the executable and therefore would pose no problem.

In general, the caller needs to get the function pointers, or vtables, or boost::function objects, or whatever else I might have stuffed in a custom deleter, not to mention ("a pointer to") the destructor of the pointee. Thus, the DLL must remain loaded.

(Note that even if the factory doesn't return a shared_ptr with custom deleter, the fact that it returns a shared_ptr means that it reserves the right to do so occasionally or in a future implementation.)

Emil Dotchevski
Reverge Studios, Inc.
http://www.revergestudios.com/reblog/index.php?n=ReCode
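A small sketch of the point about custom deleters, assuming std::shared_ptr in place of boost::shared_ptr. The names deleter_calls, creator_delete, make_plain and make_with_deleter are hypothetical. Both factories return the same type, shared_ptr<int>, so the caller cannot tell from the signature whether releasing the pointer will call back into creator-supplied code.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter so the effect of the deleter can be observed.
inline int& deleter_calls() {
    static int calls = 0;
    return calls;
}

// Hypothetical creator-side cleanup routine. In the DLL scenario this
// function's code would live inside the factory library.
inline void creator_delete(int* p) {
    ++deleter_calls();
    delete p;
}

// Returns a shared_ptr that uses the default deleter.
inline std::shared_ptr<int> make_plain() {
    return std::shared_ptr<int>(new int(1));
}

// Returns the same static type, but the control block now stores a pointer
// to creator_delete and calls it on the last release.
inline std::shared_ptr<int> make_with_deleter() {
    return std::shared_ptr<int>(new int(2), &creator_delete);
}
```

Releasing the second pointer from the caller's side runs creator_delete, which is only safe while the code that defines it is still loaded.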

J.D. Herron wrote:
> I have a suggestion for the shared pointer implementation to overcome a severe limitation it currently has regarding dynamically unloaded libraries.

[snip]

> The basic problem is that the shared_ptr implementation uses virtual functions in the sp_counted_base class, which gets generated in the factory DLL since it is a template instantiation. Once that factory DLL is unloaded, the virtual function tables for all shared_ptrs it generated are garbage. In my opinion the virtual function implementation in the template is an unnecessary design flaw that creates this limitation. I also think the reference counter simply does too much. I would propose an alternative design that eliminates the use of virtual functions in template-generated base classes, lets the outer smart pointer class handle the deallocation chores, and makes the reference counter just a simple reference counter. This could eliminate the limitation illustrated in this simple and very common use case.
>
> Ignoring a bunch of stuff right now, like custom deallocators, weak references, and some basic members like assignment, here is a simple reference-counted smart pointer that is able to handle this library unloading example without a problem, just as you could with a raw pointer:

[snip]

> Again, the key is no VTABLE and deallocation handled by the exposed templates. Why shouldn't shared_ptr move to an equivalent model so it can be used safely across dynamic DLL boundaries?
You answered this question yourself: this approach ignores custom allocators and deleters, which cannot be supported without virtual functions or an equivalent. I would not trade these features for better support for DLLs, and I'm sure there are others who share this opinion. So I don't think that chopping away current (and widely used) functionality for the sake of this improvement is the right direction to go.

There are different solutions for your problem; some of them were shown by others (like controlling the DLL lifetime with shared_ptrs). For example, you can use intrusive_ptr for reference counting without virtual functions and, in fact, without additional memory allocation. Or, if you don't like the intrusive approach, you can construct shared_ptrs in inline functions in the factory interface:

    struct Factory
    {
        shared_ptr< Object > Create()
        {
            return shared_ptr< Object >(CreateImpl());
        }

    private:
        virtual Object* CreateImpl() = 0;
    };

But this will only save you if the calling DLL stays loaded (the factory DLL need not).
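To illustrate the intrusive approach, here is a hand-rolled sketch in standard C++. It is not boost::intrusive_ptr itself; Counted, CountedPtr, add_ref and release are hypothetical names. The count lives inside the object, so there is no separate control block and no virtual call, and because the helpers are inline, the release code is compiled into whichever module uses the pointer. The sketch assumes all modules involved agree on operator new/delete.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical pointee that carries its own reference count.
struct Counted {
    std::atomic<long> refs{0};
    int value = 0;
};

inline void add_ref(Counted* p) {
    p->refs.fetch_add(1, std::memory_order_relaxed);
}

// Deletes the object when the last reference goes away.
inline void release(Counted* p) {
    if (p->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) delete p;
}

// Minimal intrusive smart pointer: no control block, no vtable.
class CountedPtr {
public:
    CountedPtr() = default;
    explicit CountedPtr(Counted* p) : p_(p) { if (p_) add_ref(p_); }
    CountedPtr(const CountedPtr& o) : p_(o.p_) { if (p_) add_ref(p_); }
    CountedPtr(CountedPtr&& o) noexcept : p_(o.p_) { o.p_ = nullptr; }
    // Copy-and-swap: the old pointee is released when o is destroyed.
    CountedPtr& operator=(CountedPtr o) noexcept {
        std::swap(p_, o.p_);
        return *this;
    }
    ~CountedPtr() { if (p_) release(p_); }

    Counted* get() const { return p_; }
    Counted* operator->() const { return p_; }

private:
    Counted* p_ = nullptr;
};
```

A factory would then return a raw Counted* (or a CountedPtr built inline on the caller's side), and nothing from the factory module is needed to release it.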

J.D. Herron:
> I have a suggestion for the shared pointer implementation to overcome a severe limitation it currently has regarding dynamically unloaded libraries. I posted a little while back regarding this problem, which is illustrated by the following example from a Microsoft Windows environment. I have a simple factory interface like so:
>
>     class Factory
>     {
>     public:
>         virtual shared_ptr<int> GetIntSharedPtr() = 0;
>         virtual int * GetIntRawPtr() = 0;
>     };

...

It is indeed a limitation of shared_ptr that it doesn't work when its creator code is unloaded. In this respect it is similar to the following raw pointer code:

    class X
    {
        ~X();
        // ...
    };

    class XFactory
    {
    public:
        virtual X * GetRawPtr() = 0;
    };

in which case ~X resides in the DLL.

The advantage of the current shared_ptr code is that both the creation and the destruction of the pointee happen at the same place: inside the DLL. This allows it to support DLLs which do not use the same operator new/delete/malloc/free as the main executable.

The general way to solve this problem is as Emil suggests: extend the lifetime of the DLL so that it isn't unloaded while its code is needed.

If you just need support for ints/PODs and can guarantee that the EXE and the DLL agree on operator new/delete, you ought to return an auto_ptr<> from the factory. You can assign the auto_ptr to a shared_ptr on the client side and it will be deleted properly.

Returning an auto_ptr (or, in the new C++, a unique_ptr) is a way to state to the caller that the pointee can be delete'd from its side, which is what you want. A shared_ptr return implies that the creator retains the right to dispose of the pointee, and the exact way to do so is unspecified (it can even retain its own shared_ptr or weak_ptr to it for some reason if the program logic requires). This in turn implies that the code of the creator must remain in memory. :-)

--
Peter Dimov
http://www.pdplayer.com
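A minimal sketch of the hand-off described above, assuming std::unique_ptr and std::shared_ptr in place of auto_ptr and boost::shared_ptr. The function name make_int is hypothetical. The factory gives up ownership, and the shared_ptr, together with its control block, is created on the caller's side.

```cpp
#include <cassert>
#include <memory>

// Hypothetical factory function. Returning unique_ptr tells the caller that
// it may delete the pointee itself, which assumes both sides agree on
// operator new/delete.
inline std::unique_ptr<int> make_int() {
    return std::unique_ptr<int>(new int(5));
}
```

On the client side, `std::shared_ptr<int> sp = make_int();` takes over ownership, and the code that later releases the int is generated where that conversion happens rather than inside the factory.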
participants (5)
- Andrey Semashev
- David Abrahams
- Emil Dotchevski
- J.D. Herron
- Peter Dimov