shared_ptr in multithreaded environments

I see that shared_ptr is quite costly in multithreaded environments because it would need DCAS, and thus uses a spinlock in cases where it is not available. intrusive_ptr, which is only one word, does not need that. Shouldn't intrusive_ptr be put forward more, then? Or better, why not make a shared_ptr-like shared_obj, a container responsible for allocating the object and thus able to allocate a struct { T value; int refcount; } instead of a bare T, and advocate its usage? Such an interface could also trivially use a good garbage collection algorithm instead of refcounting. It seems to me that shared_ptr was made inefficient by design, which is not a really good thing since it is getting more and more popular. Could someone confirm or refute this?
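To make the proposed layout concrete, here is a minimal sketch of what such a shared_obj might look like. Everything in it is hypothetical (the class name comes from the post, but the interface, the `make` factory and the use of C++11 `std::atomic` are assumptions for illustration); it is not an existing Boost component.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical shared_obj<T>: the value and its count share one allocation,
// and the handle itself is a single pointer (one word).
template <class T>
class shared_obj {
    struct block {
        T value;
        std::atomic<int> refcount;
        template <class... Args>
        explicit block(Args&&... args)
            : value(std::forward<Args>(args)...), refcount(1) {}
    };

    block* b_;

    explicit shared_obj(block* b) : b_(b) {}

    void release() {
        // The owner that drops the count from 1 to 0 frees the block.
        if (b_ && b_->refcount.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete b_;
    }

public:
    // The container allocates the object itself, so it controls the layout.
    template <class... Args>
    static shared_obj make(Args&&... args) {
        return shared_obj(new block(std::forward<Args>(args)...));
    }

    shared_obj(const shared_obj& o) : b_(o.b_) {
        b_->refcount.fetch_add(1, std::memory_order_relaxed);
    }

    shared_obj& operator=(const shared_obj& o) {
        shared_obj tmp(o);       // take a reference to the new block
        std::swap(b_, tmp.b_);   // tmp now releases the old block
        return *this;
    }

    ~shared_obj() { release(); }

    T& operator*() const { return b_->value; }
    T* operator->() const { return &b_->value; }
    int use_count() const { return b_->refcount.load(std::memory_order_relaxed); }
};
```

With this sketch, `shared_obj<int>::make(7)` yields a handle whose `use_count()` is 1, and copying the handle raises it to 2 until the copy is destroyed. The count is a single word updated with ordinary atomic increments and decrements.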

On Mon, Mar 10, 2008 at 3:18 PM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
I see that shared_ptr is quite costly in multithreaded environments because it would need DCAS, and thus uses a spinlock in cases where it is not available.
If you pass shared_ptr objects by const &, the refcount doesn't need to be updated. Move semantics would also let operations such as inserting into std::vector<shared_ptr<foo> > avoid updating the refcount. Until then, in most cases you can use std::deque<shared_ptr<foo> > instead. Do you have a specific use case which demonstrates that shared_ptr's refcounting is the bottleneck? If you do, you can post it here for discussion and/or a possible fix. -- Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode
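The effect of passing by const & can be observed through use_count(). The following is a small sketch, assuming C++11 `std::shared_ptr` as a stand-in for `boost::shared_ptr`; `foo` and both function names are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct foo { int value = 0; };

// Taking the pointer by const reference creates no new owner,
// so the call does not modify the reference count.
long count_seen_by_ref(const std::shared_ptr<foo>& p) {
    return p.use_count();
}

// Taking the pointer by value copies it: the count is incremented on
// entry and decremented again when the parameter is destroyed.
long count_seen_by_value(std::shared_ptr<foo> p) {
    return p.use_count();
}
```

For a pointer `p` with a single owner, `count_seen_by_ref(p)` reports 1 while `count_seen_by_value(p)` reports 2. Passing `std::move(p)` to the by-value version would transfer ownership instead of copying, and the count would stay at 1.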

On Mar 10, 2008, at 6:27 PM, Emil Dotchevski wrote:
On Mon, Mar 10, 2008 at 3:18 PM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
I see that shared_ptr is quite costly in multithreaded environments because it would need DCAS, and thus uses a spinlock in cases where it is not available.
If you pass shared_ptr objects by const &, the refcount doesn't need to be updated. ... [snip]
Yes, but be careful that the lifetime of the shared_ptr object passed as const & must be guaranteed to outlive the const &, otherwise you have a dangling reference. That's guaranteed if the caller owns a copy of the shared_ptr, but not if it passes a reference to, e.g., a shared_ptr owned by an external data structure (such as a queue of shared_ptr objects) that can be modified by several threads or even by the current thread. I've been bitten by that: with shared_ptrs to callbacks, you dequeue the callback while another thread is still processing it, and it just happens to destroy all the bound parameters in the callback... oops! In one pernicious instance, it was the callback itself that flushed the queue and destroyed the shared_ptr that it had a const& to... -- Hervé Brönnimann hervebronnimann@mac.com
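One way to avoid the self-destroying callback described above is to take a copy of the shared_ptr before invoking it. This is only a sketch under assumptions: it uses C++11 `std::shared_ptr` and `std::function`, and `callback_queue` and `run_front` are hypothetical names, not the code from the incident.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>

using callback = std::function<void()>;
using callback_queue = std::deque<std::shared_ptr<callback>>;

// Hypothetical dispatcher: runs the callback at the front of the queue.
// Copying the shared_ptr gives this function its own ownership, so the
// callback object survives even if the callback clears the queue.
// Binding `const std::shared_ptr<callback>&` to q.front() instead would
// leave a dangling reference once the queue element is destroyed.
// Returns the number of owners left after the call.
long run_front(callback_queue& q) {
    std::shared_ptr<callback> keep_alive = q.front(); // copy, not const &
    (*keep_alive)();                                  // may modify q
    return keep_alive.use_count();
}
```

If the callback flushes the queue, `run_front` returns 1: the local copy is the only remaining owner, and the callback is destroyed only after it has finished running. This sketch covers the single-threaded case; a queue shared between threads would additionally need synchronization around `q.front()`.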

On Mar 10, 2008, at 6:27 PM, Emil Dotchevski wrote:
On Mon, Mar 10, 2008 at 3:18 PM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
I see that shared_ptr is quite costly in multithreaded environments because it would need DCAS, and thus uses a spinlock in cases where it is not available.
If you pass shared_ptr objects by const &, the refcount doesn't need to be updated. ... [snip]
Yes, but be careful that the lifetime of the shared_ptr object passed as const & must be guaranteed to outlive the const &, otherwise you have a dangling reference.
This is true, but it is also true for any object you pass by reference. It is even true for the this pointer in a member function. Regardless, in general, objects of user-defined types are best passed by const reference; the same is true for shared_ptr. -- Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode

On Mon, Mar 10, 2008 at 6:18 PM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
It seems to me that shared_ptr was made inefficient by design, which is not a really good thing since it is getting more and more popular.
Could someone confirm or refute this?
shared_ptr is the only one with the oft-undervalued weak_ptr, and lets you say "here, use this" with no additional changes. It's not excessively inefficient, so will usually be "good enough", and for those people that need something else, there are other options available. Also, "As a general rule, if it isn't obvious whether intrusive_ptr better fits your needs than shared_ptr, try a shared_ptr-based design first" (http://boost.org/libs/smart_ptr/intrusive_ptr.html).
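For readers unfamiliar with weak_ptr, here is a brief sketch of what it offers: a non-owning observer that can tell whether the object still exists. It assumes C++11 `std::weak_ptr` in place of `boost::weak_ptr`, and `still_alive` is a hypothetical helper.

```cpp
#include <cassert>
#include <memory>

// Returns true if the object observed by `w` is still alive.
// lock() yields an owning shared_ptr, or an empty one once the last
// shared_ptr owner has gone away.
bool still_alive(const std::weak_ptr<int>& w) {
    std::shared_ptr<int> p = w.lock();
    return static_cast<bool>(p);
}
```

A weak_ptr assigned from a live shared_ptr does not add to use_count(), and `still_alive` turns false as soon as the last owner is destroyed.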

Peter Dimov wrote
Mathias Gaunard:
I see that shared_ptr is quite costly in multithreaded environments because it would need DCAS, and thus uses a spinlock in cases where it is not available.
Where do you see that? It's not true.
AFAIK, the current shared_ptr is not lock-free at all. Discussions about making it lock-free said it would require DWCAS (cmpxchg16b on x86-64). It seems, however, that this wasn't implemented.

Mathias Gaunard:
Peter Dimov wrote
Mathias Gaunard:
I see that shared_ptr is quite costly in multithreaded environments because it would need DCAS, and thus uses a spinlock in cases where it is not available.
Where do you see that? It's not true.
AFAIK, the current shared_ptr is not lock-free at all.
It is lock-free on most platforms. Look at the code, particularly boost/detail/sp_counted_base*.hpp.
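As a rough illustration of why copying and destroying owners needs no double-word CAS, a reference count can be kept in a single word and updated with atomic increments and decrements. The sketch below is hypothetical: it uses C++11 `std::atomic` and is not the code in boost/detail/sp_counted_base*.hpp, which uses platform-specific primitives.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical single-word reference counter, starting with one owner.
class ref_counter {
    std::atomic<long> count_{1};

public:
    // A new owner only needs an atomic increment.
    void add_ref() { count_.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when the caller dropped the last reference and is
    // therefore responsible for destroying the object.
    bool release() {
        return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    long use_count() const { return count_.load(std::memory_order_relaxed); }

    // Whether the platform implements these operations without a lock.
    bool lock_free() const { return count_.is_lock_free(); }
};
```

Whether `lock_free()` reports true depends on the platform and compiler, which matches the "on most platforms" qualification above.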

Participants (5):
- Emil Dotchevski
- Hervé Brönnimann
- Mathias Gaunard
- Peter Dimov
- Scott McMurray