
On Thursday 22 February 2007 19:06 pm, Timmo Stange wrote:
I think we should perhaps first decide on the interface and behavior before we discuss an implementation. That cleanup callback should be optional, without parameters and return value, right? I also think it should be called synchronously directly from disconnect if the slot is not running - otherwise from the executing thread directly after the slot returns.
Yes. I suppose there is the larger issue of whether we should bother at all. The scheme I described could be implemented by the users on their own, by adding their own bogus shared_ptr with custom deleter to the list of tracked objects.
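For illustration, here is a minimal sketch of that do-it-yourself scheme. It is not the thread_safe_signals implementation: `tracked_slot`, `make_cleanup_token` and `run_demo` are hypothetical names, and `std::shared_ptr` stands in for `boost::shared_ptr` so the example is self-contained. The assumption is that the signal keeps only weak_ptrs to tracked objects and locks them for the duration of a slot call, so the deleter runs immediately if the slot is idle and otherwise right after the slot returns, in the executing thread.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one connection that tracks objects by weak_ptr.
struct tracked_slot {
    std::function<void()> body;
    std::vector<std::weak_ptr<void>> tracked;

    // Runs the slot only if every tracked object is still alive.
    bool invoke() const {
        std::vector<std::shared_ptr<void>> locked;
        for (const auto& w : tracked) {
            std::shared_ptr<void> p = w.lock();
            if (!p) return false;  // a tracked object expired: treat as disconnected
            locked.push_back(std::move(p));
        }
        body();
        return true;
        // 'locked' is destroyed here. If the owner dropped its reference while
        // body() was running, the custom deleter fires now, in this thread.
    }
};

// The "bogus" tracked object: a dummy allocation whose deleter runs the cleanup.
std::shared_ptr<void> make_cleanup_token(std::function<void()> cleanup) {
    return std::shared_ptr<void>(new int(0), [cleanup](int* p) {
        delete p;
        cleanup();
    });
}

// Exercises both cases and returns the order of events.
std::vector<std::string> run_demo() {
    std::vector<std::string> log;

    // Case A: the slot is not running, so releasing the token cleans up at once.
    std::shared_ptr<void> token_a =
        make_cleanup_token([&log] { log.push_back("cleanup A"); });
    tracked_slot slot_a;
    slot_a.body = [&log] { log.push_back("slot A running"); };
    slot_a.tracked.push_back(token_a);
    token_a.reset();
    if (!slot_a.invoke()) log.push_back("slot A skipped");

    // Case B: the token is released while the slot runs, so cleanup is deferred.
    std::shared_ptr<void> token_b =
        make_cleanup_token([&log] { log.push_back("cleanup B"); });
    tracked_slot slot_b;
    slot_b.body = [&log, &token_b] {
        token_b.reset();  // "disconnect" from inside the running slot
        log.push_back("slot B running");
    };
    slot_b.tracked.push_back(token_b);
    slot_b.invoke();

    return log;
}
```

Calling `run_demo()` from a `main` of your own gives the event order for both cases. Note that the token costs a heap allocation for the dummy object plus one for the control block, which is relevant to the performance concerns raised below.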
I don't directly see the necessity for a shared_ptr there and I have to express my performance worries again. As nice as shared_ptr's thread-safety is, it doesn't come for free. If you store the callback in a Boost.Function object maintained by a shared pointer, we're talking about a minimum of three heap allocations (2 for the function and 1 for the reference count). With most allocator implementations, those allocations imply process-global synchronization. The smart pointer itself (including all temporary copies and the creation from a weak_ptr) uses atomic reference counting. Both are scalability issues (mostly the synchronization) and exceptionally heavy in simple absolute runtime cost (as much as 50 times the cost of a simple object copy and more, depending on the CPU and system).
I really hate to play the optimization freak here, but we should keep an eye on those things, because if the abstraction the library provides comes at too high a runtime cost, its usability is severely limited. I usually avoid heap allocations in library code and if I need to have them, I try to provide an optional allocator template.
I'm not entirely unsympathetic. When I was benchmarking thread_safe_signals, I found dynamic allocation, mutex locking, and copying shared_ptrs to all be noticeable contributors. Each added on the order of 100 nanoseconds to the average overhead. What surprised me was that when I tried using fast_pool_allocator to speed up allocation of scoped_locks, it didn't help. I assume it was because fast_pool_allocator had to lock a mutex to maintain thread safety.
Could you give a short example of how that would look using bind and a simple free function expecting a pointer or reference to an object that should be tracked?
Attached. -- Frank