[smart_ptr] Getting segfaults with weak_ptr<T>::lock()
I've run into a problem where a weak_ptr<> is segfaulting when I try
to do a lock() call on it. It doesn't happen all the time, and I'm at
a loss to see what's causing the problem.
I have an implementation of the Observer pattern in my code using
shared_ptr/weak_ptr:
class Observer {
public:
virtual void update(shared_ptr<Observed> d) = 0;
...
}
class Observed : public enable_shared_from_this<Observed> {
vector
Buddha Buck wrote: ...
My solution was:
template <typename T> class wxObserver : public Observer { private: T* p; wxObserver(); public: wxObserver(T* p) : p(p); void update(shared_ptr<Observed> o) { if (p) p->update(o); } }
class MyWidget : public wxWidget { private: shared_ptr
observer; public: MyWidget(...) : wxWidget(...), observer(this) { ... observed->register(observer); ... } ... } The idea is that when a MyWidget is created, it creates a shared pointer to a wxObserver and can register that wxObserver with observed objects without a problem. When wxWidgets chooses to delete the MyWidget, the shared pointer the MyWidget goes away, the wxObserver is deleted, and the weak pointers held by the observed objects expire. The implementation of Observed::notifyObservers() checks to make sure that the observers still exist before update()ing them, and only the live observers are updated.
The problem is that sometimes, after wxWidgets destroys a MyWidget, calls to Observed::notifyObservers() crashes with a segfault within the call to i->lock().
I think I see one problem in your code, but it cannot explain a segfault within i->lock(). When MyWidget is destroyed, if at the exact same time an Observed locks its weak_ptr to the wxObserver, the T* p pointer will dangle. Can you post a stack trace? The problem you describe sounds like a race between notifyObservers and ~Observed or register, is that possible?
On 7/10/07, Peter Dimov
Buddha Buck wrote: ...
My solution was:
template <typename T> class wxObserver : public Observer { private: T* p; wxObserver(); public: wxObserver(T* p) : p(p); void update(shared_ptr<Observed> o) { if (p) p->update(o); } }
class MyWidget : public wxWidget { private: shared_ptr
observer; public: MyWidget(...) : wxWidget(...), observer(this) { ... observed->register(observer); ... } ... } The idea is that when a MyWidget is created, it creates a shared pointer to a wxObserver and can register that wxObserver with observed objects without a problem. When wxWidgets chooses to delete the MyWidget, the shared pointer the MyWidget goes away, the wxObserver is deleted, and the weak pointers held by the observed objects expire. The implementation of Observed::notifyObservers() checks to make sure that the observers still exist before update()ing them, and only the live observers are updated.
The problem is that sometimes, after wxWidgets destroys a MyWidget, calls to Observed::notifyObservers() crashes with a segfault within the call to i->lock().
I think I see one problem in your code, but it cannot explain a segfault within i->lock(). When MyWidget is destroyed, if at the exact same time an Observed locks its weak_ptr to the wxObserver, the T* p pointer will dangle.
I can see that, and I don't see an easy solution. I tried modifying ~MyWidget so that it explicitly set the T* p pointer to NULL, but that didn't change the behavior.
Can you post a stack trace? The problem you describe sounds like a race between notifyObservers and ~Observed or register, is that possible?
A stack trace is below this paragraph. It seems impossible for there to be a race between anything and ~Observed, since the Observed in this case (an instance of the ObjectDB class) is a long-lasting object that shouldn't be deleted until the end of the application. gdb shows only one thread in the application, which makes race conditions seem even more unlikely to me. (One note: my Observed class is really called "Subject", which is why the trace below includes Subject::notifyObservers() and not Observed::notifyObservers(). #0 0x000000000041c711 in boost::detail::atomic_conditional_increment (pw=0x2b3ba59cdae8) at vendor/include/boost/detail/sp_counted_base_gcc_x86.hpp:92 #1 0x000000000041c743 in boost::detail::sp_counted_base::add_ref_lock (this=0x2b3ba59cdae0) at vendor/include/boost/detail/sp_counted_base_gcc_x86.hpp:138 #2 0x000000000041d7a1 in shared_count (this=0x7fff083b7678, r=@0x70f4a78) at vendor/include/boost/detail/shared_count.hpp:362 #3 0x00002b3ba3bcb6ff in shared_ptr<Observer> (this=0x7fff083b7670, r=@0x70f4a70) at vendor/include/boost/shared_ptr.hpp:187 #4 0x00002b3ba3bcb745 in boost::weak_ptr<Observer>::lock (this=0x70f4a70) at vendor/include/boost/weak_ptr.hpp:106 #5 0x00002b3ba3bcac9b in Subject::notifyObservers (this=0x5a91e0) at src/core/Observer.cpp:40 #6 0x00002b3ba3bc320d in ObjectDB::createObject (this=0x5a91e0, parent=@0x7fff083b7900, type=@0x7fff083b78f0, label=@0x7fff083b78e0, _guid=@0x7fff083b78d0) at src/core/ObjectDB.cpp:101
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Buddha Buck wrote:
#0 0x000000000041c711 in boost::detail::atomic_conditional_increment (pw=0x2b3ba59cdae8) at vendor/include/boost/detail/sp_counted_base_gcc_x86.hpp:92 #1 0x000000000041c743 in boost::detail::sp_counted_base::add_ref_lock (this=0x2b3ba59cdae0) at vendor/include/boost/detail/sp_counted_base_gcc_x86.hpp:138 #2 0x000000000041d7a1 in shared_count (this=0x7fff083b7678, r=@0x70f4a78) at vendor/include/boost/detail/shared_count.hpp:362 #3 0x00002b3ba3bcb6ff in shared_ptr<Observer> (this=0x7fff083b7670, r=@0x70f4a70) at vendor/include/boost/shared_ptr.hpp:187 #4 0x00002b3ba3bcb745 in boost::weak_ptr<Observer>::lock (this=0x70f4a70) at vendor/include/boost/weak_ptr.hpp:106 #5 0x00002b3ba3bcac9b in Subject::notifyObservers (this=0x5a91e0) at src/core/Observer.cpp:40 #6 0x00002b3ba3bc320d in ObjectDB::createObject (this=0x5a91e0, parent=@0x7fff083b7900, type=@0x7fff083b78f0, label=@0x7fff083b78e0, _guid=@0x7fff083b78d0) at src/core/ObjectDB.cpp:101
Very odd address space layout; the shared_ptr code at 0x00002b3ba3bcb6ff and the shared_count/sp_counted_base code at 0x000000000041c711. The 'this' pointer for the sp_counted_base object - 0x2b3ba59cdae0 - seems suspicious, too close to code, too far from the this pointer for the weak_ptr (0x70f4a70) which should also be on the heap. Hard to say where things go wrong. Can you post the assembly around the faulting location? Might it be possible for the two code segments, 2b3ba3.. and 41c.., to be coming from two separate versions of Boost?
On 7/11/07, Peter Dimov
Very odd address space layout; the shared_ptr code at 0x00002b3ba3bcb6ff and the shared_count/sp_counted_base code at 0x000000000041c711. The 'this' pointer for the sp_counted_base object - 0x2b3ba59cdae0 - seems suspicious, too close to code, too far from the this pointer for the weak_ptr (0x70f4a70) which should also be on the heap. Hard to say where things go wrong. Can you post the assembly around the faulting location? Might it be possible for the two code segments, 2b3ba3.. and 41c.., to be coming from two separate versions of Boost?
I found the problem, and it is such a stupid cause I hesitate to
mention it here for fear of looking foolish.
If you recall, my notifyObservers() function looked like:
void Observed::notifyObservers()
{
vector
participants (2)
-
Buddha Buck
-
Peter Dimov