[proposal] Atomically Thread-Safe Static Initialization For Boost...

Here is the C++ pseudo-code:

namespace atomic {

template<typename T> class var;
template<typename T> class once;

namespace mb {
  enum naked_e { naked };
  enum fence_e { fence };
  enum release_e { release };
  enum depends_e { depends };
  enum acquire_depends_e { acquire_depends };
}

namespace externlib_api {
  class hashed_mutex;
  template<typename T> extern T load(T volatile*, mb::naked_e) throw();
  template<typename T> extern T load(T volatile*, mb::depends_e) throw();
  template<typename T> extern void store(T volatile*, T const&, mb::release_e) throw();
  template<typename T> extern void store(T volatile*, T const&, mb::naked_e) throw();
  template<typename T> extern bool cas(T volatile*, T const&, T const&, mb::acquire_depends_e) throw();
  template<typename T> extern bool cas(T volatile*, T const&, T const&, mb::fence_e) throw();
}

template<typename T>
class var {
  mutable T volatile m_state;

public:
  var() throw() {}
  var(T const &state) throw() : m_state(state) {}

public:
  inline T load(mb::depends_e) const throw() { return externlib_api::load(&m_state, mb::depends); }
  inline T load(mb::naked_e) const throw() { return externlib_api::load(&m_state, mb::naked); }

public:
  inline void store(T const &xchg, mb::release_e) throw() { externlib_api::store(&m_state, xchg, mb::release); }
  inline void store(T const &xchg, mb::naked_e) throw() { externlib_api::store(&m_state, xchg, mb::naked); }

public:
  inline bool cas(T const &cmp, T const &xchg, mb::fence_e) throw() { return externlib_api::cas(&m_state, cmp, xchg, mb::fence); }
  inline bool cas(T const &cmp, T const &xchg, mb::acquire_depends_e) throw() { return externlib_api::cas(&m_state, cmp, xchg, mb::acquire_depends); }
};

template<typename T>
class once {
  var<T*> m_state;
  var<intword_t> m_count;

public:
  once() throw() : m_state(0), m_count(1) {}

  ~once() throw() {
    if (try_dec()) {
      try { delete m_state.load(mb::depends); }
      catch(...) { assert(false); throw; }
    }
  }

private:
  bool try_dec() throw() {
    intword_t local;
    do {
      local = m_count.load(mb::naked);
      if (! local || local < 1) { return false; }
      // use mb::fence to cover *both* acquire and release
      // wrt the result of the decrement
    } while(! m_count.cas(local, local - 1, mb::fence));
    return (local == 1);
  }

  bool try_inc() throw() {
    intword_t local;
    do {
      local = m_count.load(mb::naked);
      if (! local || local < 1) { return false; }
    } while(! m_count.cas(local, local + 1, mb::acquire_depends));
    return true;
  }

public:
  // atomic load / thread-safety: strong
  T* load() {
    T *local = m_state.load(mb::depends);
    if (! local) {
      externlib_api::hashed_mutex::guard_t const &lock(this);
      if (! try_inc()) { return 0; }
      local = m_state.load(mb::naked);
      if (! local) {
        // call try_dec on exceptions...
        local = new T;
        m_state.store(local, mb::release);
      }
    } else if(! try_inc()) { return 0; }
    return local;
  }

  // dec the refcount
  void dec() {
    if (try_dec()) { delete m_state.load(mb::depends); }
  }
};

} // namespace atomic

Okay, here is sample usage now:

static atomic::once<foo> g_foo;

void some_threads(...) {
  for(;;) {
    // whatever...
    foo *myfoo = g_foo.load();
    if (myfoo) {
      // use myfoo...
      // okay, we are finished with myfoo
      g_foo.dec();
    }
  }
}

This should be a workable solution to the static initialization problem... What do you all think?

Darn, I forgot to take out the redundant assertions again! Okay, every line in the pseudo-code that looks like this:
if (! local || local < 1) { return false; }
can look like this:
if (local < 1) { return false; }

Chris Thomasson wrote:
> Here is the C++ pseudo-code:
Please don't get me wrong. I very much appreciate your efforts, but since I am very short of time (I think I am not the only one), could you please also use some plain English words to describe your idea, so one can judge if it is worth spending the time studying your code.
Also, I did not see any response from you on how your proposal deviates from previous similar ones with respect to:
- The initialization order problem, i.e. being able to be called from within global ctors.
- Solving destruction of static objects (preferably at process shutdown).
Thank you,
Roland

"Roland Schwarz" <roland.schwarz@chello.at> wrote in message news:4549A086.3050005@chello.at...
> Chris Thomasson wrote:
>> Here is the C++ pseudo-code:
> Please don't get me wrong. I very much appreciate your efforts, but since I am very short of time (I think I am not the only one), could you please also use some plain English words to describe your idea, so one can judge if it is worth spending the time studying your code.
I am very sorry here... I too am sometimes short of time, so I have to blast some pseudo-code out in order to get my point across, rather than doing a quick write-up... I promise you that once I get my atomic::once out of the pseudo-code stage, and transform it into a fully working prototype, it will have documentation... ;^)
> Also I did not see any response from you how your proposal deviates from previous similar ones with respect to:
> Initialization order problem, i.e. being able to be called from within global ctors.
I removed the ctor from atomic::once, but I need to think about that some more here...
> Solving destruction of static objects (preferably at process shutdown).
I am using an augmented type of reference counting for this. It will dtor the static object only during/after the dtor for atomic::once runs and when the count drops to (-1). I used to wait until it dropped to zero in the version with the ctor because I was using it to init to 1.

Chris Thomasson wrote:
> I promise you that once I get my atomic::once out of the pseudo-code stage, and transform it into a fully working prototype it will have documentation...
I would be interested in why you think once is superior to a statically initializable mutex. My personal belief is that if we have a clean (static) mutex we should prefer it, because it is one concept less the programmer has to learn and remember. The "once" concept was and is just a work-around for a deficiency.
Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
> I would be interested in why you think once is superior to a statically initializable mutex?
> My personal belief is, if we have a clean (static) mutex we should prefer it, because it is one concept less the programmer has to learn and remember.
> The "once" concept was and is just a work-around for a deficiency.
"Once" is more low-level. It can be used to initialize a mutex, *or any other type of object*. Yes, if you have a static mutex, you can lock the mutex around the initialization of another object, but that's unnecessarily complex:

void f()
{
    static mutex m;
    m.lock();
    static some_class x;
    m.unlock();
    // access x
}

You can't use a scoped_lock for this unless it has an unlock() member (which IMO defeats the point of it being scoped), since otherwise you end up serializing the "access x" part too.
Anthony
--
Anthony Williams
Software Developer
Just Software Solutions Ltd
http://www.justsoftwaresolutions.co.uk

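For readers following along today: the "once" primitive Anthony argues for is available in standard C++ as std::call_once with std::once_flag (C++11 and later). The sketch below is only an illustration under that assumption, not code from this thread; `some_class`, `instance`, `run_threads` and `g_constructions` are hypothetical names. It shows initialization being serialized while the "access x" part is not.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical payload type; counts how often it is constructed.
static std::atomic<int> g_constructions(0);

struct some_class {
    some_class() { ++g_constructions; }
    int value() const { return 42; }
};

// Returns the shared instance, constructing it exactly once.
some_class& instance() {
    static std::once_flag flag;        // constant-initialized, so no race on the flag itself
    static some_class* ptr = nullptr;  // zero-initialized
    std::call_once(flag, [] { ptr = new some_class; });  // intentionally never deleted
    return *ptr;                       // callers are not serialized past this point
}

// Calls instance() from n threads and returns the construction count.
int run_threads(int n) {
    std::vector<std::thread> pool;
    for (int i = 0; i < n; ++i)
        pool.emplace_back([] { (void)instance().value(); });
    for (auto& t : pool) t.join();
    return g_constructions.load();
}
```

The object is deliberately leaked here, which sidesteps the destruction-order question that the rest of the thread is about.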
Anthony Williams wrote:
> "Once" is more low-level. It can be used to initialize a mutex, *or any other type of object*. Yes, if you have a static mutex, you can lock the mutex around the initialization of another object, but that's unnecessarily complex:
Sorry, but I disagree. My reasoning is that only one concept is better than two for the same purpose, i.e. lock, instead of two, i.e. lock and once. Once you understand locking, you necessarily understand protection by locks. On the other hand, you will need to look up what once exactly is about the first time you encounter it. To me at least, its purpose was not obvious on first encounter.
> You can't use a scoped_lock for this unless it has an unlock() member (which IMO defeats the point of it being scoped), since otherwise you end up serializing the "access x" part too.
I don't think so:

void f()
{
    static mutex m;
    scoped_lock lk(m);
    static some_class x;
    m.unlock();
    // access x
}

You cannot defeat the purpose of scoped_lock! In this case it is conditionally unlocked when the scope is left. This, in the first place, is why Peter Dimov came up with an example that proved to me that you need to be able to do locking directly on the mutex too. This also is why the scoped_lock isn't thread safe. In the destructor it has to check an unprotected variable that tests the current lock status.
Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
> Anthony Williams wrote:
>> "Once" is more low-level. It can be used to initialize a mutex, *or any other type of object*. Yes, if you have a static mutex, you can lock the mutex around the initialization of another object, but that's unnecessarily complex:
> Sorry, but I disagree. My reasoning is that only one concept is better than two for the same purpose, i.e. lock, instead of two, i.e. lock and once.
> Once you understand locking, you necessarily understand protection by locks. On the other hand you will need to look up what once exactly is about the first time you encounter it. To me at least its purpose was not obvious on first encounter.
OK, call it "locked initialization", or "thread-safe initialization" or "synchronized initialization". IMO, the concept of "once" is more fundamental than a mutex, and we should certainly provide both.
>> You can't use a scoped_lock for this unless it has an unlock() member (which IMO defeats the point of it being scoped), since otherwise you end up serializing the "access x" part too.
> I don't think so:
> void f() { static mutex m; scoped_lock lk(m); static some_class x; m.unlock(); // access x }
Surely you mean lk.unlock()? Otherwise scoped_lock will try to unlock the mutex again. I did say "unless it has an unlock() member" above. How is this different from using straightforward lock() and unlock() functions on the mutex?
> You cannot defeat the purpose of scoped_lock! In this case it is conditionally unlocked when the scope is left. This in the first place is, why Peter Dimov came up with an example that proved me that you need be able to do locking directly on the mutex too.
Using scoped lock here when all you really want is lock and unlock just seems wrong to me. The lock isn't scoped, since there's an explicit unlock. I generally feel uneasy about scoped_lock::unlock, since it makes reasoning about code harder.

void f(mutex& m)
{
    scoped_lock lk(m);
    if(xyz())
    {
        lk.unlock();
    }
    // is lk locked or not?
}

> This also is why the scoped_lock isn't thread safe. In the destructor it has to check an unprotected variable that tests the current lock status.
I know that.
Anthony
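The "is lk locked or not?" question can be made concrete with std::unique_lock, which is the closest standard (C++11) counterpart to a scoped lock with an unlock() member. This is a sketch under that assumption rather than the Boost.Thread API under discussion; `f`, `m` and `release_early` are illustrative names.

```cpp
#include <mutex>

static std::mutex m;

// Returns whether the lock object still owned the mutex near the end of the scope.
bool f(bool release_early) {
    std::unique_lock<std::mutex> lk(m);
    // ... work that must be serialized ...
    if (release_early) {
        lk.unlock();   // unlock through the lock object, not the mutex
    }
    // Whether lk is locked here is only known at run time:
    bool held = lk.owns_lock();
    return held;       // the destructor unlocks only if the mutex is still owned
}
```

The ownership flag consulted by the destructor lives in the lock object itself, which is why such a lock object is meant to be used by one thread only.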

"Chris Thomasson" <cristom@comcast.net> writes:
> once() throw() : m_state(0), m_count(1) {}
> static atomic::once<foo> g_foo;
> void some_threads(...) {
>   for(;;) {
>     // whatever...
>     foo *myfoo = g_foo.load();
>     if (myfoo) {
>       // use myfoo...
>       // okay, we are finished with myfoo
>       g_foo.dec();
>     }
>   }
> }
> This should be a workable solution to the static initialization problem...
> What do you all think?
Your once<> class has a constructor, so it is dynamically initialized rather than statically, and there is a potential race condition in calling the constructor. Not only might multiple threads run the constructor, but some threads might get on to the load() call before the constructor is complete, because the constructor is running in another thread. Oh, and if the constructor is run multiple times, when is the destructor called, and how many times? This is particularly apparent in

void some_thread_func()
{
    static atomic::once<foo> g_foo;
    foo *myfoo = g_foo.load();
    if (myfoo)
    {
    }
}

since g_foo won't be initialized until the first call. The problem still exists for global objects, but is rarer, since it will only happen with threads that run before main(), or as a consequence of the general global-initialization-order problem, which is independent of threads.
Anthony
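As a point of comparison for later readers: the race Anthony describes for block-scope statics was real when this thread was written, but since C++11 the language requires that concurrent first calls wait for a single initialization of a block-scope static. The following sketch assumes a C++11 or later compiler; `foo`, `get_foo`, `count_constructions` and `g_ctor_calls` are hypothetical names.

```cpp
#include <atomic>
#include <thread>
#include <vector>

static std::atomic<int> g_ctor_calls(0);

struct foo {
    foo() { ++g_ctor_calls; }   // dynamic initialization: runs on first use
};

foo& get_foo() {
    static foo instance;        // C++11: concurrent first calls wait for one initializer
    return instance;
}

// Hits get_foo() from several threads and reports how often foo was constructed.
int count_constructions(int n_threads) {
    std::vector<std::thread> pool;
    for (int i = 0; i < n_threads; ++i)
        pool.emplace_back([] { (void)get_foo(); });
    for (auto& t : pool) t.join();
    return g_ctor_calls.load();
}
```

This guarantee covers construction only; the order of destruction at shutdown remains the programmer's concern.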

[...]
> your once<> class has a constructor, so it is therefore dynamically initialized, rather than statically, so there is a potential race condition in calling the constructor.
Here is one without a ctor:

template<typename T>
class once {
  var<T*> m_state;
  var<intword_t> m_count;

public:
  // once() throw() : m_state(0), m_count(0) {}
  // m_state and m_count must be init to NULL!

  ~once() throw() {
    if (try_dec()) {
      try { delete m_state.load(mb::depends); }
      catch(...) { assert(false); throw; }
    }
  }

private:
  bool try_dec() throw() {
    intword_t local;
    do {
      local = m_count.load(mb::naked);
      if (local < 0) { return false; }
      // use mb::fence to cover *both* acquire and release
      // wrt the result of the decrement
    } while(! m_count.cas(local, local - 1, mb::fence));
    // true when this decrement moved the count from 0 to -1
    return (local == 0);
  }

  bool try_inc() throw() {
    intword_t local;
    do {
      local = m_count.load(mb::naked);
      if (local < 0) { return false; }
    } while(! m_count.cas(local, local + 1, mb::acquire_depends));
    return true;
  }

public:
  // atomic load / thread-safety: strong
  T* load() {
    T *local = m_state.load(mb::depends);
    if (! local) {
      externlib_api::hashed_mutex::guard_t const &lock(this);
      if (! try_inc()) { return 0; }
      local = m_state.load(mb::naked);
      if (! local) {
        // call try_dec on exceptions...
        local = new T;
        m_state.store(local, mb::release);
      }
    } else if(! try_inc()) { return 0; }
    return local;
  }

  // dec the refcount
  void dec() {
    if (try_dec()) { delete m_state.load(mb::depends); }
  }
};

The destructor runs if m_state is not null and if the refcount drops to -1. I can't really see a problem here...
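The try_inc/try_dec pair above is a compare-and-swap loop over a counter whose negative values mean "dead". A minimal sketch of that loop with std::atomic follows. It is an illustration, not the author's externlib_api: the struct name `refcount` is hypothetical, and mapping mb::acquire_depends to `memory_order_acquire` and mb::fence to `memory_order_acq_rel` is an assumption.

```cpp
#include <atomic>

// Count starts at zero; a negative value means "no new references allowed".
struct refcount {
    std::atomic<long> value{0};

    // Increment unless the count is already negative.
    bool try_inc() {
        long cur = value.load(std::memory_order_relaxed);
        do {
            if (cur < 0) return false;
            // On failure compare_exchange_weak reloads cur, so the check repeats.
        } while (!value.compare_exchange_weak(cur, cur + 1,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed));
        return true;
    }

    // Decrement unless negative; true only for the call that moves 0 to -1.
    bool try_dec() {
        long cur = value.load(std::memory_order_relaxed);
        do {
            if (cur < 0) return false;
        } while (!value.compare_exchange_weak(cur, cur - 1,
                                              std::memory_order_acq_rel,
                                              std::memory_order_relaxed));
        return cur == 0;
    }
};
```

After a successful exchange `cur` still holds the old value, which is why the final test compares against 0 rather than -1.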

Chris Thomasson wrote:
> Here is one without a ctor:
It still has a dtor. What Anthony already said applies to the dtor too. You will end up needing to drop the class and use a function with a flag, I suspect.
Btw.: The usage scenario you were showing earlier is not quite what we had in mind. We were speaking about local statics. In your example usage, you have moved the static part to global namespace. There are far fewer problems there. The problems get nasty when you need local statics, as in
void foo() { static bar mybar; }
You really might need them, because the static possibly depends on a template parameter, which makes it impossible to move it to global namespace. So you will need to be able to declare the once class inside the function body scope.
Back to ctor/dtor. Once you have found out that ctors and dtors are bad, what is left? Well, several options:
1) a once function and a flag (instead of a class)
2) a statically initializable mutex
1) and 2) are roughly equivalent in that one can be built out of the other.
1) has been used in pthreads (before a static mutex type was available). Boost.Thread also uses this, mainly in implementation code.
2) I have tried to find out if it is possible to implement a mutex that is statically initializable with the additional constraint that zero initialization suffices. I have provided a prototype that leads me to believe it really can be done. (Such a mutex has the additional benefit that you need no explicit memory management.)
Roland
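Option 1) can be sketched with a flag that needs no dynamic initialization plus a free function. This is only an illustration using std::atomic (C++11), not pthread_once or the Boost.Thread implementation; `once_flag_t`, `run_once`, `get_value`, `g_flag` and `g_value` are hypothetical names, and the spin-wait stands in for a proper blocking wait.

```cpp
#include <atomic>
#include <thread>

// Flag with a constant initializer: no constructor code runs at startup.
struct once_flag_t {
    std::atomic<int> state{0};  // 0 = idle, 1 = running, 2 = done
};

// Runs fn once per flag; other callers wait until it has finished.
template <typename Fn>
void run_once(once_flag_t& flag, Fn fn) {
    if (flag.state.load(std::memory_order_acquire) == 2) return;  // fast path
    int expected = 0;
    if (flag.state.compare_exchange_strong(expected, 1,
                                           std::memory_order_acquire)) {
        fn();  // assumed not to throw in this sketch
        flag.state.store(2, std::memory_order_release);
    } else {
        while (flag.state.load(std::memory_order_acquire) != 2)
            std::this_thread::yield();  // crude wait; a real version would block
    }
}

static once_flag_t g_flag;
static int g_value = 0;

int get_value() {
    run_once(g_flag, [] { g_value = 7; });
    return g_value;
}
```

Because the flag has no destructor and no dynamic initializer, it can also be declared as a local static inside a function template, which is the case Roland is concerned with.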

Roland Schwarz <roland.schwarz@chello.at> writes:
> 2) I have tried to find out if it is possible to implement a mutex that is statically initializable with the additional constraint that zero initialization suffices. I have provided a prototype that let me believe it really can be done. (Such a mutex has the additional benefit that you need no explicit memory management.)
boost/thread/win32/basic_timed_mutex.hpp on the thread-rewrite branch provides a win32-specific statically-initializable-with-zero mutex (that I use as the basis for boost::mutex on win32). The thing is, it does require initialization when used with non-static storage duration. If the constexpr proposal gets accepted into the C++ Standard, we won't have to worry about this in the future --- constructors that just use constant expressions can be tagged "constexpr", and used for static initialization.
Anthony

Anthony Williams wrote:
> If the constexpr proposal gets accepted into the C++ Standard, we won't have to worry about this in the future --- constructors that just use constant expressions can be tagged "constexpr", and used for static initialization.
What does this mean? Can you please illustrate it with an example?
Thank you,
Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
> Anthony Williams wrote:
>> If the constexpr proposal gets accepted into the C++ Standard, we won't have to worry about this in the future --- constructors that just use constant expressions can be tagged "constexpr", and used for static initialization.
> What does this mean, can you please illustrate it with an example?

struct A
{
    constexpr A(int i_): i(i_),j(0) {}
    int i;
    int j;
};

static A a(6); // static initialization, same as {6,0}, but with a constructor.

Anthony

Anthony Williams wrote:
> struct A { constexpr A(int i_): i(i_),j(0) {}
> int i; int j; };
> static A a(6); // static initialization, same as {6,0}, but with a constructor.
And such a struct shall be an aggregate? What if there is something in the body of the ctor?
Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
> Anthony Williams wrote:
>> struct A { constexpr A(int i_): i(i_),j(0) {}
>> int i; int j; };
>> static A a(6); // static initialization, same as {6,0}, but with a constructor.
> And such a struct shall be aggregate?
No.
> What if there is something in the body of the ctor?
That's not allowed. The members can be initialized with "constant expressions" in the member init list. Constant expressions are generalized to include calls to functions which are themselves marked constexpr.
N1980 is the latest public version of the proposal.
http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2006/n1980.pdf
There is a revised version on the committee wiki, which hopefully will be in the post-Portland mailing.
Anthony
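For reference, the feature was later adopted in C++11 much as described here. The sketch below assumes a C++11 or later compiler and extends Anthony's struct A only for illustration; the helper `twice` and the objects `b` and `c` are hypothetical additions.

```cpp
struct A {
    constexpr A(int i_) : i(i_), j(0) {}  // empty body, members set in the init list
    int i;
    int j;
};

// A constexpr function may appear in a constant expression.
constexpr int twice(int x) { return 2 * x; }

static A a(6);         // constant-initialized: no constructor code runs at startup
static A b(twice(3));  // the initializer calls a constexpr function

// A constexpr object can be checked by the compiler itself.
constexpr A c(6);
static_assert(c.i == 6 && c.j == 0, "A(6) is evaluated at compile time");
```

Note that `a` and `b` are ordinary mutable objects at run time; only their initialization happens statically.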

Anthony Williams wrote:
> Roland Schwarz <roland.schwarz@chello.at> writes:
>> And such a struct shall be aggregate?
> No.
I don't understand:
8.5.1/1 An aggregate is an array or class with no user-declared constructors, no private or protected non-static data members, no base classes, and no virtual functions.
Our example class would fall outside this definition? I can't imagine why it should. To me the concept simply appears to be an extension of aggregate initializers. Isn't it?
> N1980 is the latest public version of the proposal.
> http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2006/n1980.pdf
Thank you.
Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
> Anthony Williams wrote:
>> Roland Schwarz <roland.schwarz@chello.at> writes:
>>> And such a struct shall be aggregate?
>> No.
> Don't understand:
> 8.5.1/1 An aggregate is an array or class with no user-declared constructors, no private or protected non-static data members, no base classes, and no virtual functions.
> Our example class would fall out of this definition? Can't imagine why it should. To me the concept simply appears as an extension to aggregate initializers. Isn't it?
The class has a user-declared constructor, so it isn't an aggregate.
Anthony
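The distinction can be checked mechanically with the std::is_aggregate trait, which requires C++17 and is therefore an assumption well beyond the compilers of this thread. `P` and `B` are hypothetical names; `B` mirrors the struct A from the example.

```cpp
#include <type_traits>

// No constructors at all: an aggregate, initialized with a brace list.
struct P {
    int i;
    int j;
};

// Same members plus a user-declared constexpr constructor: not an aggregate.
struct B {
    constexpr B(int i_) : i(i_), j(0) {}
    int i;
    int j;
};

static_assert(std::is_aggregate<P>::value, "P is an aggregate");
static_assert(!std::is_aggregate<B>::value, "a user-declared ctor rules B out");

static P p = {6, 0};  // aggregate initialization
static B b(6);        // constructor call, still constant-initialized
```

Both objects end up statically initialized with the same values; what differs is the language rule that gets them there.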

Anthony Williams wrote:
> The class has a user-declared constructor, so it isn't an aggregate.
But a constructor in the sense of an initializer list, i.e. something that can be converted to a POD and moved in at static init time. This behavior is what makes it different from a normal class. But perhaps they will invent a new name....
Roland

"Roland Schwarz" <roland.schwarz@chello.at> wrote in message news:454A9E56.8050901@chello.at...
> Chris Thomasson wrote:
>> Here is one without a ctor:
> It still has a dtor. Same as Anthony already said applies to dtor too.
> You will end up needing to drop the class and use a function with a flag I suspect.
Yup. I think so...
> Btw.: The usage scenario you were showing earlier is not quite what we had in mind. We were speaking about local statics.
Ahhh... Okay, I was thinking global.
> In your example usage, you have moved the static part to global namespace. There are far fewer problems.
Indeed! :^)
> The problems are getting nasty when needing local statics as in
> void foo() { static bar mybar; }
> You really might need them, because the static possibly depends on a template parameter, which makes it impossible to move it to global namespace.
> So you will need to be able to declare the once class inside the function body scope.
I see.
> Back to ctor/dtor. Once you have found out that ctors and dtors are bad, what is left? Well several options:
> 1) a once function and a flag (instead of a class) 2) a statically initializable mutex
> 1) and 2) are roughly equivalent in that one can be built out of the other.
Agreed. However, one can be lock-free after the initialization takes place... IMHO, this is a fairly important aspect of this type of algorithm. I am going to give my pseudo-code one more shot at the end of the post in the form of compilable code that uses my AppCore library. This time, no class, just a single C function. I urge you to bear with me, and try to take a quick look at it... Thank you all for your time and patience! :^)
> 1) has been used in pthreads (before a static mutex type was available) Boost.Thread also uses this also mainly in implementation code.
> 2) I have tried to find out if it is possible to implement a mutex that is statically initializable with the additional constraint that zero initialization suffices. I have provided a prototype that let me believe it really can be done. (Such a mutex has the additional benefit that you need no explicit memory management.)
You can do a spinlock for sure... How are you setting the mutex's waitset? Are you deferring waitset allocation until first point of contention (e.g., lazy mutex)?
Okay, here is a link to AppCore:
http://appcore.home.comcast.net/
And here is the once code; please tell me what you think of my technique:

once-experimental.cpp
---------------
#include <cassert>
#include <cstdio>
#include <appcore.h>

// User API Decl
namespace atomic {
  template<typename T> class once_ptr;
} // namespace atomic

// System API Decl
namespace atomic { namespace sys {
  typedef ac_intword_t refs_t;
  template<typename T> struct once_POD;
  template<typename T> struct once_def_POD;
  static bool once_inc(refs_t*) throw();
  static bool once_dec(refs_t*) throw();
  static void dbg_allocs_inc() throw();
  static void dbg_allocs_dec() throw();
}} // namespace atomic::sys

// System API Def
namespace atomic { namespace sys {

// Debug counter
static ac_intword_t dbg_allocs = 0;

// Inc the debug counter
void dbg_allocs_inc() throw() {
  ac_intword_t allocs = ac_atomic_inc_acquire(&dbg_allocs);
  std::printf("atomic::sys::dbg_allocs_inc - %i\n", allocs);
}

// Dec the debug counter
void dbg_allocs_dec() throw() {
  ac_intword_t allocs = ac_atomic_dec_acquire(&dbg_allocs);
  std::printf("atomic::sys::dbg_allocs_dec - %i\n", allocs);
}

// Inc the refcount if its >= 0.
// returns true if the refcount was inc'd,
// otherwise false
bool once_inc(refs_t *_this) throw() {
  refs_t local, cmp;
  do {
    local = ::ac_mb_load_naked(_this);
    if (local < 0) { return false; }
    cmp = local;
    local = ac_atomic_cas_acquire(_this, local, local + 1);
  } while(cmp != local);
  std::printf("atomic::sys::once_inc(); - %i\n", local + 1);
  return true;
}

// Dec the refcount if its >= 0.
// returns true for the last ref,
// otherwise false
bool once_dec(refs_t *_this) throw() {
  refs_t local, cmp;
  do {
    local = ::ac_mb_load_naked(_this);
    if (local < 0) { return false; }
    cmp = local;
    local = ac_atomic_cas_acquire(_this, local, local - 1);
  } while(cmp != local);
  std::printf("atomic::sys::once_dec(); - %i\n", local - 1);
  if (local == 1) {
    ac_mb_store_release(_this, 0);
    return true;
  } else if (! local) {
    return true;
  }
  return false;
}

// once is a POD
#define ATOMIC_ONCE_SYS_STATICINIT() {0, 0}

template<typename T>
struct once_POD {
  typedef T type_t;
  typedef once_def_POD<once_POD> define_POD_t;
  refs_t m_refs;
  type_t *m_state;
};

// define is a POD that holds a once POD
template<typename T>
struct once_def_POD {
  typedef typename T::type_t type_t;
  static T s_this;

  // Acquires a reference and calls ctor for first ref.
  // returns pointer to ref,
  // otherwise NULL
  type_t* acquire() {
    type_t *local = (type_t*)ac_mb_loadptr_depends(&s_this.m_state);
    if (! local) {
      // hashed_mutex::guard_t lock(&_this);
      local = (type_t*)ac_mb_loadptr_depends(&s_this.m_state);
      if (! once_inc(&s_this.m_refs)) { return 0; }
      if (! local) {
        try { local = new type_t; }
        catch(...) { (void)once_dec(&s_this.m_refs); throw; }
        dbg_allocs_inc();
        std::printf("\n(%p)once_def_POD::acquire(); - %p\n", (void*)this, (void*)local);
        ac_mb_storeptr_release(&s_this.m_state, local);
      }
    } else if(! once_inc(&s_this.m_refs)) { return 0; }
    return local;
  }

  // Releases a reference and calls dtor if last ref.
  // returns nothing
  void release() {
    type_t *local = (type_t*)ac_mb_loadptr_depends(&s_this.m_state);
    if (once_dec(&s_this.m_refs)) {
      if (local) {
        ac_atomic_casptr_acquire(&s_this.m_state, local, 0);
        delete local;
        std::printf("(%p)once_def_POD::release(); - %p\n\n", (void*)this, (void*)local);
        dbg_allocs_dec();
      }
    }
  }
};

// static POD init
template<typename T>
T once_def_POD<T>::s_this = ATOMIC_ONCE_SYS_STATICINIT();

}} // namespace atomic::sys

// User API Def
namespace atomic {

// Holds an acquired reference.
template<typename T>
class once_ptr {
public:
  typedef typename sys::once_POD<T>::define_POD_t define_POD_t;

private:
  define_POD_t *m_once;
  T *m_state;

private:
  T* acquire() { return (m_once) ? m_once->acquire() : 0; }
  void release() {
    if (m_once) { assert(m_state); m_once->release(); }
  }

public:
  once_ptr() throw() : m_once(0), m_state(0) {}
  once_ptr(define_POD_t &_once) : m_once(&_once), m_state(_once.acquire()) {}
  ~once_ptr() { release(); }

public:
  once_ptr(once_ptr const &rhs) : m_once(rhs.m_once), m_state(rhs.acquire()) {}

  once_ptr const& operator =(once_ptr &rhs) {
    define_POD_t *old = m_once;
    if (old != rhs.m_once) {
      define_POD_t *_once = rhs.m_once;
      T *_state = rhs.acquire();
      release();
      m_once = _once;
      m_state = _state;
    }
    return *this;
  }

  once_ptr const& operator =(define_POD_t &rhs) {
    define_POD_t *old = m_once;
    if (old != &rhs) {
      define_POD_t *_once = &rhs;
      T *_state = rhs.acquire();
      release();
      m_once = _once;
      m_state = _state;
    }
    return *this;
  }

public:
  T* load() const throw() { return m_state; }

public:
  T* operator ->() { return load(); }
  T& operator *() { return *load(); }

public:
  operator bool() throw() { return (m_state != 0); }
  bool operator !() throw() { return (m_state == 0); }
};

} // namespace atomic

// Here is sample usage:
struct foo1 {
  typedef atomic::once_ptr<foo1> once_ptr_t;
  void whatever() { std::printf("(%p)foo1::whatever();\n", (void*)this); }
  foo1() { std::printf("(%p)foo1::foo1();\n", (void*)this); }
  ~foo1() { std::printf("(%p)foo1::~foo1();\n", (void*)this); }
};

static void funca_for_multiple_threads() {
  static foo1::once_ptr_t::define_POD_t s_foo;
  foo1::once_ptr_t myfoo1(s_foo);
  if (myfoo1) { myfoo1->whatever(); }
}

struct foo2 {
  typedef atomic::once_ptr<foo1> once_ptr_t;
  void whatever() {
    std::printf("(%p)foo2::whatever();\n", (void*)this);
    funca_for_multiple_threads();
  }
  foo2() {
    static foo1::once_ptr_t::define_POD_t s_foo;
    foo1::once_ptr_t myfoo1(s_foo);
    if (myfoo1) { myfoo1->whatever(); }
    std::printf("(%p)foo2::foo2();\n", (void*)this);
  }
  ~foo2() { std::printf("(%p)foo2::~foo2();\n", (void*)this); }
};

struct foo3 {
  typedef atomic::once_ptr<foo3> once_ptr_t;
  static foo1::once_ptr_t::define_POD_t s_foo1;
  void whatever() {
    std::printf("(%p)foo3::whatever();\n", (void*)this);
    foo1::once_ptr_t myfoo1(s_foo1);
    if (myfoo1) { myfoo1->whatever(); }
    funca_for_multiple_threads();
  }
  foo3() {
    static foo2::once_ptr_t::define_POD_t s_foo2;
    foo2::once_ptr_t myfoo2(s_foo2), myfoo2a;
    if (myfoo2) {
      myfoo2->whatever();
      foo1::once_ptr_t myfoo1(s_foo1);
      myfoo2a = myfoo2;
      if (myfoo1) { myfoo1->whatever(); }
      myfoo2a->whatever();
    }
    std::printf("(%p)foo3::foo3();\n", (void*)this);
  }
  ~foo3() {
    whatever();
    std::printf("(%p)foo3::~foo3();\n", (void*)this);
  }
};

foo1::once_ptr_t::define_POD_t foo3::s_foo1;

static void funcb_for_multiple_threads() {
  funca_for_multiple_threads();
  static foo2::once_ptr_t::define_POD_t s_myfoo2a;
  funca_for_multiple_threads();
  foo2::once_ptr_t smyfooa(s_myfoo2a), smyfooaa;
  if (smyfooa) { smyfooa->whatever(); }
  {
    funca_for_multiple_threads();
    static foo2::once_ptr_t::define_POD_t s_myfoo2b;
    foo1 myfoo1;
    foo2::once_ptr_t smyfooa(s_myfoo2a);
    if (smyfooa) {
      smyfooaa = smyfooa;
      smyfooaa->whatever();
    }
    foo2 myfoo2;
    {
      myfoo1.whatever();
      foo2::once_ptr_t smyfoo2(s_myfoo2a);
      if (smyfoo2) { smyfoo2->whatever(); }
      myfoo2.whatever();
      funca_for_multiple_threads();
      {
        foo3 myfoo3;
        myfoo3.whatever();
        foo2 myfoo2;
        myfoo2.whatever();
      }
      myfoo2.whatever();
    }
    foo3 myfoo3;
    foo2::once_ptr_t smyfoo2(s_myfoo2b);
    if (smyfoo2) {
      smyfoo2->whatever();
      funca_for_multiple_threads();
      myfoo3.whatever();
    }
    funca_for_multiple_threads();
  }
  if (smyfooaa) { smyfooaa->whatever(); }
  funca_for_multiple_threads();
}

int main(int argc, char *argv[]) {
  funca_for_multiple_threads();
  funcb_for_multiple_threads();
  funca_for_multiple_threads();
  funcb_for_multiple_threads();
  return 0;
}

[...]
> This time, no class, just a single C function.
Strike that; it was supposed to read: This time there is no ctor or dtor, just two POD types and a smart pointer. Sorry for any confusion! ;^(... Anyway, I think I created a correct atomically thread-safe static initialization algorithm this time... :^)

I found and killed a race-condition; the following three functions need to be changed:

atomic::sys::once_dec(...)
template<typename T> atomic::sys::once_def_POD<T>::acquire()
template<typename T> atomic::sys::once_def_POD<T>::release()

Here is the fix:

// System API Def
namespace atomic { namespace sys {

[...]

// Dec the refcount if its >= 0.
// returns true for the last ref,
// otherwise false
bool once_dec(refs_t *_this) throw() {
  refs_t local, cmp, xchg;
  do {
    local = ::ac_mb_load_naked(_this);
    if (local < 0) { return false; }
    xchg = (local == 1) ? -1 : local - 1;
    cmp = local;
    local = ac_atomic_cas_release(_this, local, xchg);
  } while(cmp != local);
  std::printf("atomic::sys::once_dec(); - %i\n", local - 1);
  return (local == 1);
}

[...]

// define is a POD that holds a once POD
template<typename T>
struct once_def_POD {
  [...]

  // Acquires a reference and calls ctor for first ref.
  // returns pointer to ref,
  // otherwise NULL
  type_t* acquire() {
    type_t *local = 0, *cmp;
    do {
      if (local) { ac_atomic_dec_release(&s_this.m_refs); }
      local = (type_t*)ac_mb_loadptr_depends(&s_this.m_state);
      if (! local) {
        // hashed_mutex::guard_t lock(&_this);
        local = (type_t*)ac_mb_loadptr_depends(&s_this.m_state);
        if (! once_inc(&s_this.m_refs)) { return 0; }
        if (! local) {
          try { local = new type_t; }
          catch(...) { (void)once_dec(&s_this.m_refs); throw; }
          dbg_allocs_inc();
          std::printf("\n(%p)once_def_POD::acquire(); - %p\n", (void*)this, (void*)local);
          ac_mb_storeptr_release(&s_this.m_state, local);
          return local;
        }
      } else if(! once_inc(&s_this.m_refs)) { return 0; }
      cmp = (type_t*)ac_mb_loadptr_depends(&s_this.m_state);
    } while(local != cmp);
    return local;
  }

  // Releases a reference and calls dtor if last ref.
  // returns nothing
  void release() {
    type_t *local = (type_t*)ac_mb_loadptr_depends(&s_this.m_state);
    if (once_dec(&s_this.m_refs)) {
      ac_mb_storeptr_fence(&s_this.m_state, 0);
      ac_mb_store_fence(&s_this.m_refs, 0);
      if (local) {
        delete local;
        std::printf("(%p)once_def_POD::release(); - %p\n\n", (void*)this, (void*)local);
        dbg_allocs_dec();
      }
    }
  }
};

[...]

}} // namespace atomic::sys

I attached the complete fixed code; once-experimental.cpp.

The race-condition allowed a thread to load a state pointer and inc a reference count to a static object that was possibly being dtor'd. I fixed this by zeroing the state pointer and invalidating the reference count when the count dropped to zero. I also had to make a thread double check the load of a state pointer after incrementing a valid reference count. Now I can't seem to find any other race-conditions after this... Can you? ;^) IMHO, this particular algorithm may be worth spending a little time on... Humm...

Okay... The last algorithm I posted in this thread actually works. It shows how to do atomically thread-safe, reference-counted static initialization of C++ objects which adhere to a thread-safety level of strong. I am almost finished creating a technical white paper which will give a fairly detailed description of every aspect of my algorithm, and the implementation details of a fully working prototype. After reading it, you should have no problem implementing and experimenting with my algorithm for yourself. The basic concept of my technique is, IMHO, very straightforward and fairly simple to understand:
-- Correct multi-threaded reference-counted static initialization of C++ objects can be realized through the interface of a smart pointer which wraps a low-level API that uses atomic operations to access the contents of a statically initialized POD.
The paper and library will be posted today or tomorrow. I think I have a good solution to this particular "C++ problem". IMHO, my static initialization algorithm works well with, and could be an asset to, the Boost Library. Therefore, I am really looking forward to, and will be very interested in, reading any comments/critiques/suggestions on my technique the Boost community may have.
Thank you all for your time,
Chris Thomasson
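The shape of that idea (a smart pointer over a statically initialized holder that creates the object for the first reference and destroys it with the last) can be shown with a deliberately simplified stand-in. The sketch below swaps the lock-free atomics for a plain std::mutex, so it is not the algorithm from this thread and says nothing about its correctness; `once_def`, `once_ptr`, `foo`, `s_foo` and `use_twice` are hypothetical names.

```cpp
#include <mutex>

// Bookkeeping for one lazily created T. With static storage duration it is
// constant-initialized on common implementations (std::mutex has a constexpr
// constructor), so no dynamic initializer has to run first.
template <typename T>
struct once_def {
    std::mutex lock;
    long refs = 0;
    T* state = nullptr;

    T* acquire() {
        std::lock_guard<std::mutex> guard(lock);
        if (refs == 0) state = new T;   // first reference constructs
        ++refs;
        return state;
    }
    void release() {
        std::lock_guard<std::mutex> guard(lock);
        if (--refs == 0) { delete state; state = nullptr; }  // last one destroys
    }
};

// RAII handle that holds one reference for its lifetime.
template <typename T>
class once_ptr {
    once_def<T>& def_;
    T* obj_;
public:
    explicit once_ptr(once_def<T>& d) : def_(d), obj_(d.acquire()) {}
    ~once_ptr() { def_.release(); }
    once_ptr(once_ptr const&) = delete;
    once_ptr& operator=(once_ptr const&) = delete;
    T* get() const { return obj_; }
};

// Payload that tracks how many instances are alive.
struct foo {
    static int live;
    foo() { ++live; }
    ~foo() { --live; }
};
int foo::live = 0;

static once_def<foo> s_foo;

// Two handles share one object; returns live count * 10 + 1 if they alias.
int use_twice() {
    once_ptr<foo> a(s_foo);
    once_ptr<foo> b(s_foo);
    return foo::live * 10 + (a.get() == b.get() ? 1 : 0);
}
```

Every acquire and release takes the mutex here, which is exactly the cost the posted algorithm tries to avoid once the object exists.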

Here is a very rough draft of a paper I am creating that describes how my static initialization algorithm is implemented. The document is certainly not complete; however, I thought that I should try to gather any feedback I can before I go ahead and move the paper out of the "rough draft" stage:
http://appcore.home.comcast.net/vzdoc/atomic/static-init/
I am looking forward to reading any comments you may have. Thank you.

"Chris Thomasson" <cristom@comcast.net> wrote in message news:eiu626$sgo$1@sea.gmane.org... [...]
[...] I have not forgotten about this paper. I got sidetracked with some other business that involves $... ;^) So, I will definitely let you know when the code that goes with the paper is complete. BTW, I posted the refcounting functions that are missing in the paper in this thread. Any comments on my technique?
participants (3)
- Anthony Williams
- Chris Thomasson
- Roland Schwarz