[thread] A simply static initializeable mutex

Triggered by the question how to do static initialization of a mutex, I tried to write one that is. The most basic initialization that wil be performed by the compiler, is zero initialization of static (POD) objects. This kind of initialization will take place before any other initializations. Consequently the mutex is initialized by its mere declaration. However this does come at a price: 1) Since the mutex has no ctor, one must not forget to initialize it when using it as a member of some containing object. But this is the same as with other POD's. 2) Altough the mutex is copyable this does not make any sense. The identifying property of a mutex is its memory address. Trying to lock a copied mutex will get you in trouble. It is not possible to protect against such usage since we have no ctor that could be declared private. On the other side you can make a object that contains a mutex copyable by simply reinitializing it to zero, which is not possible with the current mutex, since copying is strictly prohibited. I appended to this mail my first attempt to implement such a mutex . The code is commented and a pingpong test is included as example. So far this is only for windows. I would be very interested to get feedback about this idea. Roland // Copyright (C) 2006 Roland Schwarz // // Distributed under the Boost Software License, Version 1.0. (See accompanying // file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) #ifndef MUTEX_HPP__ #define MUTEX_HPP__ class scoped_lock; class semaphore; typedef scoped_lock* volatile mutex; class scoped_lock { public: scoped_lock(mutex& m, bool initially_locked = true); ~scoped_lock(); void lock(); void unlock(); private: bool is_locked; mutex* pmutex; scoped_lock* next_lock; scoped_lock* pred_lock; semaphore* volatile next_flag; semaphore* volatile sema_flag; semaphore* sema; void set(semaphore* volatile* pflag); void wait_for(semaphore* volatile* pflag); }; #endif // Copyright (C) 2006 Roland Schwarz // // Distributed under the Boost Software License, Version 1.0. (See accompanying // file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) // This is an experimental implementation of a mutex // that is of type POD. I.e. typedef scoped_lock* volatile mutex; // An unowned mutex has value 0. Consequently there is no need // for dynamic initialization, which solves the static // initialization problem. // Furthermore it also is not necessary to e.g. // mutex mx = 0; because static objects are zero initialized // by default, before any other initializations take place. // As such, availability of this mutex type makes a once // function unnecessary. // // Memory and resource management is fully automatic. I.e. // no leaks under normal usage. (In the basic version no cleanup // in the atexit handler is required.) The implementation needs // availability of some atomic primitives and semaphores // from the operating system. // // The required data structures are all within a scoped_lock // object. Destruction of the lock also frees up the resources. // // Typical usage: // mutex mx; // void foo() // { // scoped_lock lk(mx); // ... // } // // How does it work: // The scoped_lock object contains a doubly linked list. // The mutex is pointing to the head of this list. When // the mutex is zero the list ist empty and the mutex // unowned. // Locking the mutex means putting a list entry to the // head of the list. This is done atomically. Concurrent // requests are queued up, and each put on a separate // semaphore. This scheme already could be used by itself // (see e.g mcs-locks in win32 pthreads), however it // imposes a FIFO scheduling behaviour. // This implementation instead uses this queue only during // the build up, and then hands over the waiting to a // shared semaphore, to let the system scheduler wake up // the threads. The doubly linked list makes it possible // to remove list items from the list in an arbitrary // order. // // Performance: // I did only a minimalistic "pingpong" test, which showed // comparable performance to the current boost::mutex. // I expect that the performance can be improved, if the // semaphore objects are not completeley destroyed after // temporary use, but instead put in a lock free container // for later reuse. This however would raise the requirement // to destroy them during atexit. // // Missing parts: // I havent tried yet if it is possible to also provide // try and timed lock variants. // Altough I tried to take care of the necessary memory // barriers for SMP, I am not convinced yet that I got it // straight in this first attempt. // Also I have not yet been able to prove the correctness of // the algorithm. #include "mutex.hpp" // The operating system dependent part #include <windows.h> #include <limits.h> #include <cassert> template<class T> inline T* atomic_swap_full(T* volatile * p, T* a) { return (T*)InterlockedExchangePointer(p,a); } template<class T> inline T* atomic_load_full(T* volatile *p) { return (T*)InterlockedExchangeAdd((LPLONG)p,0); } template<class T> inline bool atomic_bool_compare_and_swap_full(T* volatile *p, T* c, T* s) { return (c == (T*)InterlockedCompareExchange((LPLONG)p,(LONG)s,(LONG)c)); } template<class T> inline T* atomic_compare_and_swap_full(T* volatile *p, T* c, T* s) { return (T*)InterlockedCompareExchange((LPLONG)p,(LONG)s,(LONG)c); } class semaphore { public: // The open and close functions could be extended // to put the closed items onto a lock free // container instead. static semaphore* open(int count = 0) { return new semaphore(count); } static void close(semaphore* sema) { delete sema; } void post() { ReleaseSemaphore(sem, 1, NULL); } void wait() { WaitForSingleObject(sem, INFINITE); } private: semaphore(int count) { sem = CreateSemaphore(NULL, count, LONG_MAX, NULL); } ~semaphore() { CloseHandle(sem); } HANDLE sem; }; // The following should be operating system independent scoped_lock::scoped_lock(mutex& m, bool initially_locked) : is_locked(false) { pmutex = &m; if (initially_locked) lock(); } scoped_lock::~scoped_lock() { if (is_locked) unlock(); } void scoped_lock::lock() { assert(!is_locked); next_lock = 0; pred_lock = 0; sema_flag = 0; next_flag = 0; scoped_lock* pred = atomic_swap_full(pmutex, this); if (0 != pred) { pred_lock = pred; pred->next_lock = this; wait_for(&pred->sema_flag); // pass on the shared semaphore sema = pred->sema; set(&sema_flag); set(&pred->next_flag); sema->wait(); } else { // create the shared semaphore sema = semaphore::open(); set(&sema_flag); } is_locked = true; } void scoped_lock::unlock() { assert(is_locked); scoped_lock* n = atomic_load_full(&next_lock); if (0 == n) { if (0 != pred_lock) { pred_lock->next_lock = 0; pred_lock->next_flag = 0; } if (atomic_bool_compare_and_swap_full(pmutex,this,pred_lock)) { if (0 == pred_lock) { // We have no predecessor and are the last // to hold this semaphore. Sucessors wil get // a fresh semaphore for this mutex. // Release it. semaphore::close(sema); } else { // We have been removed from the head, but have // a predecessor to wake up. sema->post(); } is_locked = false; return; } wait_for(&next_flag); n = atomic_load_full(&next_lock); } else { wait_for(&next_flag); } if (0 != pred_lock) pred_lock->next_lock = n; n->pred_lock = pred_lock; sema->post(); is_locked = false; } void scoped_lock::set(semaphore* volatile* pflag) { // If unflagged the value is 0. // The special value -1 is getting used when set is called // before a wait takes place. Since this is an illegal memory // address be careful not to call set again without having // reset the flag in between! semaphore* f = atomic_compare_and_swap_full(pflag,(semaphore*)0,(semaphore*)-1); if (0 != f) { f->post(); } } void scoped_lock::wait_for(semaphore* volatile* pflag) { if (0 == atomic_load_full(pflag)) { semaphore* f = semaphore::open(); if (atomic_bool_compare_and_swap_full(pflag,(semaphore*)0,f)) { f->wait(); } semaphore::close(f); } } // Copyright (C) 2006 Roland Schwarz // // Distributed under the Boost Software License, Version 1.0. (See accompanying // file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) #include "mutex.hpp" #include <iostream> #include <ctime> #include <boost/thread/thread.hpp> #include <boost/thread/mutex.hpp> #include <boost/thread/condition.hpp> #include <boost/bind.hpp> // set to 0 if you want to test against boost::mutex #if 1 typedef mutex test_mutex; typedef scoped_lock test_lock; #else typedef boost::mutex test_mutex; typedef boost::mutex::scoped_lock test_lock; #endif test_mutex mtx0; test_mutex mtx1; test_mutex mtx2; test_mutex mtx3; boost::mutex smx; boost::condition cs; bool start = false; void pingpong(long int turns) { test_lock lk0(mtx0, false); test_lock lk1(mtx1, true); test_lock lk2(mtx2, false); test_lock lk3(mtx3, true); boost::mutex::scoped_lock lkstart(smx); start = true; cs.notify_one(); lkstart.unlock(); for (long int i=0; i<turns; ++i) { lk0.lock(); lk1.unlock(); lk2.lock(); lk3.unlock(); lk1.lock(); lk0.unlock(); lk3.lock(); lk2.unlock(); } } // this needed about 30 sec on my machine const long int N = 1000000; int main(int argc, char* argv[]) { // pingpong timing test test_lock lk0(mtx0, true); test_lock lk1(mtx1, false); test_lock lk2(mtx2, true); test_lock lk3(mtx3, false); boost::mutex::scoped_lock lkstart(smx); boost::thread thd(boost::bind(pingpong, N)); while (!start) cs.wait(lkstart); std::time_t t0 = std::time(0); for (long int i=0; i<N; ++i) { lk0.unlock(); lk1.lock(); lk2.unlock(); lk3.lock(); lk1.unlock(); lk0.lock(); lk3.unlock(); lk2.lock(); } std::time_t t1 = std::time(0); std::cout << "time: " << t1-t0 << std::endl; thd.join(); return 0; }

Sorry for answering to my own post, but ... I have another small idea to throw in. ;-) The real meaning of a mutex to me always was to protect something by mere knowledge of its identification. For interprocess this usually means use of a named mutex. I have always been under the wrong assumption that this poses a problem for the construction, in which thread comes first. I believed so because named mutex in boost was used to solve the reverse problem (constructing once function) by letting the operating system take care for the race. Since I've seen now that the once problem can be solved without need for a named mutex, I discovered that named mutex is an orthogonal concept. (Sorry if this sounds too trivial.) Perhaps this idea is also too simply to be useful, but I always were looking for a way to be able to say: class foo { ... } afoo; lock lk(&afoo); where &afoo is the process - unique equivalent of a name for a mutex. Unfortunately a mutex is an object that carries state, and foo isn't prepared to store this state. The problem is that this takes too much steps at a time. Bringing the mutex again into play to store the state one could say: mutex& m1 = named_mutex::open(&afoo); which will give back an unique mutex. At another place, only knowing afoo's address we could say: mutex& m2 = named_mutex::open(&afoo); and this will hand out an equivalent mutex. Normally you would need to create a process wide visible mutex to accomplish the same effect, altough knowing an unique identifier, as is the address, would be sufficient. I have attached an example implementation that stores the mutex objects in a global map. The example works with my earlier posted POD mutex, but can be easily modified to work with boost::mutex as well. Sorry if this idea sound too trivial, but I simply did not see it before. If anyone knows a similar implementation, I would be glad to hear about. Roland // Copyright (C) 2006 Roland Schwarz // // Distributed under the Boost Software License, Version 1.0. (See accompanying // file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) #ifndef NAMED_MUTEX_HPP__ #define NAMED_MUTEX_HPP__ #include "mutex.hpp" namespace named_mutex { mutex& open(const void* addr); void close(const void* addr); }; #endif // Copyright (C) 2006 Roland Schwarz // // Distributed under the Boost Software License, Version 1.0. (See accompanying // file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) #include "named_mutex.hpp" #include <map> #include <exception> namespace { mutex map_mtx; typedef std::map<const void*, std::pair<mutex*,int> > map_mutex_type; map_mutex_type map_mutex; }; namespace named_mutex { mutex& open(const void* addr) { scoped_lock lk(map_mtx); map_mutex_type::iterator i = map_mutex.find(addr); mutex* pm; if (i == map_mutex.end()) { pm = new mutex(0); map_mutex.insert(std::make_pair(addr, std::make_pair(pm,1))); } else { pm = i->second.first; ++(i->second.second); } return *pm; } void close(const void* addr) { scoped_lock lk(map_mtx); map_mutex_type::iterator i = map_mutex.find(addr); if (i != map_mutex.end()) { if (0 == --(i->second.second)) { delete i->second.first; map_mutex.erase(addr); } } } };

Roland Schwarz <roland.schwarz@chello.at> writes:
class foo { ... } afoo;
lock lk(&afoo);
where &afoo is the process - unique equivalent of a name for a mutex. Unfortunately a mutex is an object that carries state, and foo isn't prepared to store this state.
The problem is that this takes too much steps at a time. Bringing the mutex again into play to store the state one could say:
mutex& m1 = named_mutex::open(&afoo);
which will give back an unique mutex. At another place, only knowing afoo's address we could say:
mutex& m2 = named_mutex::open(&afoo);
and this will hand out an equivalent mutex. Normally you would need to create a process wide visible mutex to accomplish the same effect, altough knowing an unique identifier, as is the address, would be sufficient.
Windows named mutexes do give you exactly this functionality, though as they are kernel objects you don't get the "fast path" options of a roll-your-own mutex. If the name for your mutex includes the process ID and &afoo, then Windows will give you a distinct mutex for each distinct foo object. You need the process ID, so you don't get a clash with mutexes in other processes, since named mutexes are system-global. Anthony -- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk

Anthony Williams wrote:
Windows named mutexes do give you exactly this functionality, though as they are kernel objects you don't get the "fast path" options of a roll-your-own mutex.
If the name for your mutex includes the process ID and &afoo, then Windows will give you a distinct mutex for each distinct foo object. You need the process ID, so you don't get a clash with mutexes in other processes, since named mutexes are system-global.
Yes, of course. I already mentioned that the idea might sound all to trivial. A similar mutex also exists e.g. on linux. You are pointing at the similarities. But I want to show the differences. In my case I simply use the standard" process-local mutex, but wrapping it into a "name-generator". In this respect my "idea" is different from op-sys named mutices. op-sys mutices are always system global aren't they? (Given the name is unique.) Consequently my approach gives you a fast-pathed mutex if you need it, without loss of the "named" feature. Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
Anthony Williams wrote:
Windows named mutexes do give you exactly this functionality, though as they are kernel objects you don't get the "fast path" options of a roll-your-own mutex.
But I want to show the differences. In my case I simply use the standard" process-local mutex, but wrapping it into a "name-generator".
Consequently my approach gives you a fast-pathed mutex if you need it, without loss of the "named" feature.
Yes, agreed. In practical terms, the only problems with your scheme are: * Ensuring the that these "named" mutexes are correctly destroyed at process termination, if people don't "close" the mutex. Maybe not an issue. * Contention on the map. Ideally we don't want to have any contention for unrelated mutexes. Chris Thomasson has suggested using a global lock-free hashmap for this sort of thing. Anthony -- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk

"Anthony Williams" <anthony_w.geo@yahoo.com> wrote in message news:hcxmjaqw.fsf@yahoo.com...
Roland Schwarz <roland.schwarz@chello.at> writes:
[...]
* Contention on the map. Ideally we don't want to have any contention for unrelated mutexes. Chris Thomasson has suggested using a global lock-free hashmap for this sort of thing.
You have got to be careful when you make use of this technique... http://groups.google.com/group/comp.programming.threads/msg/56406d367ac85dcb You either have to know exactly what your are doing (e.g., only use hashmap the in your libraries system api), or by following another simple method: http://groups.google.com/group/comp.programming.threads/browse_frm/thread/e0... other than those caveats, using a global hashmap for this sort of thing usually works out fine.

Anthony Williams wrote:
In practical terms, the only problems with your scheme are:
* Ensuring the that these "named" mutexes are correctly destroyed at process termination, if people don't "close" the mutex. Maybe not an issue.
This is the same as forgetting a system opened object to be closed. Don't know if named system mutices will get theit refernce decremented when the process exists though.
* Contention on the map. Ideally we don't want to have any contention for unrelated mutexes. Chris Thomasson has suggested using a global lock-free hashmap for this sort of thing.
Yes, of course. I just wanted to present and discuss the interface. Altough the contention is possibly not too high anyways, since one can expect shared objects creation frequency much lower than access frequency. And in my proposed scheme contention only occurs during creation. This would be different if creating a lock in the following manner: class lock { public: lock(const void* addr) { myaddr = addr; pmutex = &named_mutex::open(addr); } ~lock() { named_mutex::close(myaddr); } .... private: mutex* pmutex; const void* myaddr; }; This scheme indeed would give high contention, and only could be realized by means of a very fast concurrent hash map. Roland

----Original Message---- From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Roland Schwarz Sent: 29 October 2006 15:45 To: Boost Subject: [boost] [thread] A simply static initializeable mutex
Triggered by the question how to do static initialization of a mutex, I tried to write one that is.
The most basic initialization that wil be performed by the compiler, is zero initialization of static (POD) objects. This kind of initialization will take place before any other initializations.
Consequently the mutex is initialized by its mere declaration. However this does come at a price: [snip careful list of caveats]
You may be able to do a little better. The standard says "The storage for objects with static storage duration (3.7.1) shall be zeroinitialized (8.5) before any other initialization takes place." What is more, most normal operating systems give programs pages of memory that are zero-initialized, so compilers would have to do work to avoid this. I think that means you can declare constructors and STILL have the object well behaved. -- Martin Bonner Martin.Bonner@Pitechnology.com Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB24 6WZ, ENGLAND Tel: +44 (0)1223 203894

Martin Bonner wrote:
You may be able to do a little better. The standard says "The storage for objects with static storage duration (3.7.1) shall be zeroinitialized (8.5) before any other initialization takes place." What is more, most normal operating systems give programs pages of memory that are zero-initialized, so compilers would have to do work to avoid this.
Sorry, I do not understand "so compilers would have to do work to avoid this". Could you please try to state it with different wording?
I think that means you can declare constructors and STILL have the object well behaved.
Do you mean: the object might be already zero initialized before the ctor runs? If yes: this is very dangerous to rely on! 1) "Might": the standard does not require it. 2) E.g.: MSVC initializes memory to "CDCDCDCDCDCD...." in debugging builds, so there is at least one prominent case where the assumption is false. Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
Martin Bonner wrote:
You may be able to do a little better. The standard says "The storage for objects with static storage duration (3.7.1) shall be zeroinitialized (8.5) before any other initialization takes place." What is more, most normal operating systems give programs pages of memory that are zero-initialized, so compilers would have to do work to avoid this.
I think that means you can declare constructors and STILL have the object well behaved.
Do you mean: the object might be already zero initialized before the ctor runs? If yes: this is very dangerous to rely on!
No. This is required. The memory for an object with static-storage-duration MUST be zero initialized prior to the constructor running.
1) "Might": the standard does not require it.
Wrong.
2) E.g.: MSVC initializes memory to "CDCDCDCDCDCD...." in debugging builds, so there is at least one prominent case where the assumption is false.
It does this for automatic and heap-allocated objects (which is allowed). If it did it for static objects, it would be non-conforming. Anthony -- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk

Hi, On 10/30/06, Anthony Williams <anthony_w.geo@yahoo.com> wrote:
Roland Schwarz <roland.schwarz@chello.at> writes: [snipped]
Do you mean: the object might be already zero initialized before the ctor runs? If yes: this is very dangerous to rely on!
No. This is required. The memory for an object with static-storage-duration MUST be zero initialized prior to the constructor running.
It dont even matter, it *must* not have a constructor. Or else there'll be a race condition on the compiler's flag and multiple constructor calls.
[snipped]
Anthony -- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk
-- Felipe Magno de Almeida

From: Felipe Magno de Almeida
On 10/30/06, Anthony Williams <anthony_w.geo@yahoo.com> wrote:
Roland Schwarz <roland.schwarz@chello.at> writes: [snipped]
Do you mean: the object might be already zero initialized before the ctor runs? If yes: this is very dangerous to rely on!
No. This is required. The memory for an object with static-storage-duration MUST be zero initialized prior to the constructor running.
It dont even matter, it *must* not have a constructor. Or else there'll be a race condition on the compiler's flag and multiple constructor calls.
Does this matter if all the constructor does is repeat the zero initialization? (Serious question. The answer is not obvious to me). Roland suggested a mutex which relied on the zero initialization of static PODs. He then listed a number of issues that maintaining POD status introduces (like we can't eliminate copy constructors). I pointed out that actually ALL static objects are zero initialized, so we could have a default constructor that does nothing or repeats the zero initialization (which means we could create the mutex other than as a object with other than static storage duration). -- Martin Bonner Martin.Bonner@Pitechnology.com Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB24 6WZ, ENGLAND Tel: +44 (0)1223 203894

"Martin Bonner" <martin.bonner@pitechnology.com> writes:
From: Felipe Magno de Almeida
On 10/30/06, Anthony Williams <anthony_w.geo@yahoo.com> wrote:
Roland Schwarz <roland.schwarz@chello.at> writes: [snipped]
Do you mean: the object might be already zero initialized before the ctor runs? If yes: this is very dangerous to rely on!
No. This is required. The memory for an object with static-storage-duration MUST be zero initialized prior to the constructor running.
It dont even matter, it *must* not have a constructor. Or else there'll be a race condition on the compiler's flag and multiple constructor calls.
Does this matter if all the constructor does is repeat the zero initialization? (Serious question. The answer is not obvious to me).
Roland suggested a mutex which relied on the zero initialization of static PODs. He then listed a number of issues that maintaining POD status introduces (like we can't eliminate copy constructors). I pointed out that actually ALL static objects are zero initialized, so we could have a default constructor that does nothing or repeats the zero initialization (which means we could create the mutex other than as a object with other than static storage duration).
There are two race conditions in effect on MSVC. One is the race to START the construction, and the other is the race between COMPLETING construction and USING the object. If the constructor does nothing or initializes the members to zero, then the consequence of the first race is irrelevant, except that if multiple threads run the constructor, then multiple destructor calls will be scheduled. The second race is more serious --- if using the object sets the members to anything other than their initial values, and then requires that they retain these values, then it is essential that the constructor is complete before any thread tries to use the object. Otherwise, the still-running constructor may overwrite the data structure, and reset it back to zero, and thus mess up the thread that tried to use the object. I posted some MSVC-specific code a short time ago that overcame these problems (see the thread on once init stuff). Anthony -- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk

Anthony Williams wrote:
I posted some MSVC-specific code a short time ago that overcame these problems (see the thread on once init stuff).
I remember those. You said that the solution however is highly compiler/platform dependent, didn't you? So if the mutex ctor cannot be hidden from the public user interface, how should we ever hope to be able to manage this in an compiler independent way? Any ideas? In a private discussion we had, I got the impression, that you were interested in the idea of a POD mutex. Do you still think this is an idea we should pursue? Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
Anthony Williams wrote:
I posted some MSVC-specific code a short time ago that overcame these problems (see the thread on once init stuff).
I remember those. You said that the solution however is highly compiler/platform dependent, didn't you?
Yes. I wouldn't want to commit to it working with anything other than MSVC. Of course, if someone can verify it with other compilers, that would be good.
So if the mutex ctor cannot be hidden from the public user interface, how should we ever hope to be able to manage this in an compiler independent way? Any ideas?
We could have a separate static mutex, which was POD and relied on zero-initialization, and a dynamic mutex, which wrapped the static mutex with a constructor.
In a private discussion we had, I got the impression, that you were interested in the idea of a POD mutex. Do you still think this is an idea we should pursue?
Yes. I think it is worth pursuing any and all avenues of finding a mutex which can safely be used with either static or dynamic storage duration. In discussions on the C++ Standards committee reflector, it seemed that people were leaning towards an "it just works" interface for std::mutex, which allows it to be declared with any storage duration, without an explicit initializer. This was predicated on the idea that the various library implementers each knew platform- and compiler- specific ways of doing this, so we should specify the most user-friendly interface. Therefore, I think that boost should try and provide this interface as far as we are able. POD mutexes is one way. Named mutexes is another. Anthony -- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk

Anthony Williams wrote:
I think it is worth pursuing any and all avenues of finding a mutex which can safely be used with either static or dynamic storage duration.
Did you found time to try out my POD mutex? I would be very interested if you could comment on it. This mutex can indeed be safely used with any storage duration. The fact that it has to be initialized when used as a member of an object is not anything more strange than the requirement to initialize an integer before first use.
This was predicated on the idea that the various library implementers each knew platform- and compiler- specific ways of doing this, so we should specify the most user-friendly interface.
Not sure what this exactly means, can you exemplify?
Therefore, I think that boost should try and provide this interface as far as we are able. POD mutexes is one way. Named mutexes is another.
Which kind of named are you talking about here? I do not believe opsys-named is the correct way to go. If you think the simplest interface is doing away with the mutices altogether I am with you. scoped_lock lk(&foo); should be enough. Implementing this efficiently however is another story. Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
Anthony Williams wrote:
I think it is worth pursuing any and all avenues of finding a mutex which can safely be used with either static or dynamic storage duration.
Did you found time to try out my POD mutex? I would be very interested if you could comment on it.
Not yet, sorry.
This was predicated on the idea that the various library implementers each knew platform- and compiler- specific ways of doing this, so we should specify the most user-friendly interface.
Not sure what this exactly means, can you exemplify?
std::mutex global; void f() { static std::mutex local_static; std::mutex automatic; std::mutex* dynamic=new std::mutex; } class X { std::mutex class_data_member; X() {} }; X global_as_member; void g() { static X local_static_as_member; X automatic_as_member; X* dynamic_as_member=new X; } in no case is an initializer required. In all cases it should "just work", which requires platform-specific code.
Therefore, I think that boost should try and provide this interface as far as we are able. POD mutexes is one way. Named mutexes is another.
Which kind of named are you talking about here? I do not believe opsys-named is the correct way to go.
opsys-named is just one way of implementing named mutexes. Your global map is another.
If you think the simplest interface is doing away with the mutices altogether I am with you.
scoped_lock lk(&foo); should be enough. Implementing this efficiently however is another story.
No, that's not what I meant. The proposal(s) before the Standards committee are for an explicit mutex type. By "simplest interface" I meant not requiring any explicit initialization under any circumstances. Anthony -- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk

Anthony Williams wrote:
Roland Schwarz <roland.schwarz@chello.at> writes:
Not sure what this exactly means, can you exemplify?
<examples of simple interface snipped> Ok, this was nearly obvious to me, I was asking about the "platform and compiler specific ways". I cannot however understand why one would insist on:
class X { std::mutex class_data_member;
X() {}
};
while still accepting the need to: class X { int data_member; X() : data_member(42) {} } What is wrong or "unsimple" about: clsss X { std::mutex class_data_member; X() : class_data_member(0) {} }
scoped_lock lk(&foo); should be enough. Implementing this efficiently however is another story.
No, that's not what I meant. The proposal(s) before the Standards committee are for an explicit mutex type. By "simplest interface" I meant not requiring any explicit initialization under any circumstances.
But scoped_lock lk(&my_foo); does "not require any explicit initialization under any circumstances". (With respect to mutex). This even beats the class_data_member case. You even can use it from inside global ctors. Btw.: This interface would also be backwards compatible to the "explicit mutex lovers": Simply define an pseudo mutex object of name mutex. Since it has an address it also will be usable as an argument to lock. We were always looking for sync primitives that really are C++ way. The most important new thing (with respect to C) is: scoped_lock ! This really is compelling. So why not take the full step and get rid of the "helper class" mutex too? The only really indispensable property of a mutex is its ID. Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
Anthony Williams wrote:
Roland Schwarz <roland.schwarz@chello.at> writes:
Not sure what this exactly means, can you exemplify?
<examples of simple interface snipped>
Ok, this was nearly obvious to me, I was asking about the "platform and compiler specific ways".
e.g. using named mutexes on Windows, using compiler attributes or pragmas to say "make this thread safe", using my MSVC-specific once-init trick, etc.
What is wrong or "unsimple" about:
clsss X { std::mutex class_data_member;
X() : class_data_member(0) {} }
IMO, the best state of affairs is where there is only one way of initializing something, which always works. Requiring explicit initialization only under some circumstances, provides an easy path for making errors.
scoped_lock lk(&foo); should be enough. Implementing this efficiently however is another story.
No, that's not what I meant. The proposal(s) before the Standards committee are for an explicit mutex type. By "simplest interface" I meant not requiring any explicit initialization under any circumstances.
But scoped_lock lk(&my_foo); does "not require any explicit initialization under any circumstances". (With respect to mutex). This even beats the class_data_member case. You even can use it from inside global ctors.
Btw.: This interface would also be backwards compatible to the "explicit mutex lovers": Simply define an pseudo mutex object of name mutex. Since it has an address it also will be usable as an argument to lock.
Yes. That's why I think it's an important avenue to examine --- if the mutex address is the only important aspect of the actual object, and all the state is held elsewhere, such an object can be used without initialization in all the cases I quoted.
We were always looking for sync primitives that really are C++ way. The most important new thing (with respect to C) is: scoped_lock ! This really is compelling. So why not take the full step and get rid of the "helper class" mutex too? The only really indispensable property of a mutex is its ID.
It helps to have an object you can refer to, and call member functions on. It's also closest to existing practice. Anthony -- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk

Anthony Williams wrote:
No. This is required. The memory for an object with static-storage-duration MUST be zero initialized prior to the constructor running.
1) "Might": the standard does not require it.
Wrong.
You are obviously misunderstanding me. Of course are static objects zero initialized. I had the impression that the original poster meant that this would be a general case for dynamic objects as well on some compilers. And this is waht I meant by "might". Did you get me now?
2) E.g.: MSVC initializes memory to "CDCDCDCDCDCD...." in debugging builds, so there is at least one prominent case where the assumption is false.
It does this for automatic and heap-allocated objects (which is allowed). If it did it for static objects, it would be non-conforming.
Same than before. But the problem is a different one: Even a static zero initialized ähm memory area, is not a object until its ctor has been run! And since my mutex proposal tries to implement a mutex that is valid before ctor runs have been completed, only POD's can be used. Does this make sense? Roland

Roland Schwarz <roland.schwarz@chello.at> writes:
Anthony Williams wrote: Even a static zero initialized ähm memory area, is not a object until its ctor has been run! And since my mutex proposal tries to implement a mutex that is valid before ctor runs have been completed, only POD's can be used.
Does this make sense?
Yes. Anthony -- Anthony Williams Software Developer Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk

From: Roland Schwarz
Martin Bonner wrote:
You may be able to do a little better. The standard says "The storage for objects with static storage duration (3.7.1) shall be zeroinitialized (8.5) before any other initialization takes place." What is more, most normal operating systems give programs pages of memory that are zero-initialized, so compilers would have to do work to avoid this.
Sorry, I do not understand "so compilers would have to do work to avoid this". Could you please try to state it with different wording?
It is a requirement of the standard that static objects are zero initialized before the constructor runs. Compilers sometimes fail to implement the standard for various reasons (backwards compatibility, bugs, etc). What I am saying is that the natural way of implementing static objects on most implementations will automatically meet this requirement of the standard - so it is fairly safe to rely on.
I think that means you can declare constructors and STILL have the object well behaved.
Do you mean: the object might be already zero initialized before the ctor runs? If yes: this is very dangerous to rely on!
1) "Might": the standard does not require it.
But the standard DOES require it for STATIC objects (which is what you were worrying about the order of initialization for).
2) E.g.: MSVC initializes memory to "CDCDCDCDCDCD...." in debugging builds, so there is at least one prominent case where the assumption is false.
It only initializes heap memory to CD - not "objects with static storage duration". -- Martin Bonner Martin.Bonner@Pitechnology.com Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB24 6WZ, ENGLAND Tel: +44 (0)1223 203894

Martin Bonner wrote:
It is a requirement of the standard that static objects are zero initialized before the constructor runs. Compilers sometimes fail to implement the standard for various reasons (backwards compatibility, bugs, etc). What I am saying is that the natural way of implementing static objects on most implementations will automatically meet this requirement of the standard - so it is fairly safe to rely on.
Ok. I think in the meantime I understood what you wanted to tell me. Sorry, that I didn't understand you on the first reading. However designing an object in this manner still sounds very odd to me. Indeed we have two different states of this so called object: 1) The object is kind of an aggregate (before the ctor has been run). Of course it may be accessed at this stage if properly designed, but it still is dangerous because a user might think it already _is_ an object. 2) The object got live (ctor has been run). Now if it happens, that the object gets onto the stack, the user must not forget to initialize it! She must not use the ctor in this case of course. And this looks very dangerous to me, since I normally would expect a ctor to properly initialize the object! (Why would one ever put a mutex on the stack? Obviously not directly, but this can happen all to easily when embedded in an object.) For this reason I believe letting the programmer know that the mutex is POD, is far more safe, since this is something common. You always need to specify initializations for POD's. Beeing not able to prevent the copy ctor I think is not such a big problem at all. One already is used to think about this when having pointers in the object. Perhaps it would even be better to make the pointer characteristic of mutex explicit to alert the user? Roland

"Roland Schwarz" <roland.schwarz@chello.at> wrote in message news:4544CC77.9060009@chello.at...
Triggered by the question how to do static initialization of a mutex, I tried to write one that is.
[...] FWIW, here is a fairly useful technique for augmenting some POSIX mutex implementations' with dynamic initialization and full POSIX safety characteristics: http://groups.google.com/group/comp.programming.threads/browse_frm/thread/f2... I thought you may be interested...

"Roland Schwarz" <roland.schwarz@chello.at> wrote in message news:4544CC77.9060009@chello.at...
Triggered by the question how to do static initialization of a mutex, I tried to write one that is.
Here is something you can try... pseudo-code ----------------- namespace atomic { template<typename T> class var; template<typename T> class once; namespace mb { enum naked_e { naked }; enum fence_e { fence }; enum release_e { release }; enum depends_e { depends }; } namespace externlib_api { class hashed_mutex; template<typename T> extern T load(T volatile*, mb::naked_e) const throw(); template<typename T> extern T load(T volatile*, mb::depends_e) const throw(); template<typename T> extern void store(T volatile*, T const&, mb::release_e) throw(); template<typename T> extern void store(T volatile*, T const&, mb::naked_e) throw(); } template<typename T> class var { mutable T volatile m_state; public: var() throw() {} var(T const &state) throw() : m_state(state) {} public: inline T load(mb::depends_e) const throw() { return externlib_api::load(&m_state, mb::depends); } inline T load(mb::naked_e) const throw() { return externlib_api::load(&m_state, mb::naked); } public: inline void store(T const &xchg, mb::fence_e) throw() { externlib_api::store(&m_state, xchg, mb::fence); } inline void store(T const &xchg, mb::naked_e) throw() { externlib_api::store(&m_state, xchg, mb::naked); } }; template<typename T> class once { var<T> m_state; public: once() throw() : m_state(0) {} ~once() throw() { T *local = m_state.load(mb::naked); if (local) { // assume this is statically out of scope try { delete local; } catch(...) { assert(false); throw; } } } public: // atomic load / thread-saftey: strong T* load() const { T *local = m_state.load(mb::depends); if (! local) { externlib_api::hashed_mutex::guard_t const &lock(this); local = m_state.load(mb::naked); if (! local) { local = new T; m_state.store(local, mb::release); } } return local; } }; } // usage... static atomic::once<foo> g_foo; void some_threads(...) { for(;;) { // whatever... foo *myfoo = g_foo.load(); } } Any thoughts?

Chris Thomasson wrote:
// usage... static atomic::once<foo> g_foo;
void some_threads(...) { for(;;) { // whatever...
foo *myfoo = g_foo.load(); } }
Any thoughts?
This looks similar in usage to me as my proposal of "thread safe initialization of local static and global objects" in http://lists.boost.org/Archives/boost/2006/10/111221.php In this discussion we came to the conclusion that the problem with this scheme is not the construction, but the destruction. When will the destructor of this object being run? The object is kind of split brain: On the one hand it is a dynamic object that lives on the heap, on the other the programmer thinks of the object as being static. This obviously affects live-time very differently. (In the discussion we recognized this because we were not able to tell when the atexit handler should destroy te object.) This is why I abandoned the idea, because using a statically initializeable mutex is a much more natural thing to comprehend for the programmer. void bar { static mutex; lock lk(mutex); static myclass m(42); lk.unlock(); } I glanced over your code snippet and saw that you have a dtor in once. Unfortunately the comment "assume this is statically out of scope" does sound very cryptic to me. Would it be possible for you to try to explain which problem exactly your code tries to solve, and how? Thank you Roland

"Roland Schwarz" <roland.schwarz@chello.at> wrote in message news:45463A2F.8030209@chello.at...
Chris Thomasson wrote:
[...]
In this discussion we came to the conclusion that the problem with this scheme is not the construction, but the destruction.
When will the destructor of this object being run?
Well, I guess you can do something like this: First of all, imagine I added a cas function to atomic::var... template<typename T> class once { // I forgot to make this a ptr in the pseudo-code! Argh... var<T*> m_state; // we can add a counter... var<intword_t> m_count; private: bool try_dec() throw() { intword_t local; do { local = m_count.load(mb::naked); if (! local || local < 1) { return false; } // use mb::fence to cover *both acquire and release // wrt the result of the decrement } while(! m_count.cas(local, local - 1, mb::fence)); return (local == 1); } bool try_inc() throw() { intword_t local; do { local = m_count.load(mb::naked); if (! local || local < 1) { return false; } } while(! m_count.cas(local, local + 1, mb::acquire)); return true; } public: once() throw() : m_state(0), m_count(1) {} ~once() throw() { if (try_dec()) { try { delete local; } catch(...) { assert(false); throw; } } } public: // atomic load / thread-saftey: strong T* load() const { T *local = m_state.load(mb::depends); if (! local) { externlib_api::hashed_mutex::guard_t const &lock(this); if (! try_inc()) { return 0; } local = m_state.load(mb::naked); if (! local) { // call try_dec on exceptions... local = new T; m_state.store(local, mb::release); } } else if(! try_inc()) { return 0; } return local; } // dec the refcount void dec() { if (try_dec()) { delete local; } } }; * : http://groups.google.com/group/comp.programming.threads/browse_thread/thread... What do you think?

here is sample usage now: static atomic::once<foo> g_foo; void some_threads(...) { for(;;) { // whatever... foo *myfoo = g_foo.load(); if (myfoo) { // use myfoo... // okay, we are finished with myfoo g_foo.dec(); } } }

the decrement predicates
bool try_dec() throw() { intword_t local; do { local = m_count.load(mb::naked);
if (! local || local < 1) { return false; }
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// use mb::fence to cover *both acquire and release // wrt the result of the decrement } while(! m_count.cas(local, local - 1, mb::fence)); return (local == 1); }
could all look like this: if (local < 1) { return false; } I made a redundant assertion...
participants (6)
-
Anthony Williams
-
Chris Thomasson
-
Daniel Wesslén
-
Felipe Magno de Almeida
-
Martin Bonner
-
Roland Schwarz