
On Friday 21 October 2011 13:06:20 Tim Blechmann wrote:
shared memory support: the fallback implementation relies on the spinlock pool that is also used by the smart pointers. however, this pool is per-process, so the fallback implementation won't work in shared memory. can this be changed/fixed?
fixing this would require a per-variable lock... depending on the platform this can have enormous overheads.
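To make the problem concrete, here is a minimal sketch of an address-hashed spinlock pool of the kind discussed above. It is not the actual Boost code: `kPoolSize`, `g_pool`, `lock_index`, `scoped_pool_lock` and `emulated_fetch_add` are all hypothetical names chosen for illustration. The point is that the pool lives in each process's own static storage, so two processes mapping the same segment would each spin on their own private flag and never exclude each other.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical pool size for this sketch.
constexpr std::size_t kPoolSize = 41;

// The pool has static storage duration: every process gets its OWN copy,
// even if the variables it protects live in a shared memory segment.
static std::atomic<bool> g_pool[kPoolSize];

// Map the address of the protected variable to one of the pool's locks.
inline std::size_t lock_index(const volatile void* addr) {
    return reinterpret_cast<std::uintptr_t>(addr) % kPoolSize;
}

// RAII guard that spins on the pool entry selected by the address.
class scoped_pool_lock {
    std::atomic<bool>& flag_;
public:
    explicit scoped_pool_lock(const volatile void* addr)
        : flag_(g_pool[lock_index(addr)]) {
        while (flag_.exchange(true, std::memory_order_acquire)) {
            // busy-wait until the previous holder releases the flag
        }
    }
    ~scoped_pool_lock() { flag_.store(false, std::memory_order_release); }
    scoped_pool_lock(const scoped_pool_lock&) = delete;
    scoped_pool_lock& operator=(const scoped_pool_lock&) = delete;
};

// Lock-based emulation of fetch_add for a type without native atomics.
template <typename T>
T emulated_fetch_add(T& value, T delta) {
    scoped_pool_lock guard(&value);
    T old = value;
    value = old + delta;
    return old;
}
```

Within one process this serializes all threads touching the same variable. Across processes it does not, and the same shared variable may even hash to different pool slots if the segment is mapped at different addresses.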
I would suggest using the compile-time macros BOOST_ATOMIC_*_LOCK_FREE to pick an alternate code path.
then we need some kind of interprocess-specific atomic ... maybe as part of boost.interprocess ... iac, maybe we should provide an implementation which somehow matches the behavior of c++11 compilers ...
well, if the atomics are truly atomic, then BOOST_ATOMIC_*_LOCK_FREE == 2, and I find a platform where you cannot use them safely between processes difficult to imagine (not that something like that could not exist). if they are not atomic, then you are going to hit a "fallback-via-locking" path, in which case you are almost certainly better off picking an interprocess communication mechanism that just uses locking directly
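The compile-time selection suggested above could look roughly like the following sketch. As an assumption, it uses the standard C++11 macro `ATOMIC_INT_LOCK_FREE` from `<atomic>` as a stand-in for `BOOST_ATOMIC_INT_LOCK_FREE`. The type `shared_counter` and the flag `kUsesLockFreePath` are hypothetical, and the locking branch embeds a simple per-object spin flag only to keep the example self-contained. A real interprocess design would more likely use a proper interprocess mutex.

```cpp
#include <atomic>
#include <cassert>

#if ATOMIC_INT_LOCK_FREE == 2
// int is always lock-free here: no hidden per-process lock is involved,
// so the object can plausibly be placed in shared memory.
struct shared_counter {
    std::atomic<int> value;
    shared_counter() : value(0) {}
    int increment() {
        return value.fetch_add(1, std::memory_order_acq_rel) + 1;
    }
};
constexpr bool kUsesLockFreePath = true;
#else
// Not (always) lock-free: use explicit locking instead of the library's
// fallback. The lock is stored next to the data, so it would live in the
// same shared segment as the counter.
struct shared_counter {
    std::atomic_flag lock;  // atomic_flag is guaranteed lock-free
    int value;
    shared_counter() : value(0) { lock.clear(); }
    int increment() {
        while (lock.test_and_set(std::memory_order_acquire)) {
            // spin until the lock is released
        }
        int result = ++value;
        lock.clear(std::memory_order_release);
        return result;
    }
};
constexpr bool kUsesLockFreePath = false;
#endif
```

Either branch exposes the same `increment()` interface, so calling code does not need to know which path was compiled in.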
atomic::is_lock_free(): is_lock_free is set to either `true' or `false'. however, in some cases there are alignment constraints (iirc, 64bit atomics on ia32/x86_64 require a 64bit alignment). afaict there are no precautions to take care of this, are there?
for x86_64 there is nothing to do, ABI requires 8 byte alignment already
there used to be an __align__(8) to cover ia32, but it got lost... I *think* the "lock" prefix will cover this case nevertheless (at a hefty performance cost, though...)
i see
but you certainly have a point that this alignment should be corrected, noted to be fixed
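One way to enforce the 8-byte alignment on ia32 is sketched below. It uses the C++11 `alignas` specifier rather than the compiler-specific attribute mentioned above, which is an assumption about what a fix might look like, not the fix that was actually applied. The names `aligned_u64`, `record` and `is_aligned_for_64bit_atomic` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Storage for a 64-bit atomic. On ia32 a plain uint64_t member may only be
// 4-byte aligned inside a struct, so the alignment is requested explicitly.
struct alignas(8) aligned_u64 {
    std::uint64_t value;
};

static_assert(alignof(aligned_u64) == 8, "storage must be 8-byte aligned");
static_assert(sizeof(aligned_u64) == 8, "no padding expected");

// Embedding the type after a smaller member must still yield an aligned offset.
struct record {
    char tag;
    aligned_u64 counter;
};

static_assert(offsetof(record, counter) % 8 == 0,
              "counter must start on an 8-byte boundary");

// Run-time check, e.g. for pointers into externally allocated memory.
inline bool is_aligned_for_64bit_atomic(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 8 == 0;
}
```

The static assertions catch a lost alignment specifier at compile time, which is the kind of regression described above.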
compile-time vs run-time dispatching: some instructions are not available on every CPU of a specific architecture. e.g. cmpxchg8b or cmpxchg16b are not available on all ia32/x86_64 cpus. i would appreciate if these instructions would not be used before performing a CPUID check, whether these instructions are really available (at least in a legacy mode)
the correct way to do that is to have different libraries for sub-architectures and have the runtime linker decide... this requires infrastructure not present in boost
it would be equally correct to have something like:
static bool has_cmpxchg16b = query_cpuid_for_cmpxchg16b();
if (has_cmpxchg16b) use_cmpxchg16b(); else use_fallback();
less bloat and probably only a minor performance hit ;)
problematic because the compiler must insert a lock to ensure thread-safe initialization of the "static bool" (thus it is by definition not "lock-free" any more)
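For illustration, one possible way around the guarded-initialization concern is to cache the CPUID result in a constant-initialized atomic instead of a dynamically initialized static. This is only a sketch under stated assumptions: it relies on `__get_cpuid` from the GCC/Clang `<cpuid.h>` header on x86 and reports "not available" everywhere else. The names `SKETCH_HAVE_CPUID`, `g_cx16_state`, `has_cmpxchg16b` and `pick_path` are hypothetical, while `query_cpuid_for_cmpxchg16b` is the name used in the pseudocode above. The probe is idempotent, so it is harmless if several threads race and run it more than once.

```cpp
#include <atomic>
#include <cassert>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#endif

// Ask the CPU whether CMPXCHG16B is supported (CPUID leaf 1, ECX bit 13).
inline bool query_cpuid_for_cmpxchg16b() {
#ifdef SKETCH_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) == 0) {
        return false;  // leaf 1 not supported
    }
    return (ecx & (1u << 13)) != 0;
#else
    return false;  // unknown platform: always take the fallback
#endif
}

// 0 = not probed yet, 1 = available, 2 = not available.
// Constant-initialized, so no compiler-generated init guard is needed.
static std::atomic<int> g_cx16_state(0);

inline bool has_cmpxchg16b() {
    int state = g_cx16_state.load(std::memory_order_relaxed);
    if (state == 0) {
        // Racing threads may both probe; they store the same answer.
        state = query_cpuid_for_cmpxchg16b() ? 1 : 2;
        g_cx16_state.store(state, std::memory_order_relaxed);
    }
    return state == 1;
}

// Run-time dispatch between the two code paths.
inline const char* pick_path() {
    return has_cmpxchg16b() ? "cmpxchg16b" : "fallback";
}
```

The cost on the hot path is one relaxed load and a branch. Whether that is acceptable compared to separate per-sub-architecture libraries is a judgment call the thread leaves open.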
cmpxchg16b: currently cmpxchg16b doesn't seem to be supported. this instruction is required for some lock-free data structures (e.g. there is a dequeue algorithm, that requires a pair of tagged pointers).
could do, but cmpxchg16b is dog-slow, the fallback path is going to be faster anyways
on average, but not in the worst case. for real-time systems it is not acceptable that the os preempts a real-time thread while it is holding a spinlock.
prio-inheriting mutexes are usually much faster than cmpxchg16b -- use these for hard real-time (changing the fallback path to use PI mutexes as well might even be something to consider)
that being said, I can put it in, but I don't think there is value in it
Best regards
Helge