Re: [boost] [lockfree] Review

3 Aug 2011


On Wednesday 03 August 2011 09:39:21 Grund, Holger wrote:
...
...
...
Efficient loads & stores are a bit tricky in that SSE2 is not a
requirement for 32-bit Windows. Without it, I think we need to resort
to FILD/FISTP, which is a pain.
iirc, sse2 intrinsics are not guaranteed to be atomic, so sometimes
memory access has to be emulated via CAS.
All aligned 64-bit accesses are guaranteed to be atomic on x86. The same is
not true for 128-bit loads and stores on x64 (at least there are no
architectural guarantees -- I think most (all) Intel & AMD implementations
still performed them atomically in 2009)
I'm not really sure how you would implement a fully correct lock-free
atomic<int128_t> on x64. A cmpxchg16b requires the underlying page to be
writable.
if the page is not writable, then why would you need an atomic<int128_t> in 
the first place?

- if the data is unchanging, then it doesn't matter
- if the data is changing (through a writable mapping by someone else to the 
page), then you have some sort of producer-/consumer-problem and that is 
trivially solvable with word-sized atomic operations
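
One possible way to do this with word-sized operations only is a sequence
lock. The sketch below is an assumption about what such a solution could look
like, not something proposed in this thread: the type SeqLocked128 and its
members are hypothetical, it uses C++11 std::atomic as a stand-in, and it
supports a single writer only. Readers perform nothing but 64-bit loads, so
they also work through a read-only mapping.

```cpp
#include <atomic>
#include <cstdint>
#include <utility>

// Hypothetical type (not part of Boost): a 128-bit value published by ONE
// writer and read by any number of readers using only 64-bit atomics.
struct SeqLocked128 {
    std::atomic<std::uint64_t> seq{0};  // odd while a write is in progress
    std::atomic<std::uint64_t> lo{0};
    std::atomic<std::uint64_t> hi{0};

    // Single writer only: concurrent writers would need extra synchronization.
    void write(std::uint64_t new_lo, std::uint64_t new_hi) {
        std::uint64_t s = seq.load(std::memory_order_relaxed);
        seq.store(s + 1, std::memory_order_relaxed);           // mark "busy"
        std::atomic_thread_fence(std::memory_order_release);   // order before data
        lo.store(new_lo, std::memory_order_relaxed);
        hi.store(new_hi, std::memory_order_relaxed);
        seq.store(s + 2, std::memory_order_release);           // mark "stable"
    }

    // Readers only load; they retry if a write overlapped the read.
    std::pair<std::uint64_t, std::uint64_t> read() const {
        for (;;) {
            std::uint64_t s0 = seq.load(std::memory_order_acquire);
            std::uint64_t l = lo.load(std::memory_order_relaxed);
            std::uint64_t h = hi.load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_acquire);
            std::uint64_t s1 = seq.load(std::memory_order_relaxed);
            if (s0 == s1 && (s0 & 1) == 0) return {l, h};
        }
    }
};
```

Note that a reader can spin while the writer is stalled mid-update, so this
trades strict lock-freedom on the read side for never writing to the page.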

IMHO the same rationale holds for 64 bit atomics on 32 bit, so emulation via 
DCAS is acceptable -- since the lock prefix is needed anyway before 
cmpxchg8b/cmpxchg16b this should also deal with misalignment (even though 
this incurs a hefty performance penalty)
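
To make the emulation concrete, here is a minimal sketch of a 64-bit load and
store built from compare-and-swap alone. It assumes C++11 std::atomic as a
stand-in for the underlying lock cmpxchg8b; the helper names load_via_cas and
store_via_cas are made up for illustration and are not part of any library.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: read a 64-bit value using only compare-and-swap.
// A CAS whose desired value equals its expected value never changes what is
// stored, and on failure it writes the current value back into `expected`.
inline std::uint64_t load_via_cas(std::atomic<std::uint64_t>& a) {
    std::uint64_t expected = 0;
    // Either the value is 0 and we "swap" 0 for 0, or the CAS fails and
    // `expected` receives the current value. Both cases yield the value.
    a.compare_exchange_strong(expected, expected);
    return expected;
}

// Hypothetical helper: write a 64-bit value using only compare-and-swap.
inline void store_via_cas(std::atomic<std::uint64_t>& a, std::uint64_t desired) {
    std::uint64_t expected = 0;
    // Each failed attempt refreshes `expected`, so the loop ends as soon as
    // no other thread modifies the value between two attempts.
    while (!a.compare_exchange_weak(expected, desired)) {
    }
}
```

The load is implemented as a read-modify-write, which is why it needs a
writable page, as discussed above.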

Helge

Helge Bahmann