Re: [boost] Boost.Threads, N2178, N2184, et al

23 Mar 2007

Anthony Williams wrote:
...
"Peter Dimov" <pdimov@mmltd.net> writes:
...
An interlocked_read is stronger ('ordered') and more expensive than needed
on a hardware level, but is 'relaxed' on a compiler level under MSVC 7.1
(the optimizer moves code around it). It's 'ordered' for the compiler as
well under 8.0; the intrinsics have been changed to be compiler barriers as
well. InterlockedExchange is similar.
Have you got a reference for that? I would be interested to read
about the details; MSDN is sketchy.
The documentation for the intrinsics now states that they act as a compiler
barrier for 8.0.

http://msdn2.microsoft.com/en-us/library/1s26w950.aspx

The documentation that shipped with VC 7.1 did not, and in fact I have
observed the optimizer moving code across an interlocked intrinsic when I
developed the prototype of N2195.
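
To illustrate why it matters that code cannot move across the interlocked
operation, here is a minimal sketch of a spin lock. It is not the N2195
prototype and does not use the MSVC intrinsics: it assumes the later
standard std::atomic interface, with exchange() standing in for
InterlockedExchange, and the names spin_lock, g_counter and
increment_counter are hypothetical.

```cpp
#include <atomic>

// Hypothetical spin lock. std::atomic<long>::exchange plays the role that
// InterlockedExchange plays in the discussion above.
class spin_lock {
    std::atomic<long> state_{0};  // 0 = free, 1 = held

public:
    void lock() {
        // Acquire ordering: accesses that follow lock() in program order
        // may not be moved above this exchange, by the compiler or the CPU.
        while (state_.exchange(1, std::memory_order_acquire) != 0) {
            // Spin until the previous owner stores 0.
        }
    }

    void unlock() {
        // Release ordering: accesses that precede unlock() in program
        // order may not be moved below this store.
        state_.store(0, std::memory_order_release);
    }
};

static spin_lock g_lock;    // hypothetical lock instance
static long g_counter = 0;  // hypothetical data protected by g_lock

long increment_counter() {
    g_lock.lock();
    long result = ++g_counter;  // must stay between lock() and unlock()
    g_lock.unlock();
    return result;
}
```

If the optimizer were free to hoist the increment above the exchange, or
sink it below the store, the lock would no longer protect g_counter. That is
the kind of reordering a compiler barrier is meant to rule out.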
...
...
A load_acquire can be implemented as a volatile read under 8.0, and a
volatile read followed by _ReadWriteBarrier under 7.1.
Why don't you need the barrier on 8.0? You need something there in
order to prevent the CPU from doing out-of-order reads (and stores),
even if the compiler won't reorder things. In fact, looking at the
assembly code generated, I believe you need more than a
_ReadWriteBarrier in both cases, as it seems to be purely a compiler
barrier, and not a CPU barrier.
On x86 all loads already have acquire semantics by default, and all stores
have release semantics. MSVC 8.0 extends a volatile load/store to have
acquire/release semantics (both hardware and compiler) on every platform,
including IA64.

http://msdn2.microsoft.com/en-us/library/12a04hfd.aspx
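
For illustration, the acquire/release pairing described above can be
sketched with the later standard std::atomic interface rather than MSVC
volatile. This is an assumption made for portability, not what the compilers
discussed here provided, and the names g_payload, g_ready, publish and
try_consume are hypothetical. On x86, mainstream compilers typically emit
plain loads and stores for these two operations, which matches the point
that the hardware already supplies the ordering and only the compiler has to
be constrained.

```cpp
#include <atomic>

static long g_payload = 0;                // hypothetical ordinary data
static std::atomic<bool> g_ready{false};  // hypothetical publication flag

// Writer side: store the payload, then publish it with a store-release.
// The payload write cannot be reordered after the flag store.
void publish(long value) {
    g_payload = value;
    g_ready.store(true, std::memory_order_release);
}

// Reader side: a load-acquire of the flag. If it observes true, the
// payload read that follows sees the value written before the release.
bool try_consume(long& out) {
    if (!g_ready.load(std::memory_order_acquire)) {
        return false;  // nothing published yet
    }
    out = g_payload;
    return true;
}
```

A writer thread would call publish() once, and a reader thread would poll
try_consume() until it returns true.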