
"Helge Bahmann" <hcb@chaoticmind.net> wrote in message news:alpine.DEB.1.10.0912211548590.31425@m65s28.vlinux.de...
On Mon, 21 Dec 2009, Chris M. Thomasson wrote:
object = m_buffer[i].exchange(object, memory_order_release);
if (object) { atomic_thread_fence(memory_order_acquire); }
return object;
[...]
T* pop() {
    T* object = m_buffer[m_tail].exchange(NULL, memory_order_acquire);
    if (object) { m_tail = (m_tail == T_depth - 1) ? 0 : (m_tail + 1); }
    return object;
}
You generally do not need "memory_order_acquire" when dereferencing an "atomically published" pointer: "memory_order_consume" suffices to order every operation that is data-dependent on the pointer value (and any dereference obviously needs that value to compute the memory address to access).
This is faster by a fair amount on anything that is neither x86 nor Alpha.
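
[Editorial aside: a minimal sketch of the change Helge suggests, assuming a class shaped like the quoted snippet. The spsc_buffer name, the atomic<T*> array member, and the constructor are filled in here purely for illustration; only the ordering on the exchange differs from the pop() above.]

    #include <atomic>
    #include <cstddef>

    template<typename T, std::size_t T_depth>
    struct spsc_buffer {           // hypothetical wrapper, names taken from the quote
        std::atomic<T*> m_buffer[T_depth];
        std::size_t m_tail;

        spsc_buffer() : m_tail(0) {
            for (std::size_t i = 0; i < T_depth; ++i)
                m_buffer[i].store(NULL, std::memory_order_relaxed);
        }

        // Consumer-side pop: consume on the exchange orders every access that is
        // data-dependent on the returned pointer, which is all a dereference needs.
        T* pop() {
            T* object = m_buffer[m_tail].exchange(NULL, std::memory_order_consume);
            if (object) { m_tail = (m_tail == T_depth - 1) ? 0 : (m_tail + 1); }
            return object;
        }
    };

[The caveat is that consume only orders accesses reached through the returned pointer; anything the consumer reads that is not data-dependent on it still needs a stronger ordering.]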
Of course you are right. For some reason I was thinking that `memory_order_consume' would boil down to a MEMBAR #LoadLoad on SPARC. The name was confusing me. Perhaps it should be named `memory_order_depends' or something... BTW, where is `memory_order_produce'? ;^)

I don't think I can use C++0x memory ordering to achieve simple #LoadLoad and #StoreStore barriers without the #LoadStore constraint. Am I right?
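
[Editorial aside on the last question: the commonly cited mapping of the C++0x fences onto SPARC RMO membars looks roughly like the sketch below; this is illustrative, not code from the thread. Every fence carries the #LoadStore part, so none of them expresses a bare #LoadLoad or bare #StoreStore barrier on its own.]

    #include <atomic>

    // Rough, commonly cited mapping of C++0x thread fences to SPARC RMO membars.
    void fence_mapping_examples() {
        std::atomic_thread_fence(std::memory_order_acquire);
        // ~ MEMBAR #LoadLoad | #LoadStore

        std::atomic_thread_fence(std::memory_order_release);
        // ~ MEMBAR #LoadStore | #StoreStore

        std::atomic_thread_fence(std::memory_order_acq_rel);
        // ~ MEMBAR #LoadLoad | #LoadStore | #StoreStore

        std::atomic_thread_fence(std::memory_order_seq_cst);
        // ~ MEMBAR #LoadLoad | #LoadStore | #StoreStore | #StoreLoad (full barrier)
    }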