
On Mon, 21 Dec 2009, Chris M. Thomasson wrote:
object = m_buffer[i].exchange(object, memory_order_release);
if (object) { atomic_thread_fence(memory_order_acquire); }
return object;
[...]
T* pop() {
  T* object = m_buffer[m_tail].exchange(NULL, memory_order_acquire);
  if (object) { m_tail = (m_tail == T_depth - 1) ? 0 : (m_tail + 1); }
  return object;
}
You generally do not need "memory_order_acquire" when dereferencing an "atomically published" pointer: "memory_order_consume" suffices to provide the proper ordering for all operations that are data-dependent on the pointer value (and any dereference obviously needs the value to compute the memory address to access). This is faster by a fair amount on any architecture other than x86 and Alpha.

Regards,
Helge
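To illustrate the point about "memory_order_consume", here is a minimal sketch of the publish/dereference pattern. It is not the queue code quoted above: Node, g_published, producer, consumer and run_demo are hypothetical names chosen for illustration, and the sketch assumes a C++11 (or later) compiler with <atomic> and <thread>.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical payload type (an assumption, not from the quoted code).
struct Node {
    int value;
};

// Hypothetical publication slot; nullptr means "nothing published yet".
static std::atomic<Node*> g_published(nullptr);

// Producer: initialise the object first, then publish its address with
// release semantics so the initialisation is ordered before the store.
void producer(Node* n, int v) {
    n->value = v;
    g_published.store(n, std::memory_order_release);
}

// Consumer: spin until a pointer shows up. The load uses consume ordering;
// the read of p->value is data-dependent on p, so it is ordered after the
// load without needing a full acquire.
int consumer() {
    Node* p;
    while ((p = g_published.load(std::memory_order_consume)) == nullptr) {
        std::this_thread::yield();
    }
    return p->value;
}

// Runs one writer and one reader thread and returns what the reader saw.
int run_demo() {
    Node n{0};
    g_published.store(nullptr, std::memory_order_relaxed);
    int seen = 0;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer([&n] { producer(&n, 42); });
    writer.join();
    reader.join();
    return seen;
}
```

Only accesses that carry a data dependency on the loaded pointer are covered; reads of unrelated variables after the load would still need acquire. Note also that an implementation is allowed to treat consume as acquire, in which case the code stays correct but the speed advantage disappears.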