
hello bryce, i went through your implementation and the paper in detail and have a few comments: * dequeue should contain atomic<anchor> instead as anchor wrapping atomic<pair>, as it makes the code way more expressive. currently there is no way to see when/why/which operations are actually atomic ... * related to that, it is not really clear, which memory barriers are required where and why ... specifying them explicitly instead of using the default memory_order_seq_cst may be very useful in terms of performance * you can use compare_exchange_weak instead of compare_exchange_strong, if you check the result. * it might make sense to add padding between pool and anchor and in the node between left and right * // FIXME: Should we check both pointers here? ... both pointers are updated atomically ... so i do not think so .. * line 417/478 ... no need to increment the ABA tag: you change the pointer from a non-zero value to NULL, so no ABA problem can occur * line 440/501 ... no need to increment ABA tag here, either: you change r or l to prev, which cannot be identical, so no ABA problem can occur * afaict, the dequeue will only be lockfree on non-ancient x86-64 architectures, supporting 16byte CAS ... no ppc/x86 ... probably no mips/alpha. therefore a fixed-sized version of the algorithm will probably be very useful, using integer indices instead of pointers ... * tagged_ptr_pair could simply be implemented as std::pair<tagged_ptr>, of as struct {tagged_ptr left, right;}; no? that should make the implementation more generic and does not assume any platform specifics ... * an integration into the boost.lockfree testsuite would be nice ... cheers, tim