
Theoretically, the CAS should atomically compare and swap the value in one clock cycle. However, with multiple cores/processors/hyper threading where multiple instructions are being executed simultaneously over arbitrary numbers of clock cycles. There can be writes pending while you want to read from memory. As a result, when you go to read something another process will have written to but you read stale data. To combat this, you enforce a memory barrier, which guarantees that all pending memory transactions before the barrier have completed before moving on with the program. Additionally, some architectures (like x86) allow for unaligned access of memory. When an unaligned value is accessed, it sets an exception then it replaces the single read/write operation with multiple bus operations which wreaks havoc on any compare/swap operations. On Thu, Apr 17, 2008 at 4:47 PM, Michael Dickey <mike@mikedickey.com> wrote:
Yes, I'm using a multi-core machine. Where in my code would I need to add barriers, or is this perhaps a limitation in APR's CAS?
Thanks, -Mike
On Apr 17, 2008, at 3:45 PM, Patrick Twohig wrote:
Are you running it on a multicore machine or single core? That radically changes things. I was having the same problem. To test I ran 500 threads or so a third of them were producers. My suspicion was that I wasn't enforcing the proper memory barriers because for x86 I had to write my own CAS in assembly.
On Thu, Apr 17, 2008 at 3:23 PM, Michael Dickey <mike@mikedickey.com> wrote:
I wrote a C++ lock free queue implementation recently based on the Michael & Scott algorithm, that uses APR's atomics. It's available at:
http://svn.atomiclabs.com/pion-common/trunk/common/include/pion/PionLockFree...
You're welcome to use it but you should be aware that it's not quite debugged yet. It has issues when I tried using more than 1 producer and 1 consumer thread. I wasn't sure if this was a mistake I made in the code, or a limitation of the algorithm, but I assume it's the former. Anyway, in the end I decided not to use the code for what I was working on... But if you can find and fix the bug, it's all yours! =)
Of course, it depends on APR's atomics.. Boost does not have any compare-and-swap abstraction as I far as I know, which seems to be a basic requirement for any lock-free programming. Or maybe it does and I just don't know?
Take care, -Mike
On Apr 17, 2008, at 12:57 PM, Cory Nelson wrote:
On Wed, Apr 16, 2008 at 9:58 PM, Patrick Twohig <p-twohig@ieee.org> wrote:
I've been looking through the docs and I was wondering if boost has any lock-free threadsafe containers. I know it's been brought up on the list before, but I was curious if anybody had a working implementation. If somebody is working on them, I'd like to lend a hand if possible.
I have some lock-free stack/queue code for x86/x64/ia64 in c++. It only works with vc++ 2008 intrinsics right now, and the x64 version is dependant on the windows memory manager if used on older CPUs. If people are interested in boostifying and porting it, I suppose we could do that. _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost