
Alexander Terekhov wrote:
and
asm long atomic_decrement_strong( register long * pw ) {
loop1:
<load-reserved> <add -1> <branch if zero to acquire> {lw}sync
loop:
<store-conditional> <branch if !failed to done>
loop2:
<load-reserved> <add -1> <branch if !zero to loop>
acquire:
<store-conditional> <branch if failed to loop>
... to either loop1 or loop2 ...
isync
done:
<...> }
but it's either suboptimal (more than one sync) or incorrect (missing sync), I think. It needs a state machine.

loop0:
    lwarx
    add -1
    beq acquire-without-sync
    sync
loop1:
    stwcx.
    beq+ done
loop2:
    lwarx
    add -1
    bne loop1
acquire-with-sync:
    stwcx.
    bne- loop2
    isync
    blr
acquire-without-sync:
    stwcx.
    bne- loop0
    isync
done:
    blr

or something like that. Post-release. The current code is good (and risky) enough. :-)
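
For readers who don't think in lwarx/stwcx., here is a rough portable sketch of the same state machine using C++ std::atomic. This is an illustration, not the code under discussion. It assumes a release fence can stand in for sync and an acquire fence for isync, and the flag name release_done is made up. The flag plays the role of "which loop are we in": once the release fence has been issued, a failed store attempt retries without issuing it again.

```cpp
#include <atomic>
#include <cassert>

// Sketch only: decrement with release semantics when the result is non-zero
// and acquire semantics when the result is zero. The name mirrors the quoted
// atomic_decrement_strong; the std::atomic interface is an assumption.
long atomic_decrement_strong(std::atomic<long>& count) {
    long old = count.load(std::memory_order_relaxed);
    bool release_done = false;  // hypothetical flag: has the "sync" been issued?

    for (;;) {
        // Assumed precondition: the count is positive before a decrement.
        assert(old > 0 && "count must be positive before a decrement");
        long desired = old - 1;

        // Non-zero result: a release is needed before the store, but only once.
        if (desired != 0 && !release_done) {
            std::atomic_thread_fence(std::memory_order_release);
            release_done = true;
        }

        // Stand-in for the store-conditional; on failure `old` is reloaded.
        if (count.compare_exchange_weak(old, desired,
                                        std::memory_order_relaxed,
                                        std::memory_order_relaxed)) {
            if (desired == 0) {
                // Zero result: acquire after the store (the "isync" step).
                std::atomic_thread_fence(std::memory_order_acquire);
            }
            return desired;
        }
    }
}
```

A thread that drops the count from 1 to 0 on its first attempt takes the "acquire-without-sync" path. A thread that first saw a non-zero result, lost the race, and then finds itself bringing the count to zero takes the "acquire-with-sync" path, having paid for the release fence exactly once.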