
Alexander Terekhov wrote:
Peter Dimov wrote: [...]
xadd is branchless; it just returns the old value, whereas inc doesn't. MSVC always generates lock xadd, even for _InterlockedIncrement, BTW.
Well, maybe. But you need the value for neither increments nor decrements. (I mean that for decrements you can simply rely on the ZF flag.) Right?
I don't need the value, but it doesn't matter; it's just as fast. Or just as slow. x86 isn't very predictable (I don't know how familiar you are with it); some instructions look better on paper but are (or were) actually slower (on 586 and above) than an equivalent RISC-style read/modify/write sequence, inc included. And of course the rules change with every generation (in some cases, sub-generation) of CPUs. Basically, the only way to know which is faster is to measure it.
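To make the point concrete, here is a minimal sketch of a reference count where the returned value is never needed in full: the increment discards it, and the decrement only asks whether the count reached zero. It is written with std::atomic rather than the Interlocked intrinsics discussed above, and ref_counter, add_ref and release are hypothetical names chosen for illustration. Which instruction the compiler actually emits (lock xadd, lock inc/dec, or lock add/sub followed by a ZF test) depends on the compiler and target, so the comments describe possibilities, not guarantees.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical counter type for illustration; not taken from the thread.
struct ref_counter {
    std::atomic<long> count{1};  // starts with one owner
};

// Increment: the previous value is discarded. On x86 a compiler is free
// to use "lock inc"/"lock add" or "lock xadd" here.
inline void add_ref(ref_counter& r) {
    r.count.fetch_add(1, std::memory_order_relaxed);
}

// Decrement: only "did the count reach zero?" is needed. Because the
// result is compared against a constant, a compiler may test ZF after a
// locked decrement instead of materializing the old value with xadd.
inline bool release(ref_counter& r) {
    long old = r.count.fetch_sub(1, std::memory_order_acq_rel);
    assert(old > 0);   // releasing an already-dead counter is a bug
    return old == 1;   // true when this call dropped the last reference
}
```

As the message above says, the only reliable way to tell which instruction sequence is faster on a given CPU is to measure it.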
Nah. For the sake of killing C/C++ volatiles sooner rather than later, I strongly suggest that you hide that load in asm. Just add load+cmp followed by a ZF branch prior to "lock dec", which also sets ZF, IIRC.
Eh, __asm__ is no better than a volatile. Both are non-portable. Yes, dec sets ZF (and SF). That's why InterlockedDecrement on 386 returns +1, 0 or -1 instead of the actual value. :-)
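The two ideas in this exchange can be sketched in portable C++ as follows. This is an illustration under stated assumptions, not code from either poster: release_checked and decrement_sign are hypothetical names, std::atomic stands in for both the volatile load and the inline asm, and the early-out assumes that a count of 1 means the caller is the sole owner, so no other thread can increment concurrently.

```cpp
#include <atomic>
#include <cassert>

// Load + compare before the locked decrement. If the count is already 1,
// the (assumed) sole owner skips the atomic read-modify-write entirely;
// note that the count is then left at 1, not 0.
inline bool release_checked(std::atomic<long>& count) {
    if (count.load(std::memory_order_acquire) == 1) {
        return true;  // last reference; no locked instruction needed
    }
    long old = count.fetch_sub(1, std::memory_order_acq_rel);
    assert(old > 0);
    return old == 1;
}

// A sign-only result, in the spirit of the 386 InterlockedDecrement
// behaviour described above: +1, 0 or -1 instead of the actual value,
// which is all that the ZF and SF flags can convey after "lock dec".
inline int decrement_sign(std::atomic<long>& count) {
    long now = count.fetch_sub(1, std::memory_order_acq_rel) - 1;
    return (now > 0) - (now < 0);
}
```

Whether the extra load pays off is again a measurement question: it saves a locked instruction in the single-owner case and adds a load and a branch in every other case.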