
Helge Bahmann wrote:
(I'm not going to reply to most of this because I've now forgotten most of what I learnt when I looked into it...)
- An inline spin lock is the only thing that doesn't involve a function call, so leaf functions remain leaf functions and are themselves more likely to be inlined or otherwise optimised. On systems with small caches or small flash chips where code size is important, this is a significant benefit.
I'm not sure I'm following here -- for small cache sizes, inlining is *not* preferable, right?
Many C++ leaf functions are so trivial that they are smaller when inlined than when out-of-line, when you allow for the register-shuffling needed to get the arguments in the right places for the function call.
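As a rough illustration of the kind of code in question, here is a minimal sketch of an inline spin lock and a trivial leaf function that uses it. The names SpinLock and Counter are hypothetical, made up for this example, and this is not the implementation under discussion. It assumes C++11 atomics are available.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical minimal spin lock. Both members are a handful of
// instructions, so a compiler can inline them at every call site.
class SpinLock {
public:
    void lock() noexcept {
        // exchange() returns the previous value: keep spinning while
        // another thread already holds the lock.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // Busy-wait. A production lock would pause or yield here.
        }
    }

    void unlock() noexcept {
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical leaf function. It calls nothing that has to live
// out-of-line, so it stays a leaf and can itself be inlined.
class Counter {
public:
    int increment() noexcept {
        lock_.lock();
        int result = ++value_;
        lock_.unlock();
        return result;
    }

private:
    SpinLock lock_;
    int value_ = 0;
};
```

Whether the inlined form really ends up smaller than a call depends on the compiler and the target, so this is something to check in the generated code rather than assume.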
A home-grown futex-based implementation is of course valid and useful, but on most architectures it will not be faster, and when it is not, I fail to see why it would not be preferable to fix the problems at the libc level instead.
I think that the main problem with pthread_mutex is that it has several features that unavoidably require extra space in the struct. This can't be fixed in libc without making it no longer a pthread_mutex.
Phil.
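To put a rough number on the space point, a small sketch like the following compares the size of pthread_mutex_t with a single atomic word of the sort a futex-based lock would wait on. FutexWord and print_sizes are hypothetical names made up for this example, and the actual sizes are platform-dependent, so none are quoted here.

```cpp
#include <pthread.h>

#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for the state of a futex-based lock:
// just one atomic integer.
struct FutexWord {
    std::atomic<int> state{0};
};

constexpr std::size_t kMutexSize = sizeof(pthread_mutex_t);
constexpr std::size_t kWordSize = sizeof(FutexWord);

// Prints both sizes so they can be compared on the platform at hand.
void print_sizes() {
    std::printf("sizeof(pthread_mutex_t) = %zu bytes\n", kMutexSize);
    std::printf("sizeof(FutexWord)       = %zu bytes\n", kWordSize);
}
```

The difference matters most when there are many small objects that each carry their own lock.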