On 17/08/2013 8:24, Marcello Pietrobon wrote:
I've noticed the same 10 times acceleration even while I make a Skype call...
How can this have anything to do with interprocess programming?
After yield_k didn't offer good enough results, I decided to wrap the wait logic in a class (called spin_wait) instead of a function. This class contains the "k_" integer from yield_k and lazily obtains the system tick period, spinning and yielding until that period has elapsed (measured with a high-resolution counter or similar). I'm still finishing this class for Windows, and then I need to write it for POSIX systems (and since MacOS does not support nanosleep, I may need to do something special for that platform).

However, in my first tests I found that several applications change the default Windows tick period from 15.6 ms to 1 ms (for example, just after launching Google Chrome). That's the reason why the current Interprocess spinlocks run better when you start those applications: Sleep(1) was really sleeping for 1 ms instead of 15 ms (these values might differ between computers, I guess).

In my first tests on my system (2.8 GHz Core i7), when the system tick is 1 ms an interprocess mutex needs 2700 iterations (32 nops/pauses + Sleep(0)) to wait for a tick. When the system tick is 15.6 ms, it needs 41860 iterations (32 nops/pauses + Sleep(0)). This means that no fixed value should be used to mark the yield/sleep limit, as it depends heavily on the processor core and the system tick (which can be changed at any moment).

I think a limit of N x (system tick time) could be a good guess. I don't know which N value is optimal to minimize both CPU usage and context switch overhead; we'd need to do some tests for that. In any case, I think this new approach will greatly improve Interprocess's currently horrible latencies. I'll ping the list when I commit a portable spin wait logic in a few days. A rough sketch of the idea is appended below.

Best,

Ion
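A minimal sketch of the spin_wait idea described above, assuming only standard C++11 <chrono> and <thread>. This is an illustration of the spin/yield/sleep progression keyed to the system tick, not the code that will be committed to Interprocess (the real class would issue pause/nop instructions and query the actual OS tick period):

    // spin_wait sketch: spin for a few iterations, then yield the time
    // slice, and only sleep once roughly one system tick has elapsed
    // according to a steady high-resolution clock.
    #include <chrono>
    #include <thread>

    class spin_wait
    {
       typedef std::chrono::steady_clock clock_type;

       unsigned k_;                          // iteration counter (the old yield_k "k")
       clock_type::time_point start_;        // set lazily on the first wait
       clock_type::duration   tick_;         // estimated system tick period

       static clock_type::duration obtain_tick()
       {
          // Placeholder: on Windows this would query the real timer
          // resolution (15.6 ms or 1 ms); here we just assume 16 ms.
          return std::chrono::duration_cast<clock_type::duration>(
             std::chrono::milliseconds(16));
       }

    public:
       spin_wait() : k_(0) {}

       void wait()
       {
          if(k_ == 0){
             start_ = clock_type::now();
             tick_  = obtain_tick();
          }
          if(k_ < 32){
             // Busy spin: a real implementation would issue pause/nop here.
          }
          else if(clock_type::now() - start_ < tick_){
             // Less than one tick elapsed: give up the time slice (Sleep(0)-like).
             std::this_thread::yield();
          }
          else{
             // At least one tick elapsed: sleep for a whole tick (Sleep(1)-like).
             std::this_thread::sleep_for(tick_);
          }
          ++k_;
       }

       void reset() { k_ = 0; }
    };

A lock loop would then look like "spin_wait swait; while(!try_lock()) swait.wait();", so the yield/sleep threshold adapts to the current tick instead of using a fixed iteration count.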