
On Mon, Jan 26, 2015 at 10:51 PM, Gavin Lambert wrote:
On 27/01/2015 15:08, Niall Douglas wrote:
As I said, it's not a big difference (atomic ops are typically ~1us, and that was on the previous CPU generation), but it's still one of my pet peeves, as while there are many places where shared_ptrs do need to get copied for correctness, parameter passing is not one of those places. (And performance gets worse if you end up passing the object through many layers as part of keeping methods short or similar "tidiness" or abstraction guidelines; and it wastes more stack too.)
I got distracted by the ~1us estimate you gave here. I just wrote a quick benchmark for an uncontended fetch_add + compare and repeat, and came up with about 22 cycles total per iteration, which is about 7 ns per iteration. If I use a volatile int instead of an atomic, it is just over 2 ns per iteration. So the atomic is more expensive, but by less than an order of magnitude, rather than the 3 orders of magnitude mentioned above.

Here's the code for posterity.

    #include <atomic>

    int main(int argc, char **args)
    {
    #if 1
        // Atomic version: relaxed fetch_add, loop until the previous value was 0.
        std::atomic<int> count(1000000000);
        while (count.fetch_add(-1, std::memory_order_relaxed));
    #else
        // Baseline: volatile int, so the compiler cannot remove the loop.
        volatile int count = 1000000000;
        while (count--);
    #endif
        return 0;
    }

Sorry to derail the discussion. Carry on.