[Boost-users] lockfree::queue restrictions and performance on linux

28 Dec 2013

Hello,

After quite a while I'm back to using Boost (and back on the list), and
I'm quite happy to see that it has made significant progress since I
last used it.

One of the libraries that got my attention is lockfree, and
lockfree::queue in particular, since it almost completely meets my
needs. However, I've been wondering why it requires the items it
stores to have a trivial destructor. That really reduces its
usability. Is this something inherent to its design? I guess this was
discussed during the review, but a quick dig through the archives
didn't turn up anything. The reason for asking is that I have forced
the queue to accept std::shared_ptr and this seems to work almost
fine, except that there seems to be an extra reference left to the
item last popped, but that's something I can live with.
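
To make the restriction concrete, here is a minimal sketch using only
standard type traits. It assumes nothing about the Boost internals; the
helper has_trivial_destructor and the Item struct are hypothetical names
made up for illustration. It only shows why a raw pointer passes the
trivial-destructor requirement while std::shared_ptr does not.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical helper (not part of Boost): reports whether T has a
// trivial destructor, i.e. the requirement being asked about.
template <typename T>
constexpr bool has_trivial_destructor() {
    return std::is_trivially_destructible<T>::value;
}

// Illustrative payload type, standing in for the real queue items.
struct Item {
    int counter;
};

// A raw pointer has a trivial destructor, so it meets the requirement.
static_assert(has_trivial_destructor<Item*>(),
              "raw pointers are trivially destructible");

// shared_ptr must release its reference on destruction, so it does not.
static_assert(!has_trivial_destructor<std::shared_ptr<Item>>(),
              "shared_ptr has a non-trivial destructor");
```

Storing raw pointers and managing ownership outside the queue is one way
to stay within the requirement, at the cost of doing the lifetime
bookkeeping by hand.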

Another thing is its performance. I ran a quick benchmark (a queue of
shared_ptrs to items containing an atomic<int> counter; items were
popped from the queue and immediately pushed back after decrementing
the counter). Performance was close to stellar on OSX (with multiple
consumers and producers, something like forty times faster than
std::queue with a mutex) but very poor on Linux. While I can
understand that locking runs fast on Linux due to its spinlock/futex
implementation of mutexes, I have no explanation for the poor
performance of lockfree::queue (with two producer/consumer threads,
std::queue with a mutex was about 15 times faster, and lockfree::queue
was about 20 times slower than when run on OSX on similar hardware).
Has anybody else observed the same poor performance? I used Boost 1.54
and tested with both gcc 4.8 and clang 3.5, and got similar results
with both.
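
For anyone wanting to reproduce the comparison, below is a rough sketch
of the mutex-protected std::queue baseline as described above. This is
not the original benchmark code (which was not posted); the names Item
and run_mutex_queue, the stopping rule (an item is retired once its
counter reaches zero), and the assumption that initial_count is at least
1 are all choices made for this illustration.

```cpp
#include <atomic>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Illustrative item type: a shared counter, as in the description.
struct Item {
    explicit Item(int n) : counter(n) {}
    std::atomic<int> counter;
};

// Hypothetical baseline: each thread pops an item, decrements its
// counter, and pushes it back until the counter reaches zero.
// Returns the total number of decrements performed.
long run_mutex_queue(int threads, int items, int initial_count) {
    std::queue<std::shared_ptr<Item>> q;
    std::mutex m;
    for (int i = 0; i < items; ++i)
        q.push(std::make_shared<Item>(initial_count));

    std::atomic<int> live(items);  // items not yet retired
    std::atomic<long> ops(0);      // total decrements

    auto worker = [&]() {
        while (live.load() > 0) {
            std::shared_ptr<Item> item;
            {
                std::lock_guard<std::mutex> lock(m);
                if (!q.empty()) {
                    item = std::move(q.front());
                    q.pop();
                }
            }
            if (!item) {           // queue momentarily empty
                std::this_thread::yield();
                continue;
            }
            ++ops;
            if (--item->counter > 0) {
                std::lock_guard<std::mutex> lock(m);
                q.push(std::move(item));  // still has work left
            } else {
                --live;            // counter hit zero: retire the item
            }
        }
    };

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back(worker);
    for (auto& th : pool)
        th.join();
    return ops.load();
}
```

Wrapping a call such as run_mutex_queue(2, 4, 1000) in
std::chrono::steady_clock timestamps gives the figure to compare against
a lock-free variant of the same loop.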

Regards,

Leon

Leon Mlakar