Re: [boost] Proposal: Monotonic Containers - Comparison with boost::pool, boost::fast_pool and TBB

22 Jun 2009

      AMDG

Christian Schladetsch wrote:
...
Hi Luke,
[...]
...
Luke> Peak memory will be a good metric.  Do you have access to VTune?  You
seem to struggle to identify the cause of performance loss and are reduced
to guesswork.
Peak memory measured by an external tool is no good, as boost::pool and
boost::fast_pool both leak memory. I need to be able to sample the memory
used at certain times in the application in a cross-platform way. I'll get
to this in due time.
I have been focused on getting the benchmark results more than attempting to
do a complete analysis of their implications. The latest results are here
http://tinyurl.com/lj6nab. I still can't explain why monotonic is faster at
sorting a 500,000 element pre-reserved vector, but I have only reported the
result and have not investigated deeply.
I have added mean, standard deviation, min and max factors for each of the
small, medium, and large benchmark sets. I print a cumulative total at the
end of each set, and a summary of all results at the end. These summaries
are:
GCC:
     scheme      mean   std-dev       min       max
      fast      36.3       173      0.25  1.63e+03
      pool      27.8  1.02e+04     0.857       897
       std      1.69      0.91     0.333         5
       tbb      1.59     0.849     0.333         5
MSVC:
    scheme      mean   std-dev       min       max
      fast      35.4       132     0.603 1.32e+003
      pool      27.1 1.13e+004     0.693       878
       std       2.7       1.7     0.628         7
       tbb      1.44     0.727     0.291       6.4
The mean is the average speedup factor provided by monotonic allocation over
the given scheme. So for MSVC, summarised over all tests, monotonic is 1.4X
faster than TBB with a standard deviation of 0.7. TBB was 3.4X faster at its
best and 6.4X slower at its worst.
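
To make the reading of those columns concrete, here is a minimal sketch of how such a summary could be computed from per-test speedup factors. It is not taken from the actual test harness. The names `Summary`, `summarise` and `other_scheme_speedup` are hypothetical, and the population form of the standard deviation is an assumption, since the post does not say which form was used.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical summary record; the field names are illustrative only.
struct Summary {
    double mean, std_dev, min, max;
};

// Each factor is (time taken by the other scheme) / (time taken by monotonic),
// so a value above 1 means monotonic was faster for that test.
Summary summarise(const std::vector<double>& factors) {
    assert(!factors.empty());

    double sum = 0.0;
    for (double f : factors) sum += f;
    const double mean = sum / factors.size();

    double sq = 0.0;
    for (double f : factors) sq += (f - mean) * (f - mean);
    // Assumption: population standard deviation (divide by N, not N - 1).
    const double std_dev = std::sqrt(sq / factors.size());

    const auto mm = std::minmax_element(factors.begin(), factors.end());
    return {mean, std_dev, *mm.first, *mm.second};
}

// A factor below 1 means the other scheme won; its reciprocal says by how much.
double other_scheme_speedup(double factor) { return 1.0 / factor; }
```

With the MSVC tbb row, `other_scheme_speedup(0.291)` is roughly 3.4, which is where the "3.4X faster at its best" figure comes from.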
Note that monotonic was on average 35X faster than boost::fast_pool, but
notice too that the standard deviation is very high. At its worst, fast_pool
was 1,300X slower than monotonic and at its best was 1.6X faster.
boost::pool fared little better, with an even worse standard deviation of
10,000(!). One could argue that the tests are skewed, so I invite you to
look at them and suggest any changes or additions. See
http://tinyurl.com/l6vmgq for all the tests, and
http://tinyurl.com/l89llq for the test harness.
It is no surprise that TBB performs best across both platforms with the
smallest standard deviation.
I'm not convinced that the average is meaningful.
boost::fast_pool_allocator is not intended to be used
with std::vector.  You're averaging many cases for
which it is documented to behave badly with a couple
of cases for which it is fine.  Also, even though pool_allocator
is supposed to work with std::vector, it is slow as I would
expect.  The pool data structure is really designed for
fixed size allocations and I for one am not particularly
enamored of the idea of using it for std::vector.

Also, this is completely unrelated, but for lines like this

      fast      1.49     0.838     0.603 1.32e+003

It looks like the cumulative min/max are being used instead of
the local min/max.
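
A mean of 1.49 with a standard deviation of 0.838 is hard to reconcile with a maximum of 1.32e+003 in the same set, which is what suggests the extremes carry over from earlier sets. As a rough sketch of the distinction (I have not checked how the harness actually tracks these values, so `MinMax`, `Report` and `summarise_sets` are hypothetical names), a tracker that is recreated for each set gives local extremes, while one that is never reset gives cumulative ones:

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical running min/max tracker; not taken from the actual test harness.
struct MinMax {
    double min = std::numeric_limits<double>::infinity();
    double max = -std::numeric_limits<double>::infinity();
    void add(double x) {
        min = std::min(min, x);
        max = std::max(max, x);
    }
};

struct Report {
    std::vector<MinMax> local;  // one entry per benchmark set
    MinMax cumulative;          // extremes over every set seen so far
};

Report summarise_sets(const std::vector<std::vector<double>>& sets) {
    Report r;
    for (const auto& set : sets) {
        assert(!set.empty());
        MinMax local;  // fresh tracker, so earlier sets cannot leak in
        for (double f : set) {
            local.add(f);
            r.cumulative.add(f);  // shared tracker, never reset
        }
        r.local.push_back(local);
    }
    return r;
}
```

Printing `r.local[i]` on the per-set lines and `r.cumulative` only in the final summary would keep the two apart.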

In Christ,
Steven Watanabe