Re: [boost] [boostcon12]Trouble with tuples very compiler dependent

19 Nov 2012


On 11/19/12 10:19, Larry Evans wrote:
[snip]
...
In contrast to the gcc4_8 compiler, with the clangxx compiler the
relative performance is, qualitatively, just the opposite.  IOW, the
bcon12_horizontal implementation is faster than the bcon12_vertical
implementation.  In fact, the performance difference widens at an
increasing rate as tree depth goes from 2 to 4.  The change is so
stark that it suggests, at least to me, that there may be some
bug in clang.  Of course, that conclusion is based on almost no
knowledge, on my part, of the clang implementation.
The tuple_benchmark_filt.py can be modified to filter out other parts
of the benchmark run output, which is here:
http://svn.boost.org/svn/boost/sandbox/variadic_templates/sandbox/slim/test/...
For example, when the filter criteria restrict
TUPLE_UNROLL_MAX to 10 (the same as TUPLE_SIZE),
then, with compiler=clangxx, bcon12_vertical
performs relatively better than bcon12_horizontal
as TREE_DEPTH increases, as shown in the attachment.
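
For anyone who wants to try that kind of filtering without reading the
script, here is a minimal sketch of the idea.  It is NOT the actual
tuple_benchmark_filt.py: the one-record-per-line "key=value" layout,
the TUPLE_IMPL field name, and the function names parse_record and
filter_records are all assumptions made for illustration.

```python
# Hypothetical sketch only: the real tuple_benchmark_filt.py and the real
# benchmark output format are not shown in this message.  Assumed input:
# one record per line, written as whitespace-separated key=value fields.

def parse_record(line):
    """Turn 'key=value key=value ...' into a dict of strings."""
    return dict(field.split("=", 1) for field in line.split() if "=" in field)


def filter_records(lines, **criteria):
    """Keep the records whose fields match every criterion (string compare)."""
    wanted = {key: str(value) for key, value in criteria.items()}
    records = (parse_record(line) for line in lines if line.strip())
    return [rec for rec in records
            if all(rec.get(key) == value for key, value in wanted.items())]


if __name__ == "__main__":
    # Made-up records with no timing values, just to show the selection.
    sample = [
        "compiler=clangxx TUPLE_IMPL=bcon12_vertical TUPLE_UNROLL_MAX=10 TREE_DEPTH=2",
        "compiler=clangxx TUPLE_IMPL=bcon12_vertical TUPLE_UNROLL_MAX=5 TREE_DEPTH=2",
        "compiler=gcc4_8 TUPLE_IMPL=bcon12_horizontal TUPLE_UNROLL_MAX=10 TREE_DEPTH=2",
    ]
    for rec in filter_records(sample, compiler="clangxx", TUPLE_UNROLL_MAX=10):
        print(rec)
```

Passing the criteria as keyword arguments keeps the restriction (e.g.
TUPLE_UNROLL_MAX=10) visible at the call site, so trying a different
restriction means changing one argument rather than the filter itself.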

-regards,
Larry