
Tim Blechmann wrote:
Tim, you're proposing this b-heap purely as a performance optimisation compared to the "traditional" heap, but we've not yet seen any cases where it is actually faster. If it can be faster we need to see some examples, and the documentation should qualify when it can be expected to be better. If it's not faster, it should be dropped.
Just for the record: Core i7 920, x86_64, linux-3.0-rc2, hyperthreading/frequency scaling disabled, 12 GB triple-channel DDR3 RAM.
Stressed with "stress -m 1 --vm-bytes 4G --io 16 -c 3" and two BOINC processes running at nice 10.
Configured the problem size to 15000000 and changed work_t from boost::function to int (24 bytes to 4 bytes).
Comparison of b_heap and d_ary_heap (with arity 2), running everything at nice 1 to increase the number of context switches.
Performance counter stats for './bench_boostbheap': [snip] 615.223322417 seconds time elapsed
Performance counter stats for './bench_boostdary': [snip] 659.415916004 seconds time elapsed
That's great. But I was comparing with your priority_queue<> and std::push_heap/pop_heap, not your d_ary_heap<>. Could you do the same test with one of those?
Performance counter stats for './bench_boostbheap':
  1,421,230,219,856 cycles
    928,444,569,746 stalled-cycles-frontend
    643,217,557,503 stalled-cycles-backend
  1,517,920,671,665 instructions

Performance counter stats for './bench_boostdary':
  1,536,548,826,172 cycles
  1,304,937,894,945 stalled-cycles-frontend
    999,244,398,224 stalled-cycles-backend
    747,525,733,877 instructions
That's the interesting part: the b-heap executes twice as many instructions as the d-ary heap, so the improvement in cache locality needs to be substantial enough to double the instruction throughput before the b-heap does better. You've had to pull out all the stops to make that happen.
Regards, Phil.