
well, benchmarking this can be tricky ... first, your work is boost::funtion<> (iirc 3 pointers) and a float ... so 16 bytes on 32bit or 28 bytes on 64bit ... on 32bit systems you will have a better matching with actual physical cache lines ... second ... synthetical benchmarks are not necessarily the best way to test cache-aware data structures: when stressing the heap, all its content will be located in the cpu caches ... so you won't necessarily see the benefit of the cache-awareness, since the data of the non-cache-aware data structure are still in the CPU cache. unless you manually add some work to thrash the caches ... in your case, you will have a maximum of 90000 items in the queue, on 32bit platforms this is rougly 1.4mb ... you won't see a big effect, if this fits all into the caches ...
The data cache sizes on the test systems are I think:
1. AMD Athlon II X2: 64 kB L1 + 1 MB L2. 2. VIA C7-M: 32 kB L1 + 128 kB L2. 3. ARM v5 (Marvell Kirkwood): 16 kB L1 + 256 kB L2.
on machine 2 and 3, i assume the size of a cacheline is 32/16 kb
So this data set really shouldn't fit in the cache. Even if it did, it would be great if the b_heap implementation performed no worse than the "naive" version.
it cannot: it uses a different tree structure than binary heaps, causing a higher tree. with d-ary heaps, the arity is a tradeoff between tree height and number of comparisons (higher arity, lower tree height but more comparisons). in a similar manner, the b-heap adds a tradeoff between memory locality and tree height ...
Also I've looked at your own benchmark graphs and it seems that there is a fairly constant performance difference between the different heaps - I don't see b_heaps getting faster than the alternatives as the queue size increases.
maybe because my workstation has 8mb of cache ... the
(BTW, I find it almost impossible to distinguish the colours in those graphs...)
gnuplot defaults ... i generally find it hard to visualize multidimensional data ... if the plots will be included in the final documentation, i will try to find a better visualization ... tim