Re: [boost] [review] [heap] Summary

19 Jun 2011

      hi andrew and list,
...
Other comments received during the course of the review were generally
positive, with a number of technical issues identified that need to to
be addressed. The final version of the library can be fully accepted
once Tim has addressed the technical issues identified during the
review.
i will try to address the issues in the next few weeks and post an updated 
version when it is ready ...
...
- The performance of b_heap vs. d_ary_heap vs. priority_queue needs to
be better characterized. Under what circumstances does one outperform
the other. If no convincing scenario can be found where b_heap
outperforms the others, then it should also be placed in an
experimental directory.
i have been thinking about a `best practices' section, discussing the 
performance aspects in more detail.
...
- The heap concepts need to be revisited as per comments on the review
tool. The concepts should also describe semantics of the operations,
not just the interfaces.
ok
...
- There were several comments on the implementation of heap stability.
Three alternatives were identified: “naturally stable” heaps (of which
no examples were given), the current counter based approach, and a
chaining approach. No consensus on the best implementation strategy.
Although the counter-based approach is probably adequate for most
applications, it is not universally desirable. A more thorough
investigation of alternative strategies is encouraged.
in order to implement stability without a version count, we need to be able to 
find out, if a value is already inside the heap. since a heap is not a search 
tree the possible solutions are (a) iterate the complete heap (O(n)), (b) use a 
treap (mixture of heap and binary search tree, O(log(n))) or (c) maintain a 
separate hash table O(1). then nodes can be ordered in `natural stable' order or 
they could be chained.
even compared with a hash table, the version count will probably be more time-
efficient (no hashing required) and most likely more memory efficient ...
...
- The notion of Mergeable heaps needs to be better addressed, at least
to differentiate between heaps that merge in O(n) or slower time and
those that can merge in O(lg n) time. The proposed solution is to
write merge as a general heap algorithm and specialize it for
Mergeable heaps by calling the member function merge. Also, merge
should be destructive, so merge and clear should not be needed.
more or less done
...
- The documentation must be substantially improved with respect to the
overall design of heap data structures (i.e., concepts), especially
mutable heaps and best practices. It also needs to provide both
theoretical and performance comparisons for the different heaps.
already started to work on this
...
- Include documentation of the benchmarking tool in the documentation.
It could be instructive for users to evaluate different heap
implementations on their host or target architectures.
will give some more details about my benchmarking procedure. but like all 
synthetic benchmarks, they should be taken with a grain of salt ...
...
- Document the complexity of == and < for heaps.
ok
...
- Address the comments and issues in the Review Tool.
ok
...
- Phil Endecott suggested rewriting some heaps (b_heap and d_ary_heap)
in terms of more general algorithms operating on a (Random Access)
Range in order to support the formulation of heap structures on other
memory structures. Doing this would also help support the definition
of heaps that satisfied Thorsten's request.
i will have a look when i addressed the other issues.

i would like to thank the reviewers for their feedback and the valuable 
discussions about the boost.heap library, andrew for running the review and 
google for supporting this project as part of the summer of code program.

cheers, tim

Re: [boost] [review] [heap] Summary

Tim Blechmann