
hi andrew and list,
Other comments received during the course of the review were generally positive, with a number of technical issues identified that need to to be addressed. The final version of the library can be fully accepted once Tim has addressed the technical issues identified during the review.
i will try to address the issues in the next few weeks and post an updated version when it is ready ...
- The performance of b_heap vs. d_ary_heap vs. priority_queue needs to be better characterized. Under what circumstances does one outperform the other. If no convincing scenario can be found where b_heap outperforms the others, then it should also be placed in an experimental directory.
i have been thinking about a `best practices' section, discussing the performance aspects in more detail.
- The heap concepts need to be revisited as per comments on the review tool. The concepts should also describe semantics of the operations, not just the interfaces.
ok
- There were several comments on the implementation of heap stability. Three alternatives were identified: “naturally stable” heaps (of which no examples were given), the current counter based approach, and a chaining approach. No consensus on the best implementation strategy. Although the counter-based approach is probably adequate for most applications, it is not universally desirable. A more thorough investigation of alternative strategies is encouraged.
in order to implement stability without a version count, we need to be able to find out, if a value is already inside the heap. since a heap is not a search tree the possible solutions are (a) iterate the complete heap (O(n)), (b) use a treap (mixture of heap and binary search tree, O(log(n))) or (c) maintain a separate hash table O(1). then nodes can be ordered in `natural stable' order or they could be chained. even compared with a hash table, the version count will probably be more time- efficient (no hashing required) and most likely more memory efficient ...
- The notion of Mergeable heaps needs to be better addressed, at least to differentiate between heaps that merge in O(n) or slower time and those that can merge in O(lg n) time. The proposed solution is to write merge as a general heap algorithm and specialize it for Mergeable heaps by calling the member function merge. Also, merge should be destructive, so merge and clear should not be needed.
more or less done
- The documentation must be substantially improved with respect to the overall design of heap data structures (i.e., concepts), especially mutable heaps and best practices. It also needs to provide both theoretical and performance comparisons for the different heaps.
already started to work on this
- Include documentation of the benchmarking tool in the documentation. It could be instructive for users to evaluate different heap implementations on their host or target architectures.
will give some more details about my benchmarking procedure. but like all synthetic benchmarks, they should be taken with a grain of salt ...
- Document the complexity of == and < for heaps.
ok
- Address the comments and issues in the Review Tool.
ok
- Phil Endecott suggested rewriting some heaps (b_heap and d_ary_heap) in terms of more general algorithms operating on a (Random Access) Range in order to support the formulation of heap structures on other memory structures. Doing this would also help support the definition of heaps that satisfied Thorsten's request.
i will have a look when i addressed the other issues. i would like to thank the reviewers for their feedback and the valuable discussions about the boost.heap library, andrew for running the review and google for supporting this project as part of the summer of code program. cheers, tim