
hi thomas, thanks for your formal review!
-- One inconvenience for me was the use of max-heaps instead of min-heaps. Well, one could argue that a priority_queue should return the elements with the highest priority first (but as Wikipedia says: "some conventions consider lower priorities to be higher"), and heap-sort uses a max-heap. Since these are the two applications where the stl uses a heap, it's clear why the stl implements a max-heap. But the applications where it is beneficial to call increase or decrease directly normally use a min-heap. Now I have to call increase even though the key value has actually decreased...
there are two conventions to consider: min-heaps (which are typically used in textbooks) and max-heaps (which are used by the stl heap functions). i preferred to be consistent with the stl. likewise i am using `top()' to extract the value with the highest priority, not `find_min()' as is often used in the literature ...
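for illustration, a minimal sketch (not taken verbatim from the docs) of how min-heap behaviour can be obtained by passing a reversed comparator; note that increase()/decrease() then still refer to the priority order, not to the numeric key:

    #include <boost/heap/fibonacci_heap.hpp>
    #include <functional>
    #include <iostream>

    int main()
    {
        // max-heap by convention; with std::greater<> the numerically
        // smallest value has the highest priority, so top() yields it
        typedef boost::heap::fibonacci_heap<
            int, boost::heap::compare<std::greater<int> > > min_heap;

        min_heap h;
        h.push(3);
        h.push(1);
        h.push(2);
        std::cout << h.top() << "\n";  // prints 1

        // increase()/decrease() follow the heap's priority order, so with
        // this comparator a numerically smaller key is an `increase'
        return 0;
    }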
-- I was left with the feeling that I hadn't quite made the most efficient use of Boost.Heap.
the only way to find the best data structure is to test with a specific workload. boost.heap does not provide a `one-size-fits-all' solution, but a toolbox of different configurable data structures.
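to illustrate what i mean by `toolbox' (a sketch only; all containers share the basic interface but differ in their trade-offs):

    #include <boost/heap/binomial_heap.hpp>
    #include <boost/heap/d_ary_heap.hpp>
    #include <boost/heap/fibonacci_heap.hpp>
    #include <boost/heap/pairing_heap.hpp>

    // all of these share the basic push()/top()/pop() interface,
    // but have different performance characteristics
    typedef boost::heap::d_ary_heap<int, boost::heap::arity<4> > dary4;
    typedef boost::heap::binomial_heap<int>                      binomial;
    typedef boost::heap::fibonacci_heap<int>                     fibonacci;
    typedef boost::heap::pairing_heap<int>                       pairing;

    // generic code can be written against the common interface
    template <typename Heap>
    void drain(Heap & h)
    {
        while (!h.empty())
            h.pop();
    }

    int main()
    {
        fibonacci f;
        f.push(1);
        f.push(2);
        drain(f);
        return 0;
    }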
For example, I quickly found out that calling update just with a handle (as I had implemented it first) was suboptimal for fibonacci_heap.
suboptimal in what way?
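for reference, a sketch of the different ways to notify the heap of a change (whether one form is actually faster than the other depends on the heap type and on the direction of the change):

    #include <boost/heap/fibonacci_heap.hpp>

    int main()
    {
        typedef boost::heap::fibonacci_heap<int> heap_t;
        heap_t h;
        heap_t::handle_type handle = h.push(2);

        // variant 1: modify through the handle, then notify the heap.
        // the heap cannot know whether the priority went up or down.
        *handle = 7;
        h.update(handle);

        // variant 2: let the heap see the new value, or state the
        // direction explicitly, so it can potentially take a cheaper path
        h.update(handle, 9);
        h.increase(handle, 11);
        return 0;
    }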
- The performance numbers for Boost.Heap were neither bad nor outstanding.
-- The numbers for b_heap are not convincing.
-- The numbers for fibonacci_heap seem to be the best.
-- The pairing_heap has a smaller compare count than fibonacci_heap, but actually performs worse than fibonacci_heap in this test. Considering statements like "Experimental studies indicate that pairing heaps actually outperform Fibonacci heaps", I wonder whether the implementation of pairing_heap could be improved so that it actually outperforms fibonacci_heap.
for a performance comparison, one will have to consider
* the expense of comparisons
* the expense of swapping data
* the size of the managed data
* the workload (mixture/order of API calls)
i suppose for every data structure one can come up with a synthetic benchmark so that this data structure outperforms every other implementation.
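a minimal benchmark skeleton to illustrate what i mean by testing a specific workload (the push/pop-only mixture and the int element type are made up for this example; a real measurement would vary both):

    #include <boost/heap/fibonacci_heap.hpp>
    #include <boost/heap/pairing_heap.hpp>
    #include <chrono>
    #include <cstdlib>
    #include <iostream>

    // times one specific workload: n pushes of random ints, then drain.
    // the result says nothing about other mixtures of API calls
    template <typename Heap>
    double run(int n)
    {
        Heap h;
        std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
        for (int i = 0; i != n; ++i)
            h.push(std::rand());
        while (!h.empty())
            h.pop();
        std::chrono::duration<double> d = std::chrono::steady_clock::now() - start;
        return d.count();
    }

    int main()
    {
        const int n = 1000000;
        std::cout << "fibonacci: " << run<boost::heap::fibonacci_heap<int> >(n) << " s\n";
        std::cout << "pairing:   " << run<boost::heap::pairing_heap<int> >(n) << " s\n";
        return 0;
    }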
What is your evaluation of the design?
I guess it's OK. What makes me a bit nervous is the statement "The library is not targeted in any specific domain nor directly related to any specific libraries". A good interface is often the result of an abstraction over different applications. The other way around (starting from the data structure/algorithm) risks bloating the interface with unnecessary functionality. For example, somebody asked during the review for "iteration over the heap in priority order". Is this really something that many applications requiring a heap will benefit from? I strongly doubt it.
testing two heaps for equivalence will benefit from it. of course you can ask why you would want to test that ...
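a sketch of what i have in mind: the ordered iterators visit the elements in priority order, so two structurally different heaps holding the same values compare equal:

    #include <boost/heap/binomial_heap.hpp>
    #include <boost/heap/fibonacci_heap.hpp>
    #include <algorithm>
    #include <iostream>

    int main()
    {
        boost::heap::fibonacci_heap<int> a;
        boost::heap::binomial_heap<int>  b;

        const int values[] = {3, 1, 4, 1, 5};
        for (unsigned i = 0; i != sizeof(values) / sizeof(values[0]); ++i) {
            a.push(values[i]);
            b.push(values[i]);
        }

        // equivalence: same size and same elements in priority order,
        // regardless of the internal layout of either heap
        bool equivalent = a.size() == b.size()
            && std::equal(a.ordered_begin(), a.ordered_end(), b.ordered_begin());
        std::cout << std::boolalpha << equivalent << "\n";  // true
        return 0;
    }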
Do you think the library should be accepted as a Boost library?
No, the library should not be accepted unless:
- The current review either magically goes back on track and at least 3 other persons also submit a proper review, or another (mini?) review with proper participation is held later.
... i am a bit disappointed by the lack of reviews myself. given that this is the only review that actually contains a vote and the review period stopped yesterday, i would appreciate further feedback
- Either a convincing example where b_heap outperforms the other heaps is presented, or its use should be discouraged in the documentation.
http://article.gmane.org/gmane.comp.lib.boost.devel/220245
- The documentation says a bit more about the different heaps and their trade-offs (amortized time <-> worst case), and generally tries to give the user recommendations about which heap will likely best fit their application and which settings they should try (for example, that arity<4> is a good compromise between cache locality and compare count for a d_ary heap).
perhaps a `best practices' section would be helpful?
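such a section could, for example, contain snippets along these lines (illustrative only; the arity<4> recommendation is taken from your suggestion above, not from a measurement of mine):

    #include <boost/heap/d_ary_heap.hpp>

    // a 4-ary heap as a starting point: better cache locality than a
    // binary heap at the cost of a somewhat higher compare count.
    // mutable_<true> is only needed if handles/update are required,
    // since it adds an extra indirection
    typedef boost::heap::d_ary_heap<
        int,
        boost::heap::arity<4>,
        boost::heap::mutable_<true>
    > default_choice;

    int main()
    {
        default_choice h;
        default_choice::handle_type handle = h.push(42);
        h.update(handle, 23);
        return 0;
    }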
I used the negative form of "Yes, the library should be accepted under the following conditions" because I'm unhappy with how the review went. The announcement was sent out too late and without stating the dates of the review, and much of the initial discussion was about the Code Collaborator tool, but I haven't seen any "real" review, despite the fact that there seems to be interest in the library. There were some discussions on Code Collaborator, and the tool was actually not bad at pointing to concrete problems in the actual code. Tim quickly addressed these, but a good review would also include some discussion and actual opinions and suggestions.
well, i would appreciate more reviews myself ... however, if you are the only person who includes a vote and that vote is `no', boost.heap should be rejected, no matter whether the reason is `documentation', `code', `performance' or `review process'.
It is also unclear to me how well Boost.Heap performs compared to boost/graph/detail/d_ary_heap.hpp. I could have tried to include this heap in my test/benchmark as well, but as the review seemed to have silently died already, I wasn't motivated enough to do so.
iirc the bgl heap data structures are pretty specific (optimized for integers). thanks for taking the time to review boost.heap.

tim