
Tobias Schwinger wrote:
I am missing a way to parametrize how hash values are combined. This is especially desirable because the skeleton could then be exploited to implement double hashing for compound data (a common technique to avoid collisions by varying the hash function), and it can be useful in other situations (e.g. a small hash table in memory vs. a large one on disk).
I think if you want to combine hash values in a different manner, it's best to just write your own version of hash_combine. It could save a little code if we supplied a version of hash_range that works with different combining functions, but I don't think that would be very useful (it would require your combining function to have the same signature as ours, and you might want to do it differently).
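For illustration, a user-written combiner with a matching signature could look something like the following sketch (the multiplicative mix is arbitrary, chosen only to show a different combining step being plugged in):

    #include <cstddef>
    #include <boost/functional/hash.hpp>

    // Hypothetical user-defined combiner with the same signature as
    // boost::hash_combine. The mixing step is arbitrary, purely for
    // illustration of swapping in a different combining function.
    template <typename T>
    void my_hash_combine(std::size_t& seed, T const& v)
    {
        boost::hash<T> hasher;
        seed = seed * 131u + hasher(v);
    }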
- What is your evaluation of the documentation?
Good. Easily understandable.
Nitpicking: a hash table is not required to store its elements in buckets in general - this is a property of the most common way to implement hash tables.
Good point. I've been spending too much time looking at the draft standard for hash-based containers (which require buckets).
- What is your evaluation of the implementation?
Here are my notes from reading the source:
The implementation of hash_combine does not need conditional jumps, unlike a probably slightly slower variant which I've seen quite often. I know that variant works well empirically in practice, and I figure this one will too.
However, I believe that hashing speed is generally overrated and that there are numerous situations where it's worth trading some of it for fewer collisions (as memory access generally gives more room for performance optimization than CPU instructions do).
I do need to spend more time looking into alternative hashing algorithms for a future version.
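For reference, the combining step being discussed is roughly the following (quoted from memory, so the source is authoritative); it uses only an xor, an add and two shifts, with no branches:

    // Combining step of the kind used by hash_combine: xor, add and
    // shift only, no conditional jumps. Constants quoted from memory.
    template <class T>
    inline void hash_combine(std::size_t& seed, T const& v)
    {
        boost::hash<T> hasher;
        seed ^= hasher(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }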
One idea that immediately came to mind was some "syntactic sugar" to ease hashing of class members:
std::size_t hash_value(point const& p) { return (boost::hash_value = p.x, p.y, p.z); }
Note: it's just a spontaneous idea that I am not convinced of myself...
A hash function for boost::tuple would be a good way to do this, although something along those lines might be more efficient.
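With the current interface, hashing the members of a class like point would look something like this (combining each member into a running seed):

    #include <cstddef>
    #include <boost/functional/hash.hpp>

    struct point { int x, y, z; };

    // Combine each member's hash value into a seed.
    std::size_t hash_value(point const& p)
    {
        std::size_t seed = 0;
        boost::hash_combine(seed, p.x);
        boost::hash_combine(seed, p.y);
        boost::hash_combine(seed, p.z);
        return seed;
    }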
The for loop in hash_float_value is a nice candidate for unrolling via metaprogramming.
Yes. For anyone who hasn't looked at the code: it loops a fixed number of times, based on the accuracy of the float type. Compilers could unroll that loop themselves, but that's probably too much to expect.
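If it ever proved worthwhile, the unrolling could be forced with a little template recursion; a rough, generic sketch (the names here are hypothetical, not taken from the library):

    #include <cstddef>

    // Generic compile-time unrolling of a fixed iteration count via
    // template recursion. Hypothetical helper, not part of the library.
    template <std::size_t N>
    struct repeat
    {
        template <typename F>
        static void apply(F& f)
        {
            f();
            repeat<N - 1>::apply(f);
        }
    };

    template <>
    struct repeat<0>
    {
        template <typename F>
        static void apply(F&) {}
    };

    // e.g. repeat<Iterations>::apply(step) instead of a runtime loop,
    // where Iterations is derived from the float type's accuracy.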
- Do you think the library should be accepted as a Boost library?
Yes, though there should be some research into whether it is possible to increase its genericity, as mentioned in the first paragraph (I just saw the latest discussions, and it seems we're already on it).
Thanks for your review.

Daniel