
First of all, nice work! On headers: * boost::hash defined in <boost/functional/hash.hpp> Usually (when the number of top-level headers is small) we follow the rule that boost::X is defined in <boost/X.hpp>. This makes it much easier for users to remember what headers need to be included. I think that hash.hpp needs to be a top-level header. * hash/hash.hpp contains hash_range overloads for Range objects I don't recall this being discussed during the review. This introduces a dependency on the range headers. Given that std::pair is separated in its own header, this is a bit odd, especially when the range headers include <utility>. But what's more important, these overloads are unnecessary. The proper protocol to obtain the hash value of an object x is hash_value( x ), not hash_range( x ). Consider the case std::vector<X>. If X does not support hash_value, hashing the vector will fail, even in the case where hash_range could have been used for X. If a range object is not supported by hash_value, we just need to add a hash_value overload for it. Summary: these overloads need to be removed. * hash.hpp does not contain overloads for built-in arrays This is a case that I missed when I prepared the hash_value proposal. We need to add them. * boost::hash derived from std::unary_function This introduces a completely unnecessary dependency on <functional>. boost::hash only needs to define result_type and argument_type. * Header separation Reviewers have expressed concerns that including all standard container headers would incur a significant hit. Is this the case in practice? Currently, hash/hash.hpp results in 44247 lines, approx. 512K. The top-level hash.hpp results in 51096 lines, approx. 645K. In my opinion, this does not constitute an unacceptable increase. In addition, defining all std-related overloads before hash_combine will eliminate the need for the call_hash workaround. * The pointer overload for hash_value should probably be changed to take a void* The reason for this is that on some platforms an int* and a void* to the same object may have different representations. On the other hand, this will disable the overload for function pointers. Maybe we need to leave the signature as-is but cast object pointers to void cv * internally. Or maybe the current implementation is good enough. Base* and Derived* to the same object may generally hash to different values anyway. On tests: * Tests do not test the specification The tests are legitimate, but overly general. They test only the broad hash<>::operator() requirement, not a specific value. But our specification states precisely what hash value is to be expected (pointers and floats aside). hash<X>( x ) needs to be tested against hash_value( x ), and hash_value( x ) needs to be tested against its expected value. * custom_test The portable definition of hash_value for custom objects is something along these lines: namespace test { struct custom { #ifndef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP friend inline size_t hash_value { return ... } #else size_t hash() const { return ... } #endif }; } // namespace test #ifdef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP namespace boost { size_t hash_value( test::custom const & x ) { return x.hash(); } } // namespace boost #endif This is a bit more verbose but demonstrates the recommended approach for the two major cases, with or without argument dependent lookup. Thanks for reading this far, -- Peter Dimov http://www.pdimov.com

Peter Dimov wrote:
First of all, nice work!
Thanks. I was planning on asking for a post-review once I'd finished the documentation. I rushed the headers and tests into CVS early so that MultiIndex can use them, and I can get regression test results - otherwise I won't make the 1.33 release schedule. I mostly agree, or feel indifferent, about everything you wrote. So only a couple of quick responses.
* hash/hash.hpp contains hash_range overloads for Range objects
I don't recall this being discussed during the review.
It was, but very briefly. (At least, it somehow made its way into my notes).
This introduces a dependency on the range headers. Given that std::pair is separated in its own header, this is a bit odd, especially when the range headers include <utility>.
But what's more important, these overloads are unnecessary. The proper protocol to obtain the hash value of an object x is hash_value( x ), not hash_range( x ). Consider the case std::vector<X>. If X does not support hash_value, hashing the vector will fail, even in the case where hash_range could have been used for X.
If a range object is not supported by hash_value, we just need to add a hash_value overload for it.
Summary: these overloads need to be removed.
Well, I'll do this, unless anyone comes up with a good reason not to.
* boost::hash derived from std::unary_function
This introduces a completely unnecessary dependency on <functional>. boost::hash only needs to define result_type and argument_type.
I thought this was required by TR1. I'll happily remove it if isn't. Daniel

Daniel James wrote:
Peter Dimov wrote:
* boost::hash derived from std::unary_function
This introduces a completely unnecessary dependency on <functional>. boost::hash only needs to define result_type and argument_type.
I thought this was required by TR1. I'll happily remove it if isn't.
It's required; I just don't agree with the requirement, it's overly strict. I prefer to view the standard wording as a shorthand notation for the typedefs. User code should never depend on the inheritance relationship between a function object and std::*nary_function... and neither should our tests. (There are other defects in TR1 along the same lines, but it's far too late now to fix them.)

* boost::hash derived from std::unary_function
This introduces a completely unnecessary dependency on <functional>. boost::hash only needs to define result_type and argument_type.
As far as the TR is concerned it's a requirement for std::tr1::hash to derive from std::unary_function, there's no leeway permitted. John.
participants (3)
-
Daniel James
-
John Maddock
-
Peter Dimov