Comments on functional/hash

2 Apr 2005

      First of all, nice work!

On headers:

* boost::hash defined in <boost/functional/hash.hpp>

Usually (when the number of top-level headers is small) we follow the rule 
that boost::X is defined in <boost/X.hpp>. This makes it much easier for 
users to remember what headers need to be included.

I think that hash.hpp needs to be a top-level header.

* hash/hash.hpp contains hash_range overloads for Range objects

I don't recall this being discussed during the review.

This introduces a dependency on the range headers. Given that std::pair is 
separated in its own header, this is a bit odd, especially when the range 
headers include <utility>.

But what's more important, these overloads are unnecessary. The proper 
protocol to obtain the hash value of an object x is hash_value( x ), not 
hash_range( x ). Consider the case std::vector<X>. If X does not support 
hash_value, hashing the vector will fail, even in the case where hash_range 
could have been used for X.

If a range object is not supported by hash_value, we just need to add a 
hash_value overload for it.

Summary: these overloads need to be removed.

* hash.hpp does not contain overloads for built-in arrays

This is a case that I missed when I prepared the hash_value proposal. We 
need to add them.

* boost::hash derived from std::unary_function

This introduces a completely unnecessary dependency on <functional>. 
boost::hash only needs to define result_type and argument_type.

* Header separation

Reviewers have expressed concerns that including all standard container 
headers would incur a significant hit. Is this the case in practice?

Currently, hash/hash.hpp results in 44247 lines, approx. 512K. The top-level 
hash.hpp results in 51096 lines, approx. 645K. In my opinion, this does not 
constitute an unacceptable increase.

In addition, defining all std-related overloads before hash_combine will 
eliminate the need for the call_hash workaround.

* The pointer overload for hash_value should probably be changed to take a 
void*

The reason for this is that on some platforms an int* and a void* to the 
same object may have different representations. On the other hand, this will 
disable the overload for function pointers. Maybe we need to leave the 
signature as-is but cast object pointers to void cv * internally. Or maybe 
the current implementation is good enough. Base* and Derived* to the same 
object may generally hash to different values anyway.

On tests:

* Tests do not test the specification

The tests are legitimate, but overly general. They test only the broad 
hash<>::operator() requirement, not a specific value. But our specification 
states precisely what hash value is to be expected (pointers and floats 
aside).

hash<X>( x ) needs to be tested against hash_value( x ), and hash_value( x ) 
needs to be tested against its expected value.

* custom_test

The portable definition of hash_value for custom objects is something along 
these lines:

namespace test
{

struct custom
{
#ifndef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP

    friend inline size_t hash_value { return ... }

#else

    size_t hash() const { return ... }

#endif
};

} // namespace test

#ifdef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP

namespace boost
{

size_t hash_value( test::custom const & x )
{
    return x.hash();
}

} // namespace boost

#endif

This is a bit more verbose but demonstrates the recommended approach for the 
two major cases, with or without argument dependent lookup.

Thanks for reading this far,

--
Peter Dimov
http://www.pdimov.com

Peter Dimov

Daniel James

Peter Dimov

John Maddock

tags

participants (3)