
David Abrahams wrote:
Alberto Barbati <abarbati@iaanus.com> writes:
Hi,
I have a doubt about the proposed implementation of hash_value() for pointers, which is:
template <class T> inline std::size_t hash_value(T* v) { return static_cast<std::size_t>(v - (T*)0); }
this code calls for undefined behaviour according to ยง5.7/6, each time v is a non-null pointer.
Furthermore, you can't subtract addresses that aren't in the same array. 0 isn't in the same array as anything.
So we agree that this code has to be fixed in some way. I don't think Boost could accept the code as it is.
Unfortunately, there's one more thing... even if p == q, reinterpret_cast<uintmax_t>(p) is not guaranteed to be equal to reinterpret_cast<uintmax_t>(q) on implementations that have multiple internal representations for the same address. I don't think much can be done (in a portable way) for such platforms.
Once you're into reinterpret_cast you've left the land of strict portability anyway.
I agree with that literally, but I don't fully understand what do you mean with such remark, so I will re-formulate my thought hoping to make it clearer. Probably we are saying the same thing. Apart from the very special case described above, using reinterpret_cast<> as I describe will provide a viable hash function. The fact that its values are implementation-defined is irrelevant, I think (but... see below). However, *in that special case above*, the basic requirement of hash functions "equal arguments yield the same result" is not satisfied so even reinterpret_cast<> does not provide a viable hash function. As the standard does not provide any other portable mean to map generic pointers to integers, there's *no way* we can obtain a hash for pointers in this case. Ah! And there's one more case in which we are helpless, although I'm curious to see an implementation that exploits such latitude: the wording used in the standard does not guarantee that the mapping from pointers to integers always produce the same value over time. In other words: char* p = new char; size_t i1 = reinterpret_cast<size_t>(p); size_t i2 = reinterpret_cast<size_t>(p); assert(i1 == i2); // ??? char* q1 = reinterpret_cast<char*>(i1); char* q2 = reinterpret_cast<char*>(i2); assert(p == q1 && q1 == q2); The standard guarantees that the second assert passes, but says nothing about the first assert. Coming back to the hash library, I think we have only three choices: 1) use reinterpret_cast<> as I proposed, but documenting that it will produce a viable hash function only if the underlying implementation satisfy certain requirements (each specific implementation /should/ document the behaviour of reinterpret_cast anyway, so the user may able to check) 2) provide the hash function (with reintepret_cast<>) only for those platforms that we are able to check in advance that will satisfy those requirements and make any use of such function produce a static assert for all other platforms 3) do not provide a hash function for pointers at all I would vote for 1, but 2 and 3 also make sense. Is there a fourth choice? Alberto