Re: [boost] Re: [boost.tr1?] request for a hash<> implementation

----- Mensaje original ----- De: Daniel James <daniel@calamity.org.uk> Fecha: Sábado, Febrero 5, 2005 5:36 pm Asunto: [boost] Re: [boost.tr1?] request for a hash<> implementation
John Maddock wrote:
Anyone else want to throw in their thoughts? What's the timescale on the unordered container development, how much is there left to do?
I'll upload a new version to the file vault later today, or maybe tomorrow, which should be compliant to the draft. I haven't written any documentation yet, I guess I'll probably just write an introduction and refer to the draft for reference (as Beman Dawes suggested), if that's okay.
There are a few areas which could do with some improvement, such as the hash functions, which are pretty basic. But I guess they can be left for later if you want to get it in soon.
Hi Daniel, What I'm interested in in the short term is the hash<> functor alone. Of course unordered containers are much more interesting, but this will have to undergo the usual review process, I guess. Do you think we can put your hash<> implementation under boost/functional/hash.hpp and have some short docs for it? Are you planning to improve the hashing algorithms (in particular for strings, which I guess it's the most improvable part)? If I could have this in time for Boost 1.33 I'd be extremely grateful, since Boost.MultiIndex hashed indices (practically finished now) need a hash<>, and I don't think it is convenient that each library ships with its own version. Thank you, Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

JOAQUIN LOPEZ MU?Z wrote:
Do you think we can put your hash<> implementation under boost/functional/hash.hpp and have some short docs for it? Are you planning to improve the hashing algorithms (in particular for strings, which I guess it's the most improvable part)?
To be honest, I've barely looked at the hash functions so far. All I've done is change them to work on compilers without partial specialization. I'm planning to make improvements, but I'll have to do some research into the possible alogorithms first. If anyone has an opinions on this, I'll be very interested.
If I could have this in time for Boost 1.33 I'd be extremely grateful, since Boost.MultiIndex hashed indices (practically finished now) need a hash<>, and I don't think it is convenient that each library ships with its own version.
Certainly. I'll get something done this week. Daniel

Daniel James wrote:
JOAQUIN LOPEZ MU?Z wrote:
Do you think we can put your hash<> implementation under boost/functional/hash.hpp and have some short docs for it? Are you planning to improve the hashing algorithms (in particular for strings, which I guess it's the most improvable part)?
To be honest, I've barely looked at the hash functions so far. All I've done is change them to work on compilers without partial specialization. I'm planning to make improvements, but I'll have to do some research into the possible alogorithms first. If anyone has an opinions on this, I'll be very interested.
Take a look at http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1756.pdf issue 6.18.

Peter Dimov wrote:
Take a look at
http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1756.pdf
issue 6.18.
It's even more important to provide this functionality in boost as the LWG obviously decided not yet to include these (IMO crucial) extensions into TR1: Rationale: The LWG agreed that there are usability problems with the current interface, but thought that this proposal was too large, and raised too many design issues, for it to be accepted for the TR. A library of hashing primitives might be a good candidate for TR2. Stefan

Stefan Slapeta wrote:
Peter Dimov wrote:
Take a look at
http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2005/n1756.pdf
issue 6.18.
It's even more important to provide this functionality in boost as the LWG obviously decided not yet to include these (IMO crucial) extensions into TR1:
I agree, but should it be available in Boost.Tr1? I guess that I could provide two different versions: boost::hash, which implements Peter's version, and std::tr1::hash, which is a strict implementation of the standard, but uses boost::hash_value as an implementation detail (so there isn't much code duplication). Although if I do that, I should probably also provide two different versions of the unordered associative container - with different default hash functions. But that's pretty easy to do, just a bit messy. Daniel

Daniel James wrote:
I agree, but should it be available in Boost.Tr1? I guess that I could provide two different versions: boost::hash, which implements Peter's version, and std::tr1::hash, which is a strict implementation of the standard, but uses boost::hash_value as an implementation detail (so there isn't much code duplication).
The proposed enhancement of hash<> is supposed to be a conforming extension and something that I feel needs to be part of TR1.x. On the one hand, one might say that using our hash<> would make it hard for people to migrate to std::tr1, although boost::hash can still be used. But on the other hand, if we are to propose (again) an enhancement to tr1::hash, we need to have tested it in the field, or it wouldn't be accepted (again). Boost has always been about extensions.

The proposed enhancement of hash<> is supposed to be a conforming extension and something that I feel needs to be part of TR1.x.
On the one hand, one might say that using our hash<> would make it hard for people to migrate to std::tr1, although boost::hash can still be used.
But on the other hand, if we are to propose (again) an enhancement to tr1::hash, we need to have tested it in the field, or it wouldn't be accepted (again). Boost has always been about extensions.
Agreed, as long as we are testing proposed extensions, then making them part of boost::hash (and hence std::tr1::hash when we get a TR1 implementation up and running) is fine. If we want a library of string hash functions though, we'd better give them different names - there are after all many different ways of hashing a string. John.

John Maddock wrote:
If we want a library of string hash functions though, we'd better give them different names - there are after all many different ways of hashing a string.
There aren't _that_ many good ways to hash a sequence. If you deviate from *5/*31, hashpjw, and the shift/xor family there is substantial risk that the new and improved hash function leads to pathological cases where a hash_map loses to a map with a constant factor of three or more. For contiguous 8 bit characters there are fast alternatives that process four characters at a time but they don't generalize to arbitrary sequences. Cryptographic hash functions are obviously best from a quality standpoint, but they are too slow for a hash_map. All IMO, of course. :-)

JOAQUIN LOPEZ MU?Z wrote:
What I'm interested in in the short term is the hash<> functor alone. Of course unordered containers are much more interesting, but this will have to undergo the usual review process, I guess. Do you think we can put your hash<> implementation under boost/functional/hash.hpp and have some short docs for it?
I've just uploaded a new version. I've rewritten the hash functions based on Peter's design. I'll write some documentation and some more tests soon. Daniel
participants (5)
-
Daniel James
-
JOAQUIN LOPEZ MU?Z
-
John Maddock
-
Peter Dimov
-
Stefan Slapeta