
Daniel James wrote:
Peter Dimov wrote:
Thorsten Ottosen wrote:
Peter, I don't like that hashing a string is suboptimal. Likewise, I would like hashing of sub_range<string> or sub_range<vector<char>> to be really fast.
See my other post. When talking about hash functions, you really need to have a clear understanding of what "suboptimal" means. I don't claim to be an expert, but at least I have tried to familiarize myself with the research done in this area. My advice is: if you are going to mess with a (standard) hash function, you had better (a) understand what you are doing, and (b) have an extensive test suite that benchmarks an unordered_map with a wide range of real-world key sets.
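As a minimal sketch of the kind of benchmark meant in (b), the sort of thing to time is whole container operations driven by a real-world key set, not the hash function by itself. This uses std::unordered_map as a stand-in for the container under discussion, and "words.txt" is a placeholder for whatever key set you actually care about.

// Sketch: benchmark lookups over a real-world key set, not the raw hash function.
#include <chrono>
#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

int main()
{
    // Load a real-world key set, one key per line ("words.txt" is a placeholder).
    std::vector<std::string> keys;
    std::ifstream in("words.txt");
    for (std::string line; std::getline(in, line); )
        keys.push_back(line);

    std::unordered_map<std::string, int> map;
    for (std::size_t i = 0; i < keys.size(); ++i)
        map[keys[i]] = static_cast<int>(i);

    // Time complete lookups: hashing, bucket selection, chain traversal and
    // the final equality comparison all contribute to the measured cost.
    std::size_t hits = 0;
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (int pass = 0; pass < 100; ++pass)
        for (std::size_t i = 0; i < keys.size(); ++i)
            hits += map.count(keys[i]);
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();

    std::cout << hits << " hits in "
              << std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count()
              << " ms\n";
}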
Well, these two pages describe algorithms for hashing strings which are apparently faster, with good (or perhaps better?) quality (I haven't tested them yet):
http://burtleburtle.net/bob/hash/doobs.html
http://www.azillionmonkeys.com/qed/hash.html
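For comparison, the usual baseline these pages measure against is a simple byte-at-a-time hash. The sketch below is a 32-bit FNV-1a style function, not the Jenkins or Hsieh algorithms from the pages above; their speed advantage comes largely from processing the input in larger chunks.

// Simple byte-at-a-time string hash (32-bit FNV-1a constants), baseline only.
#include <cstddef>
#include <string>

inline std::size_t fnv1a_hash(std::string const& s)
{
    std::size_t h = 2166136261u;              // 32-bit FNV offset basis
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        h ^= static_cast<unsigned char>(s[i]);
        h *= 16777619u;                       // 32-bit FNV prime
    }
    return h;
}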
These pages are good. But as I said, benchmarking hash functions in isolation is pointless. Real programs do not compute the hash value of a random 256 byte buffer five million times. They usually need to do something with the hash value. This something typically involves a division by a prime to obtain a modulus, several pointer chases, and at least one equality comparison (which on a 256 byte buffer will have considerable impact).
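To make that concrete, here is a rough sketch of what a single lookup actually does. The names (node, find_in_table) are illustrative, not any particular implementation, and std::hash stands in for whichever hash function is being evaluated.

// Sketch of the full cost of one lookup: hash, mod by a prime bucket count,
// pointer chasing down the bucket chain, and equality comparisons on the key.
#include <cstddef>
#include <functional>
#include <string>

struct node
{
    std::string key;
    node* next;
};

node* find_in_table(node* const* buckets, std::size_t bucket_count /* prime */,
                    std::string const& key)
{
    std::size_t h = std::hash<std::string>()(key); // whichever hash is under test
    node* p = buckets[h % bucket_count];           // division by a prime
    for (; p != 0; p = p->next)                    // pointer chasing down the chain
        if (p->key == key)                         // equality comparison on the full key
            return p;
    return 0;
}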