Re: [Boost-users] boost::hash and string conflicts

7 Apr 2011

On Thu, Apr 7, 2011 at 10:08, Erik Scorelle <escorelle.work@gmail.com> wrote:
> ...
> We have been using boost hash to hash filenames, but have found with some of
> our user data that certain strings will produce the same hash code
> ("0012g6" and "0012fu" for example).  Is there a recommended way to predict
> or resolve these sorts of conflicts?

Having some collisions is the expected behaviour of a
(non-cryptographic) hash function.  With a birthday search you can
easily find thousands of examples.
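
To illustrate what a birthday search looks like, here is a minimal
sketch.  Everything in it is an assumption made for the example:
toy_hash is a made-up shift-and-xor hash truncated to 16 bits (it is
not boost::hash and is not claimed to produce the same values), and
find_collision and the "fileN" candidate names are hypothetical.  The
idea is only to hash candidates one at a time, remember which string
produced each value, and stop at the first repeat.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Toy hash for illustration only, NOT boost::hash: one shift-and-xor
// mixing step per character, truncated to 16 bits so that collisions
// show up quickly.
std::uint16_t toy_hash(const std::string& s) {
    std::uint32_t seed = 0;
    for (unsigned char c : s) {
        seed ^= c + 0x9e3779b9u + (seed << 6) + (seed >> 2);
    }
    return static_cast<std::uint16_t>(seed & 0xffffu);
}

// Birthday search: hash distinct candidates one by one, remember which
// string produced each hash value, and stop when a value repeats.
std::optional<std::pair<std::string, std::string>>
find_collision(std::uint32_t max_candidates) {
    std::unordered_map<std::uint16_t, std::string> seen;
    for (std::uint32_t i = 0; i < max_candidates; ++i) {
        std::string candidate = "file" + std::to_string(i);
        auto [it, inserted] = seen.emplace(toy_hash(candidate), candidate);
        if (!inserted) {
            // Same hash value, different string: a collision.
            assert(it->second != candidate);
            return std::make_pair(it->second, candidate);
        }
    }
    return std::nullopt;  // no repeat within the candidate budget
}
```

With only 65536 possible hash values, any budget above 65536 distinct
candidates must produce a collision by the pigeonhole principle, and a
repeat is typically expected far sooner.  A wider hash needs
proportionally more candidates, but the search itself is unchanged.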

I'm not sure what you mean by "resolve".  Normally, code using hashes
expects collisions, and uses a full equality operator to resolve them.
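
As a sketch of that point, the example below forces every key to
collide and still gets correct lookups.  ConstantHash, FileTable,
make_table and lookup are hypothetical names invented for the
illustration, and std::unordered_map is used only to keep the example
self-contained.  The container compares keys with operator== after
hashing, so the two filenames from the original question stay distinct
entries no matter what the hash function returns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Deliberately terrible hasher (hypothetical, for illustration): every
// key gets the same hash value, so every pair of keys collides.
struct ConstantHash {
    std::size_t operator()(const std::string&) const { return 42; }
};

using FileTable = std::unordered_map<std::string, int, ConstantHash>;

// Insert two keys that are guaranteed to share a hash value.
FileTable make_table() {
    FileTable table;
    table["0012g6"] = 1;
    table["0012fu"] = 2;
    // Equality comparison keeps the colliding keys apart.
    assert(table.size() == 2);
    return table;
}

// Returns the stored value, or -1 if the name is not present.
int lookup(const FileTable& table, const std::string& name) {
    auto it = table.find(name);
    return it == table.end() ? -1 : it->second;
}
```

The cost of a bad hash here is speed, not correctness: with every key
in one bucket, each lookup degrades to a linear scan of equality
comparisons.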

If you cannot accept any collisions, you could use perfect hashing
(libraries like <http://cmph.sourceforge.net/> or
<http://www.gnu.org/software/gperf/gperf.html>) or use a cryptographic
hash (I have a usable, but WIP library at
<http://svn.boost.org/svn/boost/sandbox/hash/>).

~ Scott

Scott McMurray