Ovanes Markarian
On Thu, April 26, 2007 12:29, Joaquín Mª López Muñoz wrote:
Ovanes Markarian writes:
[...]
Sorry for the long explanation and then the short idea, hope this is of interest to others...
FWIW, I think your explanation of how boost::hash works with strings, char *s and char []s is correct. [...]
This puzzles me a lot: Given that your types_map container is indexed on a std::string, things should be the other way around: it is the "find(name)" version that should work AFAICS. Could you please double-check?
I double-checked it and both give the correct hash value. I think this is a string-dependent issue, where the string uses the COW idiom to improve performance; Herb Sutter wrote about it in his Guru of the Week (http://www.gotw.ca/gotw/043.htm). So if I use a const char* to initialize the string, the string probably keeps using that buffer for as long as possible, until the string is modified. But if this is so, there is more or less no reliable way to hash strings, since a const char* can be implicitly converted to std::string, used afterwards as a hash key, and return a wrong hash result. Unfortunately I cannot step into the string constructors in MSVC 8.0 to see how they are really implemented.
I think this has nothing to do with COW strings: even if this optimization is in effect, hashing a pointer will not, except by coincidence, yield the same value as hashing the associated contents, so if the index is based on std::strings (COW-based or not) you cannot expect to locate a given string str by using the hash value of a pointer to str's contents. Your own experiments with boost::hash described above must have shown you precisely this.

When I said before that the std::string-based create_type must work and that the one based on const char * must fail, I made a mistake: I've examined the issue more carefully and realized that *both* versions work, albeit not for COW-related reasons. When you define a std::string-keyed index, that index internally stores an object of type boost::hash<std::string>; let's call it h. Now, when you issue a call like

    types_.get<hash>().find(name.c_str());

the internal code of B.MI calculates the hash value of the argument you've passed by invoking

    h(arg); // arg is the argument passed, i.e. name.c_str()

As this h is of type boost::hash<std::string>, its operator() accepts arguments of type std::string, and we're passing a const char*. Given that const char* is implicitly convertible to std::string, a temporary string is created automatically on the fly with the same contents as those pointed to by arg, so the correct hash value is computed and no pointer is actually hashed.

I'm sorry for having wrongly stated that passing a const char* must not work. It does work, although the way it does is admittedly a little convoluted. So, if we agree on this, there's a little mystery left: since after double-checking we both agree that the two versions of create_type (based on std::string and on const char *) should work, what was your original problem about then?
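To make the conversion argument concrete, here is a minimal sketch. It assumes std::hash<std::string> and std::unordered_set as stand-ins for boost::hash<std::string> and the B.MI hashed index, so that it compiles without Boost; hash_via_index_hasher and lookup_by_c_str_works are hypothetical names made up for the illustration and are not part of either library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Stand-in for the hasher stored inside a std::string-keyed index.
// Assumption: std::hash<std::string> replaces boost::hash<std::string> here.
typedef std::hash<std::string> string_hasher;

// Hypothetical helper mimicking a lookup that forwards its argument,
// whatever its type, to the stored hasher as h(arg).
template <typename Arg>
std::size_t hash_via_index_hasher(const string_hasher& h, const Arg& arg)
{
    // When arg is a const char* (or a char array), a temporary std::string
    // is constructed implicitly, so the characters are hashed, not the pointer.
    return h(arg);
}

// Hypothetical demo: looks a string up through pointers to two different
// buffers that hold the same characters.
bool lookup_by_c_str_works()
{
    const std::string name("widget");
    char copy[] = "widget"; // separate buffer, same contents

    string_hasher h;
    const std::size_t from_string  = hash_via_index_hasher(h, name);
    const std::size_t from_pointer = hash_via_index_hasher(h, name.c_str());
    const std::size_t from_copy    = hash_via_index_hasher(h, copy);

    // Equal contents give equal hash values, whichever buffer they came from.
    assert(from_string == from_pointer);
    assert(from_string == from_copy);

    // The same implicit conversion makes container lookups succeed.
    std::unordered_set<std::string> types;
    types.insert(name);
    return types.find(name.c_str()) != types.end()
        && types.find(copy) != types.end();
}
```

Calling lookup_by_c_str_works() should return true whether or not the string implementation uses COW, because the hasher only ever sees a std::string built from the characters.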
Hashing strings is not a trivial issue, I think. Thanks for your time.
If there's something still unclear about the above explanation, please tell me so.

Best,

Joaquín M López Muñoz
Telefónica, Investigación y Desarrollo