
On Thu, April 26, 2007 12:29, Joaquín Mª López Muñoz wrote:
OK, what's happenning is the following: boost::hash<std::string>()(str) is not equivalent to boost<const char*>()(str.c_str()); the latter hashes the *pointer*, not the contents pointed to. So, when you write
your_container.find(str.c_str())
Well, looking through the source of boost::hash I came up with a question which of the hash_value overload functions get called, especially when there is are functions: template <class T> std::size_t hash_value(T* const&); template <class Ch, class A> std::size_t hash_value(std::basic_string<Ch, std::BOOST_HASH_CHAR_TRAITS<Ch>, A> const&); I considered string the string ctor sring(const Ch*, allocator_type& = allocator_type()) is not marked explicit and compiler could prefer this match. Debugging through the boost::hash I found the following: Case 1: char const* p = "Sint8"; size_t val0 = boost::hash_value(p); hash_value(T* const&) is called and it hashes the pointer size_t val1 = boost::hash_value("Sint8"); hash_value(T (&array)[N]) is called and it results in hash_range function call std::string s("Sint8"); size_t val2 = boost::hash_value(s); hash_value(std::string const&) is called and it also results in hash_range function call Despite the similar hash_range call hash_values are different. hash_range iterates through the range of elements by applying a hash_combine function to the previous result (seed) and the current element. Here the source of hash_range and hash_combine function: template <class T> inline void hash_combine(std::size_t& seed, T const& v) { boost::hash<T> hasher; seed ^= hasher(v) + 0x9e3779b9 + (seed<<6) + (seed>>2); } template <class It> inline std::size_t hash_range(It first, It last) { std::size_t seed = 0; for(; first != last; ++first) { hash_combine(seed, *first); } return seed; } Despite the fact, that hash_range is overloaded since I have different iterator types (string::iterator and array pointer) type of parameter v in hash_combine is in both cases const char&. And the debugger shows that hasher from hash_combine is in both cases of type boost::hash<char>. Therefore I don't understand how hash_combine can produce different results at this point. But I think I got it!!! The terminating null character is also hashed. Sorry for the long explanation and then the short idea, hope this is of iterest to others...
B.MI assumes that what you pass (a const char *) is equivalent hash- and equivalence-wise to the stored keys (std::strings), which is not true (because of the hash, equivalence is indeed interoperable as std::string provides the required overloads for operator== etc.)
Returning to your original problem, you said that this works:
inline some_type_ptr create_type(std::string const& name)const { types_map::index<hash>::type::const_iterator i =types_.get<hash>().find(name.c_str()); //... }
but this does not:
inline some_type_ptr create_type(std::string const& name)const { types_map::index<hash>::type::const_iterator i =types_.get<hash>().find(name); //... }
This puzzles me a lot: Given that your types_map container is indexed on a std::string, things should be the other way around: it is the "find(name)" version that should work AFAICS. Could you please double-check?
I double checked it and both give the correct hash value. I think this is string dependent issue, where the string uses COW idiom to save performance, Herb Sutter wrote about it in his Guru of the Week (http://www.gotw.ca/gotw/043.htm). Therefore if I use const char* to initialize the string probably it is used in the string as long as possible until the string is not modified. But if this is so, there is more or less no reliable way to hash strings since these can be implicitly converted from const char* to std::string and afterwards used as a hash key and return a wrong hash result. Unfortunately I cannot step inside of the string constructors in MSVC 8.0 to see how these are really implemented. This is not a trivial issue to hash strings I think. Thanks for your time. With Kind Regards, Ovanes Markarian