[Unordered] [Multi-Index] Set lookup from something else than key_type

We currently have a unordered_map< Uid, shared_ptr<Identity> >, where Identity already contains the Uid. So I thought I could avoid the duplicated Uid by using an unordered_set< shared_ptr<Identity>, MyHash, MyPred > to hash/compare by the internal Uid, but unlike a Multi-Index hashed_index [1], unordered doesn't seem to provide: "lookup operations accepting search keys different from the key_type of the index, which is a specially useful facility when key_type object are expensive to create." Any reason why unordered cannot allow similar "compatible key" lookups? What kind of overhead is there (memory use, runtime), if any, in using Multi-Index as opposed to unordered_set in this scenario? Using Multi-Index yields a more complex type and slower compiles it seems due to its meta-programming uses I guess. Thanks for any insight. --DD [1] http://www.boost.org/doc/libs/1_40_0/libs/multi_index/doc/tutorial/basics.ht...

Dominique Devienne <ddevienne <at> gmail.com> writes:
We currently have a unordered_map< Uid, shared_ptr<Identity> >, where Identity already contains the Uid.
So I thought I could avoid the duplicated Uid by using an unordered_set< shared_ptr<Identity>, MyHash, MyPred > to hash/compare by the internal Uid, but unlike a Multi-Index hashed_index [1],unordered doesn't seem to provide:
"lookup operations accepting search keys different from the key_type of the index, which is a specially useful facility when key_type object are expensive to create."
Any reason why unordered cannot allow similar "compatible key" lookups?
There is no technical reason why Boost.Unordered couldn't provide the compatible key functionality. It just happens not to be a feature condidered in the standard, but if you convince the author to add it as an extension implementing it should be trivial.
What kind of overhead is there (memory use, runtime), if any, in using Multi-Index as opposed to unordered_set in this scenario?
No memory overhead, both data structures are equivalent in terms of their internal data structures (for the unique keys case). Speedwise both they should also perform approximately the same, my advice is that you do some profiling to make sure. As you say, Boost.MultiIndex is a heavier library and compile times will reflect that. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

2009/9/1 Joaquin M Lopez Munoz <joaquin@tid.es>:
Dominique Devienne <ddevienne <at> gmail.com> writes:
Any reason why unordered cannot allow similar "compatible key" lookups?
There is no technical reason why Boost.Unordered couldn't provide the compatible key functionality. It just happens not to be a feature condidered in the standard, but if you convince the author to add it as an extension implementing it should be trivial.
AFAICT if the hash function doesn't support the type that you supply it will implicitly construct a key for every call to the hash function and equality predicate. Which is a problem. Daniel

Daniel James escribió:
2009/9/1 Joaquin M Lopez Munoz <joaquin@tid.es>:
Dominique Devienne <ddevienne <at> gmail.com> writes:
Any reason why unordered cannot allow similar "compatible key" lookups?
There is no technical reason why Boost.Unordered couldn't provide the compatible key functionality. It just happens not to be a feature condidered in the standard, but if you convince the author to add it as an extension implementing it should be trivial.
AFAICT if the hash function doesn't support the type that you supply it will implicitly construct a key for every call to the hash function and equality predicate. Which is a problem.
Yep, the hash function and equality predicate the user provides have to have overloads of operator() for the compatible key as well as the key, as shown in the example at: http://www.boost.org/libs/multi_index/doc/tutorial/indices.html#hash_lookup This way no key is unnecessarily constructed, which is the point of the compatible key idea. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

2009/9/2 <joaquin@tid.es>:
AFAICT if the hash function doesn't support the type that you supply it will implicitly construct a key for every call to the hash function and equality predicate. Which is a problem.
Yep, the hash function and equality predicate the user provides have to have overloads of operator() for the compatible key as well as the key, as shown in the example at:
http://www.boost.org/libs/multi_index/doc/tutorial/indices.html#hash_lookup
This way no key is unnecessarily constructed, which is the point of the compatible key idea.
I don't think I can require custom functions to comply with that. There'd have to be some kind of type trait to indicate support for the type. It might be possible to detect that using SFINAE. Daniel

Daniel James escribió:
2009/9/2 <joaquin@tid.es>:
Yep, the hash function and equality predicate the user provides have to have overloads of operator() for the compatible key as well as the key, as shown in the example at:
http://www.boost.org/libs/multi_index/doc/tutorial/indices.html#hash_lookup
This way no key is unnecessarily constructed, which is the point of the compatible key idea.
I don't think I can require custom functions to comply with that. There'd have to be some kind of type trait to indicate support for the type. It might be possible to detect that using SFINAE.
You don't need SFINAE in order to implement compatible keys, it's much simpler than that, let me sketch how it goes. Say for instance we want to extend your find member function to support compatible keys. Currently this member function looks like: const_iterator find(const key_type& k) const { // uses hash_function()(k), key_eq()(k,x), key_eq()(x,k) } Now you only have to templatize find (and the internal member functions this relies on) on the key type, while leaving the code itself untouched: template<typename CompatibleKey> const_iterator find(const CompatibleKey& k) const { // uses hash_function()(k), key_eq()(k,x), key_eq()(x,k) } When the user calls find with a regular key_type value, everything works as it used to. When the value passed is of some other type, the code will automatically use the correct overloads of hash_function and key_eq, provided the user has supplied the overloaded stuff as required. So, no need to distinguish compatible from regular keys and no need to play SFINAE. Additionally, Boost.MultiIndex has overloads of the lookup functions where the user can provide specific versions of the hash and predicate functors: template< typename CompatibleKey,typename CompatibleHash,typename CompatiblePred
iterator find( const CompatibleKey& k, const CompatibleHash& hash,const CompatiblePred& eq)const; Having this is also trivial: the code is basically the same as the regular find except that the objects passed are used instead of the internal hash and equality functors. You can then make the old lookup member functions forward to the new ones: template< typename CompatibleKey,typename CompatibleHash,typename CompatiblePred
iterator find( const CompatibleKey& k, const CompatibleHash& hash,const CompatiblePred& eq)const { // uses hash(k), eq(k,x), eq(x,k) } template<typename CompatibleKey> const_iterator find(const CompatibleKey& k) const { return find(k,hash_function(),key_eq()); } I hope the subject is a little clearer now. Please come back if it's not. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

2009/9/2 <joaquin@tid.es>:
You don't need SFINAE in order to implement compatible keys, it's much simpler than that, let me sketch how it goes. Say for instance we want to extend your find member function to support compatible keys. Currently this member function looks like:
const_iterator find(const key_type& k) const { // uses hash_function()(k), key_eq()(k,x), key_eq()(x,k) }
Now you only have to templatize find (and the internal member functions this relies on) on the key type, while leaving the code itself untouched:
template<typename CompatibleKey> const_iterator find(const CompatibleKey& k) const { // uses hash_function()(k), key_eq()(k,x), key_eq()(x,k) }
When the user calls find with a regular key_type value, everything works as it used to. When the value passed is of some other type, the code will automatically use the correct overloads of hash_function and key_eq, provided
But those overloads don't always exist. A simple example is: struct hash { std::size_t operator()(std::string const&) const; }; struct equals { bool operator()(std::string const&, std::string const&) const; }; boost::unordered_map<std::string, int, hash, equals> map; // .... map.find("char array"); A single string object should be created in this case. Daniel

Daniel James escribió:
2009/9/2 <joaquin@tid.es>:
You don't need SFINAE in order to implement compatible keys, it's much simpler than that, let me sketch how it goes. Say for instance we want to extend your find member function to support compatible keys. Currently this member function looks like:
const_iterator find(const key_type& k) const { // uses hash_function()(k), key_eq()(k,x), key_eq()(x,k) }
Now you only have to templatize find (and the internal member functions this relies on) on the key type, while leaving the code itself untouched:
template<typename CompatibleKey> const_iterator find(const CompatibleKey& k) const { // uses hash_function()(k), key_eq()(k,x), key_eq()(x,k) }
When the user calls find with a regular key_type value, everything works as it used to. When the value passed is of some other type, the code will automatically use the correct overloads of hash_function and key_eq, provided
But those overloads don't always exist. A simple example is:
struct hash { std::size_t operator()(std::string const&) const; };
struct equals { bool operator()(std::string const&, std::string const&) const; };
boost::unordered_map<std::string, int, hash, equals> map; // .... map.find("char array");
A single string object should be created in this case.
Correct, but it's the user's responsibility, not yours, to make sure the hash and equality functors used have the appropriate overloads. It's the user whose violating the member function usage interface --strictly speaking she is not violating it, as hash accepts a const char* through the construction of a temporary string, and similarly for equals, but of course this is not the intended behavior. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

2009/9/2 <joaquin@tid.es>:
Daniel James escribió:
But those overloads don't always exist. A simple example is:
struct hash { std::size_t operator()(std::string const&) const; };
struct equals { bool operator()(std::string const&, std::string const&) const; };
boost::unordered_map<std::string, int, hash, equals> map; // .... map.find("char array");
A single string object should be created in this case.
Correct, but it's the user's responsibility, not yours, to make sure the hash and equality functors used have the appropriate overloads.
I disagree, I'd see what I wrote as reasonable use of the STL.

Daniel James escribió:
2009/9/2 <joaquin@tid.es>:
Daniel James escribió:
But those overloads don't always exist. A simple example is:
struct hash { std::size_t operator()(std::string const&) const; };
struct equals { bool operator()(std::string const&, std::string const&) const; };
boost::unordered_map<std::string, int, hash, equals> map; // .... map.find("char array");
A single string object should be created in this case.
Correct, but it's the user's responsibility, not yours, to make sure the hash and equality functors used have the appropriate overloads.
I disagree, I'd see what I wrote as reasonable use of the STL.
Yep, I now see your POV and this is admittedly something people can disagree on. Another possibility is that Boost.Unordered provide const_iterator find(const key_type& k) const; template< typename CompatibleKey,typename CompatibleHash,typename CompatiblePred
iterator find( const CompatibleKey& k, const CompatibleHash& hash,const CompatiblePred& eq)const; which is a pure extension without backwards compatibility problems. This is seemingly what Boost.Intrusive is doing. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

2009/9/2 <joaquin@tid.es>:
Yep, I now see your POV and this is admittedly something people can disagree on. Another possibility is that Boost.Unordered provide
const_iterator find(const key_type& k) const;
template< typename CompatibleKey,typename CompatibleHash,typename CompatiblePred > iterator find( const CompatibleKey& k, const CompatibleHash& hash,const CompatiblePred& eq)const;
which is a pure extension without backwards compatibility problems. This is seemingly what Boost.Intrusive is doing.
Yes, I think I'll add that to the new version which should be released in 1.41 or 1.42. Daniel

On Wed, Sep 2, 2009 at 7:23 AM, Daniel James<daniel_james@fmail.co.uk> wrote:
template< typename CompatibleKey,typename CompatibleHash,typename CompatiblePred > iterator find( const CompatibleKey& k, const CompatibleHash& hash,const CompatiblePred& eq)const;
which is a pure extension without backwards compatibility problems. This is seemingly what Boost.Intrusive is doing.
Yes, I think I'll add that to the new version which should be released in 1.41 or 1.42.
This would indeed cleanly solve my problem. FTR, I tend to lean on Joaquín's side of the argument. If the provided explicit Hash/Pred template parameters of unordered_set do provide overloads for compatible keys, then it's "by design" that find() should use them. If previous code that was relying on an implicit conversion to key_type starts using added overload on Pred/Hash, this simply means that the container designer made this choice explicitly, and I don't see why that a bad thing. After all the overloads were added for avoiding the key creation, which is exactly what I was trying to do here. In Joaquín's proposal, all clients using find() implicitly start using the optimized find() avoiding the key_type conversion once the container is modified and the client code is recompiled. No code necessary. That's a good thing to me. But that's just me ;) --DD

Dominique Devienne <ddevienne <at> gmail.com> writes:
FTR, I tend to lean on Joaquín's side of the argument. If the provided explicit Hash/Pred template parameters of unordered_set do provide overloads for compatible keys, then it's "by design" that find() should use them.
If previous code that was relying on an implicit conversion to key_type starts using added overload on Pred/Hash, this simply means that the container designer made this choice explicitly, and I don't see why that a bad thing. After all the overloads were added for avoiding the key creation, which is exactly what I was trying to do here.
The problem James points to is not the one you describe, but the following: imagine Hash and Pred are not overloaded and another_type is a type implicitly convertible to key_type (in the example given by James, key_type=std::string, another_type=const char*). Without compatible key extensions, the code another_type x=... cont.find(x); is equivalent to another_type x=... cont.find(key_type(x)); But if find is templatized to accept compatible keys, then cont.find(x); will not trigger the implicit conversion to key_type, instead, every internal expression involving invocations to the Hash and Pred internal objects with x as argument will trigger the conversion. The net effect is that we pass from one implicit conversion when invoking find() to multiple implicit conversions inside find() code. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

On Wed, Sep 2, 2009 at 2:17 PM, Joaquin M Lopez Munoz<joaquin@tid.es> wrote:
[...] The net effect is that we pass from one implicit conversion when invoking find() to multiple implicit conversions inside find() code.
Ah yes, that's unfortunate. Thanks for making that clear. A "simple" solution is to not overload, and rename the find() overload into find_compatible_key(another_type) to remove this possibility, while not having to provide explicitly the hash/pred to find(ckey,hash,pred): typedef boost::unordered_set< my_type, hash_by_uid<my_type>, compare_by_uid<my_type>
my_type_set;
my_type::uid_t some_uid = ...; const my_type_set& set = ...; my_type_set::iterator found1 = set.find( some_uid, my_type_set::hasher(), my_type_set::key_equal() ); my_type_set::iterator found2 = set.find_compatible_key(some_uid); assert(found1 == found2); The 1-arg find_compatible_key(compatible_key_t) assumes appropriate overloads for compatible_key_t in hash/pred, and delegates to the containers hash/pred and the 3-arg find(). Can even be a free template function in fact. --DD

Am Wednesday 02 September 2009 12:47:56 schrieb Daniel James:
But those overloads don't always exist. A simple example is:
struct hash { std::size_t operator()(std::string const&) const; };
struct equals { bool operator()(std::string const&, std::string const&) const; };
boost::unordered_map<std::string, int, hash, equals> map; // .... map.find("char array");
A single string object should be created in this case.
Boost.Intrusive implements this type of feature in its intrusive unordered set: http://www.boost.org/doc/libs/1_40_0/doc/html/boost/intrusive/unordered_set.... and I just recently had to use a unordered_map<key,mapped> instead of a unordered_set<mapped>, even though the key is obtainable from mapped, because mapped is not constructible from key.
participants (5)
-
Daniel James
-
Dominique Devienne
-
Joaquin M Lopez Munoz
-
joaquin@tid.es
-
Stefan Strasser