Hello,
I would like express my respect and congratulations for the provision
of boost 1.36 (which had an impressive short release cycle)!
One of the very promising new contributions is Boost.Unordered and I
I would like to make a comment on the provision of operator==/operator!=
as explained in
http://www.boost.org/doc/libs/1_36_0/doc/html/unordered/rationale.html#unord...
I totally agree that there exist valid and reasonable use-cases
for EqualityComparable for unordered containers. In fact many
unordered containers (usually named hash containers) exists
in one or the other language or user-defined library which
provide some form of "unordered" equality comparison, which
reflects the unordered nature of the container itself and
thus matches a typical user expectation.
Nevertheless there exists an unfortunate side-effect of the
current implementation of operator== and operator!=, which
should be mentioned in the documentation IMO. To keep a long
story short: Under some given conditions, EqualityComparable of
Boost.Unordered containers will violate the very basic "symmetry"
property of EqualityComparable. This can happen, if the equality
predicate key_eq() is a runtime-predicate and if the actual
comparison is done between two different runtime-comparators.
Note that associative containers - even though they also support
this kind of runtime/dynamic comparison - are *not* sensitive
to this unexpected behaviour. The simple reason is that the
implementation uses an asymmetric algorithm to realize the
concept.
The program provided at tghe end of this posting demonstrates
this kind of violation of the usual symmetric property of
EqualityComparable. The output should be:
s1 == s2: 1
s2 == s1: 0
I hope this comment is not understood as some form of destructive
criticism. AFAIK there does not exist any general valid algorithm
which can realize a pure *symmetric* unordered equality comparison
with an average linear complexitity.
My point is, that I strongly propose to mention this unexpected
property and/or to declare code as causing undefined behaviour
if it attempts to use EqualityComparable of unordered containers
if the predicate is dynamic and different between the compared
containers.
Greetings from Bremen,
Daniel
#include <cctype>
#include <iostream>
#include <ostream>
#include
2008/8/18 Daniel Krügler
I hope this comment is not understood as some form of destructive criticism. AFAIK there does not exist any general valid algorithm which can realize a pure *symmetric* unordered equality comparison with an average linear complexitity.
Not at all, it's a good point. I should have written more about the potential issues (and perhaps by extension thought about them more). I could have implemented this in a 'safer' way if I could compare the hash and equality functions, I could do that for function pointers, but not for function objects. Although with C++0x concepts, I think we might be able to detect if a function object if comparable. We could do the comparison in both directions which would at least be symmetric. Although it would have still have surprising results for situations like the one you described and I don't think people would accept the efficiency hit. Anyway, I'll add something to the reference documentation and the custom hash function/equality operator section to explain the pitfalls. thanks, Daniel
Daniel and David, Thank you very much for your replies! I use Daniel's posting as the combined reply channel: Daniel James wrote:
2008/8/18 Daniel Krügler
: I hope this comment is not understood as some form of destructive criticism. AFAIK there does not exist any general valid algorithm which can realize a pure *symmetric* unordered equality comparison with an average linear complexitity.
Not at all, it's a good point. I should have written more about the potential issues (and perhaps by extension thought about them more).
I could have implemented this in a 'safer' way if I could compare the hash and equality functions, I could do that for function pointers, but not for function objects. Although with C++0x concepts, I think we might be able to detect if a function object if comparable.
We could do the comparison in both directions which would at least be symmetric. Although it would have still have surprising results for situations like the one you described and I don't think people would accept the efficiency hit.
David Abrahams wrote:
I suggest requiring that the predicates themselves be equality comparable, and that equal predicates perform the same computation. Containers with unequal predicates would not be considered equal.
Yes, this proposed resolution has also be given by Pablo Halpern in a different mailing channnel. I totally agree that some tagging technique (or concepts in C++0x) would be good idea. I had some resistance concerning requiring EqualityComparable for non-empty predicates (This was his very nice idea: Distinguish empty from non-empty predicates and require EqualityComparable only for non-empty ones), because I thought it would be too intrusive. Maybe I was to strict in my position. The reason for my resistance is that in several scenarios an object type with overloaded operator() has *more* than one operator() overload and by requiring operator== this would have the effect that we don't know, for *which* overload of operator() this EqualityComparable concept would apply. If a yet-to-be-defined concept would be used that would also "encode" the argument types of the comparison, this would solve this particular problem [Note: One might argue that Allocators also provide operator==, but IMO it is a much rarer situation, that an allocator *also* implements a *different* allocator]. Daniel James wrote:
Anyway, I'll add something to the reference documentation and the custom hash function/equality operator section to explain the pitfalls.
IMO, this is an excellent idea. - Daniel
on Mon Aug 18 2008, Daniel Krügler
David Abrahams wrote:
I suggest requiring that the predicates themselves be equality comparable, and that equal predicates perform the same computation. Containers with unequal predicates would not be considered equal.
Yes, this proposed resolution has also be given by Pablo Halpern in a different mailing channnel. I totally agree that some tagging technique (or concepts in C++0x) would be good idea.
I don't see what tagging has to do with my suggestion, and concepts are certainly not applicable because we're discussing runtime tests.
I had some resistance concerning requiring EqualityComparable for non-empty predicates (This was his very nice idea: Distinguish empty from non-empty predicates and require EqualityComparable only for non-empty ones),
That's not a bad idea.
because I thought it would be too intrusive. Maybe I was to strict in my position. The reason for my resistance is that in several scenarios an object type with overloaded operator() has *more* than one operator() overload and by requiring operator== this would have the effect that we don't know, for *which* overload of operator() this EqualityComparable concept would apply.
An operator== that applied for only one of many overloads would be an abomination, especially if the class were empty. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
David Abrahams wrote:
on Mon Aug 18 2008, Daniel Krügler
wrote: David Abrahams wrote:
I suggest requiring that the predicates themselves be equality comparable, and that equal predicates perform the same computation. Containers with unequal predicates would not be considered equal.
Yes, this proposed resolution has also be given by Pablo Halpern in a different mailing channnel. I totally agree that some tagging technique (or concepts in C++0x) would be good idea.
I don't see what tagging has to do with my suggestion, and concepts are certainly not applicable because we're discussing runtime tests.
I apologize, my nomenclature was wrong. Technically I meant the application of a SFINAE mechanism with a corresponding compile-time predicate.
I had some resistance concerning requiring EqualityComparable for non-empty predicates (This was his very nice idea: Distinguish empty from non-empty predicates and require EqualityComparable only for non-empty ones),
That's not a bad idea.
because I thought it would be too intrusive. Maybe I was to strict in my position. The reason for my resistance is that in several scenarios an object type with overloaded operator() has *more* than one operator() overload and by requiring operator== this would have the effect that we don't know, for *which* overload of operator() this EqualityComparable concept would apply.
An operator== that applied for only one of many overloads would be an abomination, especially if the class were empty.
Right, if the class is empty, all operator() overloads
are stateless, but this is not necessarily the case,
if at least one of them is statefull (and therefore the
complete class type is not empty). There are still good
chances, that some (or even all) of them are still stateless
(because they are not influenced by that state).
Or to express the problem in different words: In general
there exists a one-to-many relation between a class
type and it's operator() overloads [acting as predicates],
so the class-type alone is not sufficient to define
an equality of *one* special operator() overload, which
we are interested in. Therefore the predicate equality needs
to be restricted to a given predicate (a given operator()
overload).
Yesterday night I wrote some code (with concepts) to bind the
predicate of a predicate ;-) to a specific argument-type
sequence of the corresponding operator() overload. I'm quite
sure that this can be retrofitted with C++03 means (SFINAE
and friends). Here is a simplified sketch of this in terms
of C++03 (with many options to do it even better ;-) ):
#include <cstring>
#include <iostream>
#include <ostream>
typedef bool (*streq_t)(const char*, const char*);
bool strcmp_equal(const char* a, const char* b) {
return std::strcmp(a, b) == 0;
}
bool strcoll_equal(const char* a, const char* b) {
return std::strcoll(a, b) == 0;
}
// Define USE_EMPTY to make S an empty structure:
//#define USE_EMPTY
// Define ALL_EMPTY to let "is_empty" (as defined below)
// accept everything as empty. Less insane code would use
// boost::is_empty, of course:
//#define ALL_EMPTY
// A class type with more than one operator() overload.
// Depending on the USE_EMPTY #define some are stateful
// or all are stateless:
struct S {
#ifndef USE_EMPTY
streq_t scmp;
explicit S(streq_t sc = strcmp_equal) : scmp(sc) {}
#endif
bool operator()(int a, int b) const {
return a == b;
}
bool operator()(const char* a, const char* b) const {
#ifndef USE_EMPTY
return scmp(a, b);
#else
return strcol_equal(a, b);
#endif
}
};
// Don't take this too serious - use boost::enable_if(_c):
template ::type> {
static bool eq(const P& a, const P& b) {
std::cout << "Primary" << std::endl;
return &a == &b;
}
};
template ::type> {
static bool eq(P, P) {
std::cout << "IsEmpty spec." << std::endl;
return true;
}
};
template <>
struct comparable_predicate ::eq(p1, p2))
return p1(arg1, arg2) && p2(arg1, arg2);
else
return false;
}
int main() {
S s1, s2;
bool res = compare(s1, s2, 42, 42);
std::cout << res << std::endl;
res = compare(s1, s2, "A", "A");
std::cout << res << std::endl;
s2.scmp = strcoll_equal;
res = compare(s1, s2, "A", "A");
std::cout << res << std::endl;
std::cin.get();
}
Concept code is quite similar, with the syntactic sugar
that you can write p1 == p2 in the constrained compare
function (I was too lazy to constrain the C++03 compare()
above) instead of comparable_predicate ::eq(p1, p2).
The advantage is that you can specialize the predicate for
each overload of operator() separately.
What do you think?
Greetings from Bremen,
- Daniel {
static bool eq(S, S) {
std::cout << "(S, int) spec." << std::endl;
return true;
}
};
#ifndef USE_EMPTY
template <>
struct comparable_predicate {
static bool eq(S a, S b) {
std::cout << "(S, const char*) spec." << std::endl;
return a.scmp == b.scmp;
}
};
#endif
template
on Tue Aug 19 2008, Daniel Krügler
An operator== that applied for only one of many overloads would be an abomination, especially if the class were empty.
Right, if the class is empty, all operator() overloads are stateless,
What about random number generators (or anything else that uses global state?)
but this is not necessarily the case, if at least one of them is statefull (and therefore the complete class type is not empty). There are still good chances, that some (or even all) of them are still stateless (because they are not influenced by that state).
Or to express the problem in different words: In general there exists a one-to-many relation between a class type and it's operator() overloads [acting as predicates], so the class-type alone is not sufficient to define an equality of *one* special operator() overload, which we are interested in. Therefore the predicate equality needs to be restricted to a given predicate (a given operator() overload).
I think you're over-engineering this. It's not unreasonable to require operator== to make sense in this context. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
David Abrahams wrote:
on Tue Aug 19 2008, Daniel Krügler
wrote: An operator== that applied for only one of many overloads would be an abomination, especially if the class were empty.
Right, if the class is empty, all operator() overloads are stateless,
What about random number generators (or anything else that uses global state?)
Note that I'm restricting on EqualityComparable for predicates (therefore I carefully choose the name comparable_predicate). Can you think of any reasonable *predicate* with global state?
but this is not necessarily the case, if at least one of them is statefull (and therefore the complete class type is not empty). There are still good chances, that some (or even all) of them are still stateless (because they are not influenced by that state).
Or to express the problem in different words: In general there exists a one-to-many relation between a class type and it's operator() overloads [acting as predicates], so the class-type alone is not sufficient to define an equality of *one* special operator() overload, which we are interested in. Therefore the predicate equality needs to be restricted to a given predicate (a given operator() overload).
I think you're over-engineering this. It's not unreasonable to require operator== to make sense in this context.
This answer is a bit too short for me and has not
the convincing power which we usually get from your
contributions ;-)
But I let it settle for a while - as I already said:
I might have a bit too strict point of view on this
and I would really appreciate further comments. If
this would no longer match the boost-relevant context
please send your mail to me - either of
On Aug 20, 2008, at 2:20 AM, Daniel Krügler wrote:
David Abrahams wrote:
on Tue Aug 19 2008, Daniel Krügler
wrote: [SNIP] Or to express the problem in different words: In general there exists a one-to-many relation between a class type and it's operator() overloads [acting as predicates], so the class-type alone is not sufficient to define an equality of *one* special operator() overload, which we are interested in. Therefore the predicate equality needs to be restricted to a given predicate (a given operator() overload). I think you're over-engineering this. It's not unreasonable to require operator== to make sense in this context.
This answer is a bit too short for me and has not the convincing power which we usually get from your contributions ;-)
But I let it settle for a while - as I already said: I might have a bit too strict point of view on this and I would really appreciate further comments.
I don't know if what I'm going to say was what David was thinking of, but... If a predicate function object has several "operator ()" overloads, they should be conceptually identical. Ideally, they should be "const" member functions that depend only on the values of the inputs and any internal state data members. Even without that, your predicate class should represent ONE kind of evaluation criteria to pass judgement on. Your nightmare predicate would have to have all the "operator ()" inconsistent with each other, i.e. differing criteria. That would make it useless, considering you can't choose which overload you get in normal use. You can choose a specific version with Stupid C++ Tricks involving casting the inputs and/or the member function, but that won't help with generic (template- based) code. Testing operations that aren't the same shouldn't be part of the same predicate class. -- Daryle Walker Mac, Internet, and Video Game Junkie darylew AT hotmail DOT com
Daryle Walker wrote:
On Aug 20, 2008, at 2:20 AM, Daniel Krügler wrote:
David Abrahams wrote:
on Tue Aug 19 2008, Daniel Krügler
wrote: [SNIP]
Or to express the problem in different words: In general there exists a one-to-many relation between a class type and it's operator() overloads [acting as predicates], so the class-type alone is not sufficient to define an equality of *one* special operator() overload, which we are interested in. Therefore the predicate equality needs to be restricted to a given predicate (a given operator() overload).
I think you're over-engineering this. It's not unreasonable to require operator== to make sense in this context.
This answer is a bit too short for me and has not the convincing power which we usually get from your contributions ;-)
But I let it settle for a while - as I already said: I might have a bit too strict point of view on this and I would really appreciate further comments.
I don't know if what I'm going to say was what David was thinking of, but...
If a predicate function object has several "operator ()" overloads, they should be conceptually identical. Ideally, they should be "const" member functions that depend only on the values of the inputs and any internal state data members. Even without that, your predicate class should represent ONE kind of evaluation criteria to pass judgement on. Your nightmare predicate would have to have all the "operator ()" inconsistent with each other, i.e. differing criteria. That would make it useless, considering you can't choose which overload you get in normal use. You can choose a specific version with Stupid C++ Tricks involving casting the inputs and/or the member function, but that won't help with generic (template- based) code. Testing operations that aren't the same shouldn't be part of the same predicate class.
Thank you very much for your reply and apologies for my late response! My main concern was that the standard should not be too intrusive, but your arguments made me reflect again on this. Users should easily be able to separate their predicate "bundle", if their see strong need for such a bundle, e.g. instead of struct S { bool (*scmp)(const char*, const char*); explicit S(streq_t sc = ...) : scmp(sc) {} // Stateless predicate: bool operator()(int a, int b) const { return a == b; } ... // other stateless predicates // Statefull predicate: bool operator()(const char* a, const char* b) const { return scmp(a, b); } }; they could (and probably should) write: struct SPure { // Stateless predicate: bool operator()(int a, int b) const { return a == b; } ... // other stateless predicates }; struct S : SPure { bool (*scmp)(const char*, const char*); explicit S(streq_t sc = ...) : scmp(sc) {} // Statefull predicate(s): bool operator()(const char* a, const char* b) const { return scmp(a, b); } }; Inheritance is neither required nor usually wanted, but if some-one *needs* this bundle (because of external requirements) she can realize this, if she wants that. Thank you for your help! - Daniel
On Aug 21, 2008, at 3:48 AM, Daniel Krügler wrote:
Daryle Walker wrote: [SNIP]
I don't know if what I'm going to say was what David was thinking of, but... If a predicate function object has several "operator ()" overloads, they should be conceptually identical. Ideally, they should be "const" member functions that depend only on the values of the inputs and any internal state data members. Even without that, your predicate class should represent ONE kind of evaluation criteria to pass judgement on. Your nightmare predicate would have to have all the "operator ()" inconsistent with each other, i.e. differing criteria. That would make it useless, considering you can't choose which overload you get in normal use. You can choose a specific version with Stupid C++ Tricks involving casting the inputs and/or the member function, but that won't help with generic (template- based) code. Testing operations that aren't the same shouldn't be part of the same predicate class.
Thank you very much for your reply and apologies for my late response!
My main concern was that the standard should not be too intrusive, but your arguments made me reflect again on this. Users should easily be able to separate their predicate "bundle", if their see strong need for such a bundle, e.g. instead of
struct S { bool (*scmp)(const char*, const char*); explicit S(streq_t sc = ...) : scmp(sc) {} // Stateless predicate: bool operator()(int a, int b) const { return a == b; } ... // other stateless predicates // Statefull predicate: bool operator()(const char* a, const char* b) const { return scmp(a, b); } };
they could (and probably should) write:
struct SPure { // Stateless predicate: bool operator()(int a, int b) const { return a == b; } ... // other stateless predicates };
struct S : SPure { bool (*scmp)(const char*, const char*); explicit S(streq_t sc = ...) : scmp(sc) {} // Statefull predicate(s): bool operator()(const char* a, const char* b) const { return scmp(a, b); } };
Inheritance is neither required nor usually wanted, but if some-one *needs* this bundle (because of external requirements) she can realize this, if she wants that.
You may have missed something. The state-fulness or statelessness of a predicate is _irrelevant_ to my point, which was that all the comparison overloads should have the same effects! Having your "S" predicate be a root class or divided with a "SPure" base class doesn't matter. Your integer and string comparisons are probably not testing the same concept, so they shouldn't be together, no matter what C++ skills you use to compose them. struct real_part_negative { bool operator ()( int x ) const { return x < 0; } bool operator ()( unsigned x ) const { return false; } bool operator ()( float x ) const { return x < 0.0; } template < typename T > bool operator ()( complex<T> const &x ) const { return x.real < static_cast<T>(0); } }; All these overloads represent the same concept, so it's a consistent predicate. This is important for generic code, where you can't pick which overload is used. -- Daryle Walker Mac, Internet, and Video Game Junkie darylew AT hotmail DOT com
on Tue Aug 19 2008, Daniel Krügler
David Abrahams wrote:
on Tue Aug 19 2008, Daniel Krügler
wrote: An operator== that applied for only one of many overloads would be an abomination, especially if the class were empty.
Right, if the class is empty, all operator() overloads are stateless,
What about random number generators (or anything else that uses global state?)
Note that I'm restricting on EqualityComparable for predicates (therefore I carefully choose the name comparable_predicate).
Can you think of any reasonable *predicate* with global state?
has_this_time_been_reached( sometime )
Or to express the problem in different words: In general there exists a one-to-many relation between a class type and it's operator() overloads [acting as predicates], so the class-type alone is not sufficient to define an equality of *one* special operator() overload, which we are interested in. Therefore the predicate equality needs to be restricted to a given predicate (a given operator() overload).
I think you're over-engineering this. It's not unreasonable to require operator== to make sense in this context.
This answer is a bit too short for me and has not the convincing power which we usually get from your contributions ;-)
Thanks; I've been under the weather. Still, I don't know how to say it much better. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
On Aug 21, 2008, at 9:23 PM, David Abrahams wrote:
on Tue Aug 19 2008, Daniel Krügler
wrote: David Abrahams wrote:
on Tue Aug 19 2008, Daniel Krügler
wrote: [SNIP]
Or to express the problem in different words: In general there exists a one-to-many relation between a class type and it's operator() overloads [acting as predicates], so the class-type alone is not sufficient to define an equality of *one* special operator() overload, which we are interested in. Therefore the predicate equality needs to be restricted to a given predicate (a given operator() overload).
I think you're over-engineering this. It's not unreasonable to require operator== to make sense in this context.
This answer is a bit too short for me and has not the convincing power which we usually get from your contributions ;-)
Thanks; I've been under the weather. Still, I don't know how to say it much better.
Have you seen my responses yet? I think I got what you were saying. Basically, if there's a one-to-many relationship between a predicate class and its "operator ()" overloads, that means that the various overloads aren't consistent with each other. Such a predicate class is useless, especially in generic code, where you can't choose which overload is chosen. (Note that the amount of state, if any, is irrelevant here.) The predicate has to represent an ideal criteria and _all_ the "operator ()" overloads have to be conceptually identical in supporting that ideal. -- Daryle Walker Mac, Internet, and Video Game Junkie darylew AT hotmail DOT com
on Thu Aug 21 2008, Daryle Walker
On Aug 21, 2008, at 9:23 PM, David Abrahams wrote:
on Tue Aug 19 2008, Daniel Krügler
wrote: David Abrahams wrote:
on Tue Aug 19 2008, Daniel Krügler
wrote: [SNIP]
Or to express the problem in different words: In general there exists a one-to-many relation between a class type and it's operator() overloads [acting as predicates], so the class-type alone is not sufficient to define an equality of *one* special operator() overload, which we are interested in. Therefore the predicate equality needs to be restricted to a given predicate (a given operator() overload).
I think you're over-engineering this. It's not unreasonable to require operator== to make sense in this context.
This answer is a bit too short for me and has not the convincing power which we usually get from your contributions ;-)
Thanks; I've been under the weather. Still, I don't know how to say it much better.
Have you seen my responses yet?
Yes
I think I got what you were saying. Basically, if there's a one-to-many relationship between a predicate class and its "operator ()" overloads, that means that the various overloads aren't consistent with each other. Such a predicate class is useless, especially in generic code, where you can't choose which overload is chosen. (Note that the amount of state, if any, is irrelevant here.) The predicate has to represent an ideal criteria and _all_ the "operator ()" overloads have to be conceptually identical in supporting that ideal.
I think that sounds like what I meant. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
on Mon Aug 18 2008, Daniel Krügler
Hello,
I would like express my respect and congratulations for the provision of boost 1.36 (which had an impressive short release cycle)!
One of the very promising new contributions is Boost.Unordered and I I would like to make a comment on the provision of operator==/operator!= as explained in
http://www.boost.org/doc/libs/1_36_0/doc/html/unordered/rationale.html#unord...
I totally agree that there exist valid and reasonable use-cases for EqualityComparable for unordered containers. In fact many unordered containers (usually named hash containers) exists in one or the other language or user-defined library which provide some form of "unordered" equality comparison, which reflects the unordered nature of the container itself and thus matches a typical user expectation.
Nevertheless there exists an unfortunate side-effect of the current implementation of operator== and operator!=, which should be mentioned in the documentation IMO. To keep a long story short: Under some given conditions, EqualityComparable of Boost.Unordered containers will violate the very basic "symmetry" property of EqualityComparable. This can happen, if the equality predicate key_eq() is a runtime-predicate and if the actual comparison is done between two different runtime-comparators.
I suggest requiring that the predicates themselves be equality comparable, and that equal predicates perform the same computation. Containers with unequal predicates would not be considered equal. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
participants (4)
-
Daniel James
-
Daniel Krügler
-
Daryle Walker
-
David Abrahams