Boost.Algorithm design question

So, what do people prefer (and why?):

template<typename InputIterator, typename V>
bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
{
    for ( ; first != last; ++first )
        if ( val == *first ) return false;
    return true;
}

or

template<typename InputIterator>
bool none_of_equal ( InputIterator first, InputIterator last,
                     typename iterator_traits<InputIterator>::value_type const &val )
{
    for ( ; first != last; ++first )
        if ( val == *first ) return false;
    return true;
}

In the first case, I think there's a (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not. Of course, the (mythical?) smart compiler would do the conversion once and save the result for reuse.

-- Marshall
Marshall Clow Idio Software <mailto:mclow.lists@gmail.com>
A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait). -- Yu Suzuki

[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const &val )
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons. Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries. Even better, consider vector<const char *> and val being a string. Here, the element type is still directly comparable to val's type, but val's type is not implicitly convertible to the element type. It is true that #1 can result in O(N) conversions, but that's really up to the element/value types involved. Typically, whenever the value type is implicitly convertible to the element type, and the element type is comparable with itself, a direct comparison between the two types could also be provided (as is the case with const char * and string), and will be provided when performance is important. Heterogeneous comparisons are surprisingly popular, and the STL is moving to support them better. For example, std::lower_bound() didn't support them in C++03, and does in C++11. std::map still doesn't, and users complain about it from time to time. STL
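A minimal sketch of the two cases above, using std::find (which already follows the option-#1 convention); this assumes C++03 and only the headers shown:

#include <algorithm>
#include <string>
#include <vector>

int main()
{
    // Case 1: vector<string>, value is a string literal.
    // Each test is  *i == "meow" , i.e. operator==(const std::string&, const char*);
    // no std::string temporary is built for any element.
    std::vector<std::string> names(3, std::string("purr"));
    std::vector<std::string>::iterator i =
        std::find(names.begin(), names.end(), "meow");

    // Case 2: vector<const char*>, value is a std::string.
    // Each test is  *i == val , i.e. operator==(const char*, const std::string&).
    // std::string is not implicitly convertible to const char*, so an
    // option-#2 style interface could not even be called here.
    const char* raw[] = { "purr", "hiss" };
    std::vector<const char*> ptrs(raw, raw + 2);
    std::vector<const char*>::iterator j =
        std::find(ptrs.begin(), ptrs.end(), std::string("meow"));

    return (i == names.end() && j == ptrs.end()) ? 0 : 1;
}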

On Oct 5, 2011, at 3:53 PM, Stephan T. Lavavej wrote:
[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const &val )
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.
Even better, consider vector<const char *> and val being a string. Here, the element type is still directly comparable to val's type, but val's type is not implicitly convertible to the element type.
It is true that #1 can result in O(N) conversions, but that's really up to the element/value types involved. Typically, whenever the value type is implicitly convertible to the element type, and the element type is comparable with itself, a direct comparison between the two types could also be provided (as is the case with const char * and string), and will be provided when performance is important.
Heterogeneous comparisons are surprisingly popular, and the STL is moving to support them better. For example, std::lower_bound() didn't support them in C++03, and does in C++11. std::map still doesn't, and users complain about it from time to time.
Thanks for explaining the rationale. That's what I was looking for. -- Marshall Marshall Clow Idio Software <mailto:mclow.lists@gmail.com> A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait). -- Yu Suzuki

Stephan T. Lavavej-2 wrote:
[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const &val )
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
+1 --Lorenzo

On 06-10-2011 00:53, Stephan T. Lavavej wrote:
[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const&val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const& val )
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.
Even better, consider vector<const char *> and val being a string. Here, the element type is still directly comparable to val's type, but val's type is not implicitly convertible to the element type.
Even better, don't use vector<const char*>. What do you need that for? [snip]
Heterogeneous comparisons are surprisingly popular, and the STL is moving to support them better. For example, std::lower_bound() didn't support them in C++03, and does in C++11. std::map still doesn't, and users complain about it from time to time.
Well, option number two has the distinct advantage that it generates less code. Option number one would generate code for each length of a character literal. To avoid such a mess, the implementation can try to decay the value and then forward to the real implementation. Or the user can use boost::as_literal(). -Thorsten
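A sketch of the decay-and-forward idea, assuming boost::decay from Boost.TypeTraits; the names and layering here are illustrative, not Boost.Algorithm's actual internals:

#include <boost/type_traits/decay.hpp>

namespace detail {
    // The single "real" implementation, instantiated once per decayed value type.
    template <typename InputIterator, typename V>
    bool none_of_equal_impl(InputIterator first, InputIterator last, V const& val)
    {
        for (; first != last; ++first)
            if (val == *first)
                return false;
        return true;
    }
}

template <typename InputIterator, typename V>
bool none_of_equal(InputIterator first, InputIterator last, V const& val)
{
    // decay<const char(&)[6]>::type is const char*, so "meow" and "woof!"
    // forward to the same instantiation of the loop instead of one per length.
    typedef typename boost::decay<V const&>::type decayed;
    return detail::none_of_equal_impl<InputIterator, decayed>(first, last, val);
}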

Thorsten Ottosen wrote:
On 06-10-2011 00:53, Stephan T. Lavavej wrote:
[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const&val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const& val )
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.
Even better, consider vector<const char *> and val being a string. Here, the element type is still directly comparable to val's type, but val's type is not implicitly convertible to the element type.
Even better, don't use vector<const char*>. What do you need that for?
Here's a real motivating example where I want to use vector<const char*>. I have a large file containing null-terminated strings which I memory map. After opening the file, I construct some sort of index of those strings in a sorted vector. Then I search for things using std::equal_range, std::lower_bound, or similar. The important thing is that for efficiency the algorithms mustn't create temporary strings; they must compare the const char* with the string that I'm searching for directly. std::string provides operator overloads to do this. I want the algorithms to use those heterogeneous comparisons. Dave Abrahams has said a couple of times that he doesn't even know what it means to try to do this. This baffles me, but I trust that there is some validity in his argument. But it does lead me to the point that I made in my review, and my answer to Marshall's question at the start of this thread: I don't want Boost to waste time on either of those none_of_equal implementations. We are all capable of writing our own 4-line none_of_equal implementations that do precisely what we want them to do. Why argue about which is the "one true way" to do it, when we can all have whichever one we prefer? Instead, let's spend time on things where the implementation is more than 4 lines long - like a variant of std::list with O(1) splice! Regards, Phil.
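A sketch of the kind of index lookup described above, with hypothetical names; it assumes the const char* entries point into the memory-mapped file and that the index was sorted consistently with std::string's ordering:

#include <algorithm>
#include <cstring>
#include <string>
#include <vector>

// Ordering used to build the sorted index of pointers into the mapped file.
struct less_cstr
{
    bool operator()(const char* a, const char* b) const
        { return std::strcmp(a, b) < 0; }
};

// Sorting the index (done once after mapping the file).
void sort_index(std::vector<const char*>& index)
{
    std::sort(index.begin(), index.end(), less_cstr());
}

// Heterogeneous lookup: each comparison is  *i < key , i.e.
// operator<(const char*, const std::string&), so no temporary strings are built.
std::vector<const char*>::const_iterator
find_entry(const std::vector<const char*>& index, const std::string& key)
{
    return std::lower_bound(index.begin(), index.end(), key);
}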

Even better, don't use vector<const char*>. What do you need that for?
Here's a real motivating example where I want to use vector<const char*>. I have a large file containing null-terminated strings which I memory map. After opening the file, I construct some sort of index of those strings in a sorted vector. Then I search for things using std::equal_range, std::lower_bound, or similar.
All those algorithms come in a version that accepts a predicate, so you would still be able to do your optimization. And it would be clear to anyone reading the code what your intention is (to avoid copies); it wouldn't simply be a 'coincidence' that there happened to be an overload matching your types. For clarity and maintenance, I prefer to state the intention clearly in code. If it becomes too boilerplate-heavy, wrap it up in an algorithm of your own.
The important thing is that for efficiency the algorithms mustn't create temporary strings; they must compare the const char* with the string that I'm searching for directly. std::string provides operator overloads to do this. I want the algorithms to use those heterogeneous comparisons.
I understand the first part about not creating temporaries, but not why you want to use those heterogeneous comparisons. If it's a performance concern, I would prefer being in control of that explicitly. If you're unlucky while working with the code at some point in the future and happen to introduce an implicit copy, everything will compile and run but your speed will not be what you expected. If you're lucky it will crash and can be debugged early on.
Dave Abrahams has said a couple of times that he doesn't even know what it means to try to do this. This baffles me, but I trust that there is some validity in his argument. But it does lead me to the point that I made in my review, and my answer to Marshall's question at the start of this thread: I don't want Boost to waste time on either of those none_of_equal implementations. We are all capable of writing our own 4-line none_of_equal implementations that do precisely what we want them to do. Why argue about which is the "one true way" to do it, when we can all have whichever one we prefer? Instead, let's spend time on things where the implementation is more than 4 lines long - like a variant of std::list with O(1) splice!
If we can't get the simple ones right, there's not a healthy ground to base more complicated algorithms on. Also, I don't want to write those 4 lines of code, which are potentially dangerous, simply because I wasn't aware of the implications. If I can read in some Boost rationale like 'this is how it should be done and why' (many Boost libraries take this higher moral ground, which I appreciate), then I've learned something.
Regards, Phil.
cheers, Christian

On 06-10-2011 18:27, Phil Endecott wrote:
Thorsten Ottosen wrote:
Even better, don't use vector<const char*>. What do you need that for? [snip]
Here's a real motivating example where I want to use vector<const char*>. I have a large file containing null-terminated strings which I memory map. After opening the file, I construct some sort of index of those strings in a sorted vector. Then I search for things using std::equal_range, std::lower_bound, or similar.
That's a good example, I grant you. -Thorsten

on Wed Oct 05 2011, "Stephan T. Lavavej" <stl-AT-exchange.microsoft.com> wrote:
[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const &val )
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
Sorry, STL. The other STL's conventions are wrong. :-)
Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.
You got lucky this time.
Even better, consider vector<const char *> and val being a string. Here, the element type is still directly comparable to val's type, but val's type is not implicitly convertible to the element type.
That's cute. IMO the principled thing to do in that case is to use the version that takes the additional predicate.
It is true that #1 can result in O(N) conversions, but that's really up to the element/value types involved. Typically, whenever the value type is implicitly convertible to the element type, and the element type is comparable with itself, a direct comparison between the two types could also be provided (as is the case with const char * and string), and will be provided when performance is important.
Heterogeneous comparisons are surprisingly popular, and the STL is moving to support them better. For example, std::lower_bound() didn't support them in C++03, and does in C++11.
/me realizes he is responsible for this change

But that's *totally* different.

/me fumbles around for the explanation

For the version that uses "<", it's an unintentional side-effect of making the specification uniform. The function signature already took an unconstrained object by reference, and my paper was not trying to change that. It just happened to make the /semantics/ of using a type different from the value type of the sequence meaningful in that case. It also makes the specification *simpler* overall and easier to understand.

The _purpose_ was to support heterogeneity for the version that uses a function. The description in terms of partitioning is a more principled and generic way to describe the actual requirements of the algorithm. There was never any intention to promote the use of heterogeneous "<", which again, doesn't make sense.

-- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
on Wed Oct 05 2011, "Stephan T. Lavavej" <stl-AT-exchange.microsoft.com> wrote:
[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const &val )
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
Sorry, STL. The other STL's conventions are wrong. :-)
Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.
You got lucky this time.
Even better, consider vector<const char *> and val being a string. Here, the element type is still directly comparable to val's type, but val's type is not implicitly convertible to the element type.
That's cute. IMO the principled thing to do in that case is to use the version that takes the additional predicate.
It is true that #1 can result in O(N) conversions, but that's really up to the element/value types involved. Typically, whenever the value type is implicitly convertible to the element type, and the element type is comparable with itself, a direct comparison between the two types could also be provided (as is the case with const char * and string), and will be provided when performance is important.
Heterogeneous comparisons are surprisingly popular, and the STL is moving to support them better. For example, std::lower_bound() didn't support them in C++03, and does in C++11.
/me realizes he is responsible for this change
But that's *totally* different.
/me fumbles around for the explanation
For the version that uses "<", it's an unintentional side-effect of making the specification uniform. The function signature already took an unconstrained object by reference, and my paper was not trying to change that. It just happened to make the /semantics/ of using a type different from the value type of the sequence meaningful in that case. It also makes the specification *simpler* overall and easier to understand.
The _purpose_ was to support heterogeneity for the version that uses a function. The description in terms of partitioning is a more principled and generic way to describe the actual requirements of the algorithm. There was never any intention to promote the use of heterogeneous "<", which again, doesn't make sense.
From the standard:
// 25.2, non-modifying sequence operations:
template <class InputIterator, class Predicate>
bool none_of(InputIterator first, InputIterator last, Predicate pred);
template<class InputIterator, class T>
InputIterator find(InputIterator first, InputIterator last, const T& value);

Given that I am used to how the STL works, at first I'd expect Boost.Algorithm to follow the same interface where it makes sense -- why would none_of_equal use an interface different from std::find? Now, if the STL interface truly is incorrect and Boost.Algorithm needs to use iterator_traits::value_type, wouldn't it make sense to also change the STL interface in the C++ standard? I think what I am trying to ask is: Does anyone know why std::find does not use iterator_traits::value_type?

BTW, I find none_equal / any_equal / all_equal / one_equal more readable than none_of_equal / any_of_equal / all_of_equal / one_of_equal. I'd keep the "of" postfix only for the predicate versions, as the STL does (and similarly to find_if). What do you think?

Thanks,
--Lorenzo

on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
From the standard:
// 25.2, non-modifying sequence operations:
template <class InputIterator, class Predicate>
bool none_of(InputIterator first, InputIterator last, Predicate pred);
template<class InputIterator, class T>
InputIterator find(InputIterator first, InputIterator last, const T& value);
Given that I am used to how the STL works, at first I'd expect Boost.Algorithm to follow the same interface where it makes sense -- why would none_of_equal use an interface different from std::find? Now, if the STL interface truly is incorrect and Boost.Algorithm needs to use iterator_traits::value_type, wouldn't it make sense to also change the STL interface in the C++ standard?
Yes, it would make sense. No, we won't do it, because it would break code.
I think what I am trying to ask is: Does anyone know why std::find does not use iterator_traits::value_type?
IIUC it was an oversight, and IIRC Stepanov's later work (EOP) does restrict the type of "value". -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On Oct 6, 2011, at 8:30 AM, Dave Abrahams wrote:
on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
From the standard:
// 25.2, non-modifying sequence operations:
template <class InputIterator, class Predicate>
bool none_of(InputIterator first, InputIterator last, Predicate pred);
template<class InputIterator, class T>
InputIterator find(InputIterator first, InputIterator last, const T& value);
Given that I am used to how the STL works, at first I'd expect Boost.Algorithm to follow the same interface where it makes sense -- why would none_of_equal use an interface different from std::find? Now, if the STL interface truly is incorrect and Boost.Algorithm needs to use iterator_traits::value_type, wouldn't it make sense to also change the STL interface in the C++ standard?
Yes, it would make sense. No, we won't do it, because it would break code.
I think what I am trying to ask is: Does anyone know why std::find does not use iterator_traits::value_type?
IIUC it was an oversight, and IIRC Stepanov's later work (EOP) does restrict the type of "value".
I could be misremembering, but didn't iterator_traits come after the first release of the STL? -- Marshall Marshall Clow Idio Software <mailto:mclow.lists@gmail.com> A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait). -- Yu Suzuki

On 10/6/2011 8:51 AM, Marshall Clow wrote:
On Oct 6, 2011, at 8:30 AM, Dave Abrahams wrote:
on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
I think what I am trying to ask is: Does anyone know why std::find does not use iterator_traits::value_type?
IIUC it was an oversight, and IIRC Stepanov's later work (EOP) does restrict the type of "value".
I could be misremembering, but didn't iterator_traits come after the first release of the STL?
Marshall, I don't see how it could. Without iterator_traits, raw pointers couldn't be valid iterators, and that was an important part of Stepanov's vision. -- Eric Niebler BoostPro Computing http://www.boostpro.com

on Thu Oct 06 2011, Eric Niebler <eric-AT-boostpro.com> wrote:
On 10/6/2011 8:51 AM, Marshall Clow wrote:
On Oct 6, 2011, at 8:30 AM, Dave Abrahams wrote:
on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
I think what I am trying to ask is: Does anyone know why std::find does not use iterator_traits::value_type?
IIUC it was an oversight, and IIRC Stepanov's later work (EOP) does restrict the type of "value".
I could be misremembering, but didn't iterator_traits come after the first release of the STL?
Marshall, I don't see how it could. Without iterator_traits, raw pointers couldn't be valid iterators, and that was an important part of Stepanov's vision.
Early implementations used an internal __iterator_category() function that returned a tag. It was implemented like this:

template <class T> random_access_iterator_tag __iterator_category(T*);
template <class T> typename T::iterator_category __iterator_category(T const&);

I think the concept was probably weakly defined as something like: "your type either has to be a pointer, or it has to have a nested iterator_category typedef." But I could be wrong of course.

-- Dave Abrahams BoostPro Computing http://www.boostpro.com

[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const &val )
[STL]
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
[Dave Abrahams]
Sorry, STL. The other STL's conventions are wrong. :-)
I guess it's a question of design philosophy - like how users of Loki's smart pointers say "yay policies" and users of boost/std::shared_ptr say "boo policies". The design philosophy that I'm advocating is "preserve the user's types whenever possible". Here, a range of T and a value of V are involved. #1 says "all I need is t == v to return something convertible to bool, and I'll tell you whether this is 'false' for all elements in the range." It doesn't attempt to convert T to V or V to T. It doesn't attempt to say v == t, or !=, or anything else. It passes no judgment on what t == v "means". Perhaps t == v does attempt to convert T to V or V to T, but that's up to the types involved. Here's another example. Consider vector<unsigned char> and val = 0x1234UL. For all i, v[i] == 0x1234UL is false (the usual arithmetic conversions widen unsigned char to unsigned long). #1 will report that none of the unsigned chars are equal to 0x1234UL. But #2 smashes unsigned long to unsigned char, which must compile (possibly emitting a warning), and then the unsigned chars will be compared to 0x34. The STL isn't perfect, of course. As a slightly related example, std::multiplies<T> is deficient. It requires T to be specified up front (yet this is typically redundant, because the functor is being given to an algorithm that knows the types involved), and it has a homogeneous T op()(const T&, const T&). It would be better for the struct to be a non-template, with a templated op() (this may not have been possible in the 90s due to compiler limitations), with the signature C op()(A&&, B&&) where A&& and B&& are perfectly forwarded and C is determined through decltype. Doing this for all of the helper functors would be less verbose (no specifying T up front), potentially more efficient (no converting types before performing the operation - consider using plus on string and const char *), and more generic (now you can handle unit libraries where speed * time = distance). [STL]
Heterogeneous comparisons are surprisingly popular, and the STL is moving to support them better. For example, std::lower_bound() didn't support them in C++03, and does in C++11.
[Dave Abrahams]
/me realizes he is responsible for this change But that's *totally* different. /me fumbles around for the explanation For the version that uses "<", it's an unintentional side-effect of making the specification uniform. The function signature already took an unconstrained object by reference, and my paper was not trying to change that. It just happened to make the /semantics/ of using a type different from the value type of the sequence meaningful in that case. It also makes the specification *simpler* overall and easier to understand.
Heh. The new specification is great (as a Standard Library maintainer, I *really* appreciate the clarity of thought in the C++11 wording). I'm just saying I think it's greater than you realize.
The _purpose_ was to support heterogeneity for the version that uses a function. The description in terms of partitioning is a more principled and generic way to describe the actual requirements of the algorithm. There was never any intention to promote the use of heterogeneous "<", which again, doesn't make sense.
It does make sense. For example, consider string and const char * (and it's not like I prefer const char *, it's just extremely common and users use it all the time). I could have a lexicographically sorted sequence of const char *. I have a std::string (say its contents are "meow") and I'd like to perform a lower_bound() with op<(). The sequence is partitioned with respect to e < string("meow"), so C++11 lower_bound() handles this just fine. (C++03 technically did not, and VC8/9 would complain in debug mode that the sequence was not sorted by op<(), which is why I've thought about lower_bound() a lot...) The trick here is that op<() means something very different for (const char *, const char *) than it does for (const char *, string), (string, const char *), and (string, string). STL
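A sketch of that lower_bound() case, assuming C++11's partition-based specification; the comments point out how differently op<() reads for (const char*, const char*) versus (const char*, string):

#include <algorithm>
#include <string>
#include <vector>

int main()
{
    // Lexicographically sorted sequence of C strings.
    const char* words[] = { "bark", "hiss", "meow", "purr" };
    std::vector<const char*> v(words, words + 4);

    std::string key("meow");

    // Heterogeneous <: each test is  *i < key , i.e.
    // operator<(const char*, const std::string&) -- lexicographic comparison,
    // no temporaries. The range is partitioned with respect to  e < key ,
    // which is all C++11 lower_bound() requires.
    std::vector<const char*>::iterator i =
        std::lower_bound(v.begin(), v.end(), key);

    // Homogeneous < on (const char*, const char*) would compare pointer
    // values, which is why "sorted by operator<" would be the wrong way to
    // state the precondition here.
    return (i != v.end() && key == *i) ? 0 : 1;
}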

on Thu Oct 06 2011, "Stephan T. Lavavej" <stl-AT-exchange.microsoft.com> wrote:
[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const &val )
[STL]
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
[Dave Abrahams]
Sorry, STL. The other STL's conventions are wrong. :-)
I guess it's a question of design philosophy - like how users of Loki's smart pointers say "yay policies" and users of boost/std::shared_ptr say "boo policies".
Well, no. I'm trying hard to represent the views and theory of the STL's original author. I could just as easily look at it your way. But I think there's powerful and underappreciated validity to viewing algorithms as abstractions, and you can't really do that if the implementation leaks out in the specification.
The design philosophy that I'm advocating is "preserve the user's types whenever possible". Here, a range of T and a value of V are involved. #1 says "all I need is t == v to return something convertible to bool, and I'll tell you whether this is 'false' for all elements in the range." It doesn't attempt to convert T to V or V to T. It doesn't attempt to say v == t, or !=, or anything else. It passes no judgment on what t == v "means". Perhaps t == v does attempt to convert T to V or V to T, but that's up to the types involved.
Yeah, but do you see how this gets the user involved with the implementation details of the algorithm? You can get away with it with these algorithms because they're so simple, but even some simple algorithms run into problems with this way of doing things. I don't remember the details right now, but it started to show up when we were working with concepts, which have the effect of (partially) checking the validity of your specifications. There were algorithms that allowed all kinds of heterogeneity and conversions in their implementations, but wouldn't compile with the concept constraints given in the standard. If you tried to come up with concept constraints that actually would compile without breaking backward compatibility, it turned out to be horrible. I think find_if might have been one of them. Doug Gregor knows for sure.
Here's another example. Consider vector<unsigned char> and val = 0x1234UL. For all i, v[i] == 0x1234UL is false (the usual arithmetic conversions widen unsigned char to unsigned long). #1 will report that none of the unsigned chars are equal to 0x1234UL. But #2 smashes unsigned long to unsigned char, which must compile (possibly emitting a warning), and then the unsigned chars will be compared to 0x34.
Well, this is the algorithm needing to accommodate a weakness of the core language (implicit narrowing conversions). And since we don't live in an ideal world, maybe you have a point.
[STL]
Heterogeneous comparisons are surprisingly popular, and the STL is moving to support them better. For example, std::lower_bound() didn't support them in C++03, and does in C++11.
[Dave Abrahams]
/me realizes he is responsible for this change But that's *totally* different. /me fumbles around for the explanation For the version that uses "<", it's an unintentional side-effect of making the specification uniform. The function signature already took an unconstrained object by reference, and my paper was not trying to change that. It just happened to make the /semantics/ of using a type different from the value type of the sequence meaningful in that case. It also makes the specification *simpler* overall and easier to understand.
Heh. The new specification is great (as a Standard Library maintainer, I *really* appreciate the clarity of thought in the C++11 wording). I'm just saying I think it's greater than you realize.
The _purpose_ was to support heterogeneity for the version that uses a function. The description in terms of partitioning is a more principled and generic way to describe the actual requirements of the algorithm. There was never any intention to promote the use of heterogeneous "<", which again, doesn't make sense.
It does make sense. For example, consider string and const char * (and it's not like I prefer const char *, it's just extremely common and users use it all the time).
True.
I could have a lexicographically sorted sequence of const char *. I have a std::string (say its contents are "meow") and I'd like to perform a lower_bound() with op<(). The sequence is partitioned with respect to e < string("meow"), so C++11 lower_bound() handles this just fine. (C++03 technically did not, and VC8/9 would complain in debug mode that the sequence was not sorted by op<(), which is why I've thought about lower_bound() a lot...)
The trick here is that op<() means something very different for (const char *, const char *) than it does for (const char *, string), (string, const char *), and (string, string).
Yeah, that's a trick. That nonuniformity is not encouraging. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

[Dave Abrahams]
Well, no. I'm trying hard to represent the views and theory of the STL's original author. I could just as easily look at it your way. But I think there's powerful and underappreciated validity to viewing algorithms as abstractions, and you can't really do that if the implementation leaks out in the specification.
I think that algorithms with weak requirements can still be viewed as abstractions - and lower_bound() provides the best example. If I'm explaining lower_bound() to someone, I'm going to tell them that it finds (in logarithmic time) the first location in a sorted sequence where a value could be inserted while preserving the ordering. That's a very clean, abstract view of the algorithm, and it's precise enough to handle the usual corner cases (the value exists in the sequence, multiple values exist in the sequence, the value doesn't exist but would be inserted at the beginning/middle/end). And yet, lower_bound() is capable of more - which is unimportant to most users but very important to a few. It can handle weaker iterators than random-access (performing a linear number of iterator operations, but a logarithmic number of predicate applications), it can handle heterogeneous comparisons, and the sequence doesn't have to be sorted in any sense - merely partitioned with respect to an expression. The first and the third are extremely obscure, but the heterogeneous part is very popular. [STL]
The design philosophy that I'm advocating is "preserve the user's types whenever possible". Here, a range of T and a value of V are involved. #1 says "all I need is t == v to return something convertible to bool, and I'll tell you whether this is 'false' for all elements in the range." It doesn't attempt to convert T to V or V to T. It doesn't attempt to say v == t, or !=, or anything else. It passes no judgment on what t == v "means". Perhaps t == v does attempt to convert T to V or V to T, but that's up to the types involved.
[Dave Abrahams]
Yeah, but do you see how this gets the user involved with the implementation details of the algorithm?
I think of STL algorithms as performing unspecified magic, except that they want very specific operations with specific semantics from the types that I provide (iterators, elements, functors).
There were algorithms that allowed all kinds of heterogeneity and conversions in their implementations, but wouldn't compile with the concept constraints given in the standard. If you tried to come up with concept constraints that actually would compile without breaking backward compatibility it turned out to be horrible. I think find_if might have been one of them. Doug Gregor knows for sure.
I agree that it's a problem (I remember seeing some of that on the reflector), but the solution is not to require homogeneous types everywhere. [STL]
Here's another example. Consider vector<unsigned char> and val = 0x1234UL. For all i, v[i] == 0x1234UL is false (the usual arithmetic conversions widen unsigned char to unsigned long). #1 will report that none of the unsigned chars are equal to 0x1234UL. But #2 smashes unsigned long to unsigned char, which must compile (possibly emitting a warning), and then the unsigned chars will be compared to 0x34.
[Dave Abrahams]
Well, this is the algorithm needing to accommodate a weakness of the core language (implicit narrowing conversions). And since we don't live in an ideal world, maybe you have a point.
For the record, I dislike C's conversions. But if implicit narrowing were banned (or if the user enables the appropriate warning and compiles with warnings-as-errors), then #2 simply fails to compile while #1 compiles and works perfectly, because it doesn't attempt to convert the value type to the element type ahead of time. STL
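A sketch of that narrowing difference, using std::find to stand in for the two none_of_equal options; the static_cast below just makes the option-#2 conversion explicit:

#include <algorithm>
#include <vector>

int main()
{
    std::vector<unsigned char> v;
    v.push_back(0x12);
    v.push_back(0x34);

    // Option-#1 behaviour: the value keeps its unsigned long type, each
    // unsigned char element is widened by the usual arithmetic conversions,
    // and no element equals 0x1234UL.
    bool found1 = std::find(v.begin(), v.end(), 0x1234UL) != v.end();  // false

    // Option-#2 behaviour: the value is converted to the element type up
    // front (0x1234UL becomes 0x34), and a "match" is then found.
    unsigned char narrowed = static_cast<unsigned char>(0x1234UL);    // 0x34
    bool found2 = std::find(v.begin(), v.end(), narrowed) != v.end(); // true

    return (!found1 && found2) ? 0 : 1;
}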

On 10/7/2011 8:34 AM, Nathan Ridge wrote:
(I remember seeing some of that on the reflector)
OT: What is "the reflector"? I've seen this term a few times on this list but I'm not quite sure what it is...
For historical reasons that are obscure to me, the internal mailing lists of the standardization committee are called "reflectors". -- Eric Niebler BoostPro Computing http://www.boostpro.com

On 5 October 2011 17:53, Stephan T. Lavavej <stl@exchange.microsoft.com> wrote:
[Marshall Clow]
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val )
or
template<typename InputIterator> bool none_of_equal ( InputIterator first, InputIterator last, typename iterator_traits<InputIterator>::value_type const &val )
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.
Even better, consider vector<const char *> and val being a string. Here, the element type is still directly comparable to val's type, but val's type is not implicitly convertible to the element type.
The never-ending reference to the std::string vs const char* performance thing. I'm simply amazed at the simplicity of other developers' performance issues; if this were the main bottleneck I ever had to deal with, I could spend more time drinking coffee =)

-IF- such a case showed up in a profiler, I think it's only fair to let the developer optimize his scenario with some custom code, rather than to open up this can of worms for everyone else. Honestly, I hadn't considered how dangerous std::find etc. is until a discussion came up related to clamp() in another thread. There is a standard function called find_if (http://www.sgi.com/tech/stl/find_if.html); let it handle the odd ball of comparing unrelated types.

I understand the rationale for not changing the C++ standard, but since Boost.Algorithm is a new library with no requirements on backward compatibility, let's not repeat the same mistake.

- Christian
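A sketch of the predicate-based alternative being advocated here, with illustrative names; it assumes the common case of searching a vector<string> for a C string without building temporaries:

#include <algorithm>
#include <string>
#include <vector>

// An explicit predicate spells out the intent -- compare against a raw C
// string without constructing std::string temporaries -- instead of relying
// on whichever heterogeneous operator== happens to exist.
struct equals_cstr
{
    explicit equals_cstr(const char* s) : s_(s) {}
    bool operator()(const std::string& x) const { return x == s_; }
    const char* s_;
};

bool contains_meow(const std::vector<std::string>& v)
{
    return std::find_if(v.begin(), v.end(), equals_cstr("meow")) != v.end();
}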

Christian Holmquist wrote:
On 5 October 2011 17:53, Stephan T. Lavavej <stl@exchange.microsoft.com> wrote:
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.
The never-ending reference to the std::string vs const char* performance thing. I'm simply amazed at the simplicity of other developers' performance issues; if this were the main bottleneck I ever had to deal with, I could spend more time drinking coffee =)
Too much coffee is bad for you, so this is a good thing!
-IF- such a case showed up in a profiler, I think it's only fair to let the developer optimize his scenario with some custom code, rather than to open up this can of worms for everyone else.
In the "meow" example above, forcing the construction of a std::string from the string literal implies a free store allocation which, in turn, generally implies a global lock. That means it can be a drag on MT performance. A library function shouldn't impose that if preventing or avoiding it is practicable. _____ Rob Stewart robert.stewart@sig.com Software Engineer using std::disclaimer; Dev Tools & Components Susquehanna International Group, LLP http://www.sig.com ________________________________ IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

on Thu Oct 06 2011, "Stewart, Robert" <Robert.Stewart-AT-sig.com> wrote:
Christian Holmquist wrote:
On 5 October 2011 17:53, Stephan T. Lavavej
<stl@exchange.microsoft.com> wrote:
#1 is better. It follows the STL's conventions (e.g. look at std::find()) and permits heterogeneous comparisons.
Consider vector<string> and val being "meow". string is directly comparable to const char *, without any conversions or temporaries.
The never-ending reference to the std::string vs const char* performance thing. I'm simply amazed at the simplicity of other developers' performance issues; if this were the main bottleneck I ever had to deal with, I could spend more time drinking coffee =)
Too much coffee is bad for you, so this is a good thing!
-IF- such a case showed up in a profiler, I think it's only fair to let the developer optimize his scenario with some custom code, rather than to open up this can of worms for everyone else.
In the "meow" example above, forcing the construction of a std::string from the string literal implies a free store allocation which, in turn, generally implies a global lock. That means it can be a drag on MT performance. A library function shouldn't impose that if preventing or avoiding it is practicable.
Avoiding it is easy for the user: use the version that takes a predicate. I don't think you can evaluate these choices just by looking at the implementations. Before anyone else votes option 1, I'd like to see somebody write the description of the algorithm, including the concept requirements. With option 2, we know that == has to be an equivalence relation. The semantics in option 1 are a lot fuzzier. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
I don't think you can evaluate these choices just by looking at the implementations. Before anyone else votes option 1, I'd like to see
Well, I'm not so sure I like option 1 anymore... if option 1 was an oversight in STL then there is no sense in perpetuating the specification error in Boost.Algorithm (perhaps Boost.Algorithm should then go with option 2 and explain the rationale of why it's different than what STL does for example with std::find).
somebody write the description of the algorithm, including the concept requirements. With option 2, we know that == has to be an equivalence relation. The semantics in option 1 are a lot fuzzier.
Good point! I'll try to start :)

template< typename Iter, typename Val >
    requires InputIterator<Iter>,
             EqualityComparable<iterator_traits<Iter>::value_type, Val>
bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1a

template< typename Iter >
    requires InputIterator<Iter>,
             EqualityComparable<iterator_traits<Iter>::value_type>
bool none_of_equal ( Iter first, Iter last, iterator_traits<Iter>::value_type const& value ) ; // option 2

Essentially, I'd think option 1 requires an operator== that satisfies has_equal_to<iterator_traits<Iter>::value_type, Val>, while option 2 requires an operator== that satisfies has_equal_to<iterator_traits<Iter>::value_type>. The alternative would be for option 1 to require instead:

template< typename Iter, typename Val >
    requires InputIterator<Iter>,
             EqualityComparable<iterator_traits<Iter>::value_type>,
             Convertible<Val, iterator_traits<Iter>::value_type>
bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1b

What do you think?

Thanks,
--Lorenzo

On Oct 6, 2011, at 11:05 AM, lcaminiti wrote:
Dave Abrahams wrote:
somebody write the description of the algorithm, including the concept requirements. With option 2, we know that == has to be an equivalence relation. The semantics in option 1 are a lot fuzzier.
Good point! I'll try to start :)
template< typename Iter, typename Val > requires InputIterator<Iter>, EqualityComparable<iterator_traits<Iter>::value_type, Val> bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1a
That seems right to me.
template< typename Iter > requires InputIterator<Iter>, EqualityComparable<iterator_traits<Iter>::value_type> bool none_of_equal ( Iter first, Iter last, iterator_traits<Iter>::value_type const& value ) ; // option 2
I think you're missing the requirement that the value passed in be convertible to "iterator_traits<Iter>::value_type", since that is what will happen at the call site. Given:

std::vector<int> v ;
none_of_equal ( v.begin (), v.end (), 3.2 );

you need to capture the fact that the floating point number 3.2 has to be convertible to 'int'. [ But maybe that's "external" to the function definition ]
Essentially, I'd think option 1 requires operator== that satisfies has_equal_to<iterator_traits<Iter>::value_type, Val>
Yes.
while option 2 requires operator== that satisfies has_equal_to<iterator_traits<Iter>::value_type>.
Ok - but that's not sufficient. -- Marshall Marshall Clow Idio Software <mailto:mclow.lists@gmail.com> A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait). -- Yu Suzuki

on Thu Oct 06 2011, Marshall Clow <mclow.lists-AT-gmail.com> wrote:
On Oct 6, 2011, at 11:05 AM, lcaminiti wrote:
Dave Abrahams wrote:
somebody write the description of the algorithm, including the concept requirements. With option 2, we know that == has to be an equivalence relation. The semantics in option 1 are a lot fuzzier.
^^^^^^^^^
Good point! I'll try to start :)
template< typename Iter, typename Val > requires InputIterator<Iter>, EqualityComparable<iterator_traits<Iter>::value_type, Val> bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1a
That seems right to me.
It seems right to you because everybody overlooks the semantic part of concept requirements, which he left out. What does the algorithm *do*? What are the semantic constraints imposed by EqualityComparable<T,U> that make the algorithm do *that*? When T==U then you say EqualityComparable *is an equivalence relation*, and you say the algorithm "Returns true iff no member of [ first, last ) is equal to val." You don't have to say anything about which one goes on the left or right side of the ==. You can understand the semantics of that algorithm in terms that mean something more than a pile of syntax. I don't believe any description of the algorithm where T!=U can be as clear or as meaningful. Algorithms that use operators should be specified in terms of the meanings of those operators rather than the expressions used to implement the algorithms. We have the versions taking functions for the other cases. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

on Thu Oct 06 2011, Marshall Clow <mclow.lists-AT-gmail.com> wrote:
template< typename Iter > requires InputIterator<Iter>, EqualityComparable<iterator_traits<Iter>::value_type> bool none_of_equal ( Iter first, Iter last, iterator_traits<Iter>::value_type const& value ) ; // option 2
I think you're missing the requirement is that the value passed in be convertible to "iterator_traits<Iter>::value_type", since that is what will happen at the call site.
No! <sorry, I'm getting excited> If I write a function that takes an int, do I have to specify that you can pass "anything convertible to an int?" No! The same applies when you write a function that takes a reference to the iterator's value_type. The function signature clearly says what is happening. The fact that a conversion can happen is not the algorithm author's problem when written this way. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
Dave Abrahams wrote:
I don't think you can evaluate these choices just by looking at the implementations. Before anyone else votes option 1, I'd like to see
Well, I'm not so sure I like option 1 anymore... if option 1 was an oversight in STL then there is no sense in perpetuating the specification error in Boost.Algorithm (perhaps Boost.Algorithm should then go with option 2 and explain the rationale of why it's different than what STL does for example with std::find).
somebody write the description of the algorithm, including the concept requirements. With option 2, we know that == has to be an equivalence relation. The semantics in option 1 are a lot fuzzier.
Good point! I'll try to start :)
template< typename Iter, typename Val > requires InputIterator<Iter>, EqualityComparable<iterator_traits<Iter>::value_type, Val> bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1a
What you failed to do here was to describe the semantic requirements on EqualityComparable<A,B>, where A != B. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
Dave Abrahams wrote:
I don't think you can evaluate these choices just by looking at the implementations. Before anyone else votes option 1, I'd like to see
Well, I'm not so sure I like option 1 anymore... if option 1 was an oversight in STL then there is no sense in perpetuating the specification error in Boost.Algorithm (perhaps Boost.Algorithm should then go with option 2 and explain the rationale of why it's different than what STL does for example with std::find).
somebody write the description of the algorithm, including the concept requirements. With option 2, we know that == has to be an equivalence relation. The semantics in option 1 are a lot fuzzier.
Good point! I'll try to start :)
template< typename Iter, typename Val > requires InputIterator<Iter>, EqualityComparable<iterator_traits<Iter>::value_type, Val> bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1a
What you failed to do here was to describe the semantic requirements on EqualityComparable<A,B>, where A != B.
I see; how about this: the semantics of EqualityComparable<A, B> with A != B are up for discussion :)

Here's a (funny) way I could satisfy the EqualityComparable<A, B> requirement:

#include <vector>
#include <algorithm>

struct orange { int weight; };
struct apple { int weight; };

bool operator== ( orange const& l, apple const& r )
    { return l.weight == r.weight; } // comparing apples with oranges?!

int main ( )
{
    std::vector<orange> v(2);
    std::find(v.begin(), v.end(), apple());
    return 0;
}

No type conversion: EqualityComparable<A, B> is some sort of equivalence that can be defined among the different types without converting them into one another... isn't that confusing :)

I am starting to find option 1 confusing (like the example above). Maybe it is best to go with option 2 for the _equal functions, requiring the types to be the same, because that's the commonly understood semantics of ==. For comparing apples with oranges (and other strange things), we can use the predicate version of the algorithms.

--Lorenzo

bool operator== ( orange const& l, apple const& r ) { return l.weight == r.weight; } // comparing apples with oranges?!
The expression "like comparing apples and oranges" is not usually meant to inspire confidence in the act of comparison. Just because it's legal doesn't make it good. A definition of equality should support equational reasoning. If two values are equal, then you should be able to substitute one for the other and produce an equivalent program. I seriously doubt that substituting an apple in place of an orange would produce an equivalent program; it will probably result in type errors. Objects of different types cannot be equal (with some exceptions). Even if apples and oranges were structurally identical, your operator does not define equality. What about color? Could I substitute an orange of equal weight in the following program to produce a correct program? apple x = pick_fruit(tree); assert(x.color != orange); I suspect not. Your apples and may equivalent weights, but they are not equal. Equivalence is more general than equality and is not sufficient to guarantee substitutability. I suppose it doesn't really matter if you're just trying to make your program look neat, but if you have to formally verify your system, you'd better nail down the semantics. As far as I know there are no systems of logic that have, in their basis, equality defined for objects of different types. The correct way to write the comparison is: weight(a) == weight(o) Assuming that weights are numeric, I'm reasonably certain that there's an operator that conforms to the ideal. Even if the results of these operations are typed quantities (1kg vs 1/4lb), there's a principle that allows correct and semantically meaningful comparison: both can be converted to a common type and compared. Comparing apples and oranges remains as meaningless an operation as the idiom suggests.

Andrew Sutton-3 wrote:
bool operator== ( orange const& l, apple const& r ) Â Â { return l.weight == r.weight; } // comparing apples with oranges?!
The expression "like comparing apples and oranges" is not usually meant to inspire confidence in the act of comparison. Just because it's legal doesn't make it good.
A definition of equality should support equational reasoning. If two values are equal, then you should be able to substitute one for the other and produce an equivalent program. I seriously doubt that substituting an apple in place of an orange would produce an equivalent program; it will probably result in type errors. Objects of different types cannot be equal (with some exceptions).
Yes, my example was hinting that even if we can syntactically define orange == apple, maybe we shouldn't, because its semantics will be obscure. Here's an example (that compiles) to illustrate both option 1 and option 2 of the interface:

#include <boost/local/function.hpp>
#include <vector>
#include <algorithm>
#include <iostream>
#include <iterator>

struct orange { int weight; int c_vitamin; };
struct apple { int weight; };

bool operator== ( orange const& l, apple const& r ) {
    return l.weight == r.weight; // confusing semantic??
}

bool operator== ( orange const& l, orange const& r ) {
    return l.weight == r.weight && l.c_vitamin == r.c_vitamin;
}

template< typename Iter, typename Value >
// requires InputIterator<Iter>,
//          EqualityComparable<iterator_traits<Iter>::value_type, Value>
Iter find_opt1( Iter first, Iter last, Value const& val ) {
    return std::find(first, last, val);
}

template< typename Iter >
// requires InputIterator<Iter>,
//          EqualityComparable<iterator_traits<Iter>::value_type>
Iter find_opt2( Iter first, Iter last,
        typename std::iterator_traits<Iter>::value_type const& val ) {
    return std::find(first, last, val);
}

int main ( ) {
    std::vector<orange> v(2);
    apple a;
    orange o;

    find_opt1(v.begin(), v.end(), a); // OK, but does it make sense?
    find_opt2(v.begin(), v.end(), o); // OK and clear, we're comparing oranges with oranges
    // find_opt2(v.begin(), v.end(), apple()); // correctly, a compiler error

    // can still compare just weights with the predicate version:
    bool BOOST_LOCAL_FUNCTION_PARAMS( orange const& l, const bind& a ) {
        return l.weight == a.weight;
    } BOOST_LOCAL_FUNCTION_NAME(same_weight)
    std::find_if(v.begin(), v.end(), same_weight); // clear, we're comparing just weights

    return 0;
}

Again, I'm thinking that the semantics of option 2 are more natural (because it is confusing to be able to compare apples with oranges), but IMO Boost.Algorithm should clearly explain why it is deviating from the STL if it adopts option 2 instead of option 1.

P.S. Are the < > still messy?

Thanks,
--Lorenzo

on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
I see, how about the semantic of EqualityComparable<A, B> with A != B is up for discussion :)
Lorenzo, is there anything you can do to get your mail client to put in the usual <, > characters instead of these HTML entities? It's making a mess of things. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
I don't think you can evaluate these choices just by looking at the implementations. Before anyone else votes option 1, I'd like to see somebody write the description of the algorithm, including the concept requirements.
Well we can start by copying what the experts wrote for std::find, can't we? For the non-predicate version it just says "Returns: The first iterator i in the range [first,last) for which *i==value holds" (n3242). none_of_equal could be written in the same style. But you seem to want something different than that; I guess that you want to add a requirement on the type of the value, right? If so, why do you want that even though it wasn't deemed necessary for the standard? But in any case, it's not hard to write; just require that T is a type for which *i==value is valid. What am I missing? Regards, Phil.

on Thu Oct 06 2011, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:
Dave Abrahams wrote:
I don't think you can evaluate these choices just by looking at the implementations. Before anyone else votes option 1, I'd like to see somebody write the description of the algorithm, including the concept requirements.
Well we can start by copying what the experts wrote for std::find, can't we?
Who are "the experts?" The specification that we have in the standard library is not current best practice when it comes to Generic Programming. I'm not sure who crafted the language, but it's not what we should do today.
For the non-predicate version it just says "Returns: The first iterator i in the range [first,last) for which *i==value holds" (n3242).
Exactly, a pile of syntax.
none_of_equal could be written in the same style. But you seem to want something different than that; I guess that you want to add a requirement on the type of the value, right?
No, I want to avoid a requirement by encoding it into the function signature.
If so, why do you want that even though it wasn't deemed necessary for the standard?
See above.
But in any case, it's not hard to write; just require that T is a type for which *i==value is valid.
What am I missing?
You're missing that == should *mean* something. You're not even requiring it to be symmetric. Why is it right to test *i==value instead of value==*i?

on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
Dave Abrahams wrote:
on Thu Oct 06 2011, lcaminiti <lorcaminiti-AT-gmail.com> wrote:
Dave Abrahams wrote:
I don't think you can evaluate these choices just by looking at the implementations. Before anyone else votes option 1, I'd like to see
Well, I'm not so sure I like option 1 anymore... if option 1 was an oversight in STL then there is no sense in perpetuating the specification error in Boost.Algorithm (perhaps Boost.Algorithm should then go with option 2 and explain the rationale of why it's different than what STL does for example with std::find).
somebody write the description of the algorithm, including the concept requirements. With option 2, we know that == has to be an equivalence relation. The semantics in option 1 are a lot fuzzier.
Good point! I'll try to start :)
template< typename Iter, typename Val > requires InputIterator<Iter>, EqualityComparable<iterator_traits<Iter>::value_type, Val> bool none_of_equal ( Iter first, Iter last, Val const& val ) ; // option 1a
What you failed to do here was to describe the semantic requirements on EqualityComparable<A,B>, where A != B.
I see, how about the semantic of EqualityComparable<A, B> with A != B is up for discussion :)
Here's a (funny) way I could satisfy the EqualityComparable<A, B> requirement:
#include <vector> #include <algorithm>
struct orange { int weight; };
struct apple { int weight; };
bool operator== ( orange const& l, apple const& r ) { return l.weight == r.weight; } // comparing apples with oranges?!
int main ( ) { std::vector<orange> v(2); std::find(v.begin(), v.end(), apple()); return 0; }
No type conversion, EqualityComparable<A, B> is some sort of equivalence that can be defined among the different types without converting them into one another... isn't that confusing :)
I am starting to find option 1 confusing... (like the example above). Maybe it's best to go with option 2 for the _equal functions, requiring the types to be the same, because that's the commonly understood semantics of ==. For comparing apples with oranges (and other strange things), we can use the predicate version of the algorithms.
--Lorenzo
-- View this message in context: http://boost.2283326.n4.nabble.com/Boost-Algorithm-design-question-tp3876424... Sent from the Boost - Dev mailing list archive at Nabble.com.
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
-- Dave Abrahams BoostPro Computing http://www.boostpro.com

[Phil Endecott]
But in any case, it's not hard to write; just require that T is a type for which *i==value is valid.
What am I missing?
[Dave Abrahams]
You're missing that == should *mean* something.
Why should the algorithm assume any more meaning than it has to? For example, std::equal() does not require symmetry. This is just fine for people comparing a range of T to a range of T, where T == T is symmetric, and makes people comparing a range of X to a range of Y, where X == Y is provided and Y == X has not even been implemented (and X can't be converted to/from Y), super happy.
You're not even requiring it to be symmetric. Why is it right to test *i==value instead of value==*i?
The algorithm takes (first, last, value), putting the range on the left and the value on the right. STL
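A minimal sketch of that asymmetry, using hypothetical types X and Y where only X == Y is defined, and assuming a conforming std::find that evaluates *i == value with the element on the left:

#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical types: only X == Y is provided; Y == X is deliberately absent.
struct Y { int id; };
struct X { int id; };

bool operator==(X const& x, Y const& y) { return x.id == y.id; }

int main()
{
    std::vector<X> xs;
    X x1 = { 1 }; X x2 = { 2 }; X x3 = { 3 };
    xs.push_back(x1); xs.push_back(x2); xs.push_back(x3);

    Y needle = { 2 };

    // Option-1-style call: the value's type differs from the element type,
    // and the algorithm only ever evaluates *i == value (element on the left).
    std::vector<X>::iterator it = std::find(xs.begin(), xs.end(), needle);
    assert(it != xs.end() && it->id == 2);

    // Writing needle == *it would not compile: Y == X was never defined.
}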

on Thu Oct 06 2011, "Stephan T. Lavavej" <stl-AT-exchange.microsoft.com> wrote:
[Phil Endecott]
But in any case, it's not hard to write; just require that T is a type for which *i==value is valid.
What am I missing?
[Dave Abrahams]
You're missing that == should *mean* something.
Why should the algorithm assume any more meaning than it has to?
So that it can guarantee some meaning in the result.
For example, std::equal() does not require symmetry.
The standard specifications should not necessarily be used as a model. They're kind of a mess, and inconsistent.
This is just fine for people comparing a range of T to a range of T, where T == T is symmetric, and makes people comparing a range of X to a range of Y, where X == Y is provided and Y == X has not even been implemented (and X can't be converted to/from Y), super happy.
Yeah, but the specification for std::equal doesn't say that it returns true iff the sequences are equal. And it should. Right now you have to infer that from a bunch of C++ expressions (bits of the implementation which leak out into the specification) and your knowledge about the types involved. It's not a huge leap in this case, but it hurts the ability to think about algorithms at the abstract level, and it's unsustainable as the algorithms grow more complex. Try doing that with is_permutation.

Now if you look at the specification for std::search, it says "finds a subsequence of equal values in a sequence." That's awesome! But it can't deliver that unless the value types' "==" operator actually means equality, and that requirement is nowhere to be found. So you might say that this specification imposes the /implicit/ requirement that "==" means equality.

And if you look at "includes" in C++11 you'll see

  1. true if [first2, last2) is empty or if every element in the range [first2, last2) is contained in the range [first1, last1). Returns false otherwise.
  2. At most 2 * ((last1 - first1) + (last2 - first2)) - 1 comparisons.

and that's it. This I like, but it only works if there's an implicit assumption that == means true equality.
You're not even requiring it to be symmetric. Why is it right to test *i==value instead of value==*i?
The algorithm takes (first, last, value), putting the range on the left and the value on the right.
OK, I guess that's one kind of logic. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

[STL]
Why should the algorithm assume any more meaning than it has to?
[Dave Abrahams]
So that it can guarantee some meaning in the result.
That's necessary for something like std::sort(), which needs op<() to have "reasonable" properties. Something like std::lower_bound (and none_of_equal) can and should have extremely weak requirements. [STL]
For example, std::equal() does not require symmetry.
[Dave Abrahams]
The standard specifications should not necessarily be used as a model. They're kind of a mess, and inconsistent.
Sure - but in this case, std::equal()'s behavior is reasonable.
Yeah, but the specification for std::equal doesn't say that it returns true iff the sequences are equal. And it should.
You're trying to impose higher-level meaning on the algorithm. Indeed, that's what it's usually used for, but algorithms are more powerful when phrased with weaker requirements. Weakness is strength!
Try doing that with is_permutation.
The phrasing for that would be tricky but the intent would be clear. Remove "Requires:: ForwardIterator1 and ForwardIterator2 shall have the same value type. The comparison function shall be an equivalence relation.", then say that is_permutation(ForwardIterator1 first1, ForwardIterator1 last1, ForwardIterator2 first2) returns true iff a bijection exists from elem1 in [first1, last1) to elem2 in [first2, first2 + (last1 - first1)) with elem1 == elem2. Similarly for pred(elem1, elem2). In fact, the algorithm is misnamed - it should be is_bijection(). The current rules forbid asking whether a vector<const char *> is a "permutation" of a vector<string>. But the actual algorithm is totally fine with that.
Now if you look at the specification for std::search, it says "finds a subsequence of equal values in a sequence." That's awesome!
It would be better as an informative note - the requirements in [alg.search]/2 are already perfect. (In particular, they say X == Y and do not require Y == X to compile.) STL
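To illustrate the vector<const char *> versus vector<string> case, here is a brute-force sketch (a hypothetical quadratic helper, not the standard algorithm) of the "same multiset under the heterogeneous ==" question:

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Brute-force sketch, quadratic and for illustration only: is a
// vector<const char*> a rearrangement of a vector<std::string> under the
// heterogeneous string == const char* comparison?
bool same_multiset(std::vector<std::string> const& a,
                   std::vector<const char*> const& b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i != a.size(); ++i) {
        std::ptrdiff_t in_a = std::count(a.begin(), a.end(), a[i]);
        std::ptrdiff_t in_b = 0;
        for (std::size_t j = 0; j != b.size(); ++j)
            if (a[i] == b[j])      // string == const char*, no temporaries needed
                ++in_b;
        if (in_a != in_b) return false;
    }
    return true;
}

int main()
{
    std::vector<std::string> a;
    a.push_back("cat"); a.push_back("dog"); a.push_back("cat");
    std::vector<const char*> b;
    b.push_back("dog"); b.push_back("cat"); b.push_back("cat");
    assert(same_multiset(a, b));
}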

on Thu Oct 06 2011, "Stephan T. Lavavej" <stl-AT-exchange.microsoft.com> wrote:
Yeah, but the specification for std::equal doesn't say that it returns true iff the sequences are equal. And it should.
You're trying to impose higher-level meaning on the algorithm.
Exactly... no, wait... well, if you believe algorithms are abstractions then it already had higher-level meaning. An algorithm is not its implementation in C++.
Indeed, that's what it's usually used for, but algorithms are more powerful when phrased with weaker requirements. Weakness is strength!
Try doing that with is_permutation.
The phrasing for that would be tricky but the intent would be clear. Remove "Requires:: ForwardIterator1 and ForwardIterator2 shall have the same value type. The comparison function shall be an equivalence relation.", then say that is_permutation(ForwardIterator1 first1, ForwardIterator1 last1, ForwardIterator2 first2) returns true iff a bijection exists from elem1 in [first1, last1) to elem2 in [first2, first2 + (last1 - first1)) with elem1 == elem2.
Exactly, the phrasing gets worse. Your phrasing isn't even quite right, but it's close. Actually, I don't know how to phrase it. operator== isn't a bijection. You have to say something like for each a in [first1,last1) there exists exactly one b in [first2,last2) such that a == b. But then you don't get to say "bijection" :-)
[first2, first2 + (last1 - first1)) with elem1 == elem2
Similarly for pred(elem1, elem2). In fact, the algorithm is misnamed - it should be is_bijection().
Shouldn't is_bijection be a predicate on a function? I think "bijection exists" would be closer, but that's not even quite right. The fact that you can't come up with a good name for this broader function should tell you something.
The current rules forbid asking whether a vector<const char *> is a "permutation" of a vector<string>. But the actual algorithm is totally fine with that.
Yeah, and that was the basis on which I expanded lower_bound et al.
Now if you look at the specification for std::search, it says "finds a subsequence of equal values in a sequence." That's awesome!
It would be better as an informative note - the requirements in [alg.search]/2 are already perfect. (In particular, they say X == Y and do not require Y == X to compile.)
-- Dave Abrahams BoostPro Computing http://www.boostpro.com

on Fri Oct 07 2011, "Stephan T. Lavavej" <stl-AT-exchange.microsoft.com> wrote:
[Dave Abrahams]
The fact that you can't come up with a good name for this broader function should tell you something.
Yes - that I got out of math and went into CS for a reason. :->
(And then I escaped theoretical CS and became a programmer...)
I think maybe it should tell you that it doesn't represent an abstraction you understand. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
For the non-predicate version it just says "Returns: The first iterator i in the range [first,last) for which *i==value holds" (n3242).
Exactly, a pile of syntax.
What specifically is wrong with it?
none_of_equal could be written in the same style. But you seem to want something different than that; I guess that you want to add a requirement on the type of the value, right?
No, I want to avoid a requirement by encoding it into the function signature.
How would you do that? The function signature can't encode the requirement that *i == value should work, or...
You're missing that == should *mean* something.
... mean something, or...
You're not even requiring it to be symmetric.
... that it's equivalent to value == *i. You still need to add these requirements. Sure, you can call them "value_type shall be EqualityComparable" and even then, you'll have to spell "*i == value" when describing the semantics, either directly or by saying something like "finds the first *i that equals value", which, if we're being really pedantic, doesn't really specify anything, because "x equals y" is not defined anywhere. Bottom line, you'll add a conversion and a bunch of unnecessary requirements and end up with the exact same specification as before, namely, returns the first i for which *i == value.

on Thu Oct 06 2011, "Peter Dimov" <pdimov-AT-pdimov.com> wrote:
Dave Abrahams wrote:
none_of_equal could be written in the same style. But you seem to want something different than that; I guess that you want to add a requirement on the type of the value, right?
No, I want to avoid a requirement by encoding it into the function signature.
How would you do that? The function signature can't encode the requirement that *i == value should work, or...
It encodes the requirement that the two things we're comparing have the same type.
You're missing that == should *mean* something.
... mean something, or...
You're not even requiring it to be symmetric.
... that it's equivalent to value == *i. You still need to add these requirements. Sure, you can call them "value_type shall be EqualityComparable" and even then, you'll have to spell "*i == value" when describing the semantics, either directly or by saying something like "finds the first *i that equals value",
Yes!
which, if we're being really pedantic, doesn't really specify anything, because "x equals y" is not defined anywhere.
Only if we're being standards-pedantic. If we're being mathematical and using commonly accepted definitions of terms, it is perfectly well-specified.
Bottom line, you'll add a conversion and a bunch of unnecessary requirements and end up with the exact same specification as before, namely, returns the first i for which *i == value.
I claim that "finds the first element in [first, last) that equals value" is a better way to say it. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
Bottom line, you'll add a conversion and a bunch of unnecessary requirements and end up with the exact same specification as before, namely, returns the first i for which *i == value.
I claim that "finds the first element in [first, last) that equals value" is a better way to say it.
I know, and I question this claim. Why is it better, when it says the exact same thing, while needing to impose unnecessary requirements to arrive at it? The current specification (ignoring the conversion issue for now, as it's separate) says: - when T is EqualityComparable, returns the first element that equals value; - otherwise, returns the first element *i for which *i == value. Your preferred replacement says: - when T is EqualityComparable, returns the first element that equals value; - otherwise, the behavior is undefined. Do tell me why it's better.

on Fri Oct 07 2011, "Peter Dimov" <pdimov-AT-pdimov.com> wrote:
Dave Abrahams wrote:
Bottom line, you'll add a conversion and a bunch of unnecessary
requirements and end up with the exact same specification as before, namely, returns the first i for which *i == value.
I claim that "finds the first element in [first, last) that equals value" is a better way to say it.
I know, and I question this claim. Why is it better, when it says the exact same thing, while needing to impose unnecessary requirements to arrive at it?
The current specification (ignoring the conversion issue for now, as it's separate) says:
- when T is EqualityComparable, returns the first element that equals value; - otherwise, returns the first element *i for which *i == value.
Your preferred replacement says:
- when T is EqualityComparable, returns the first element that equals value; - otherwise, the behavior is undefined.
Do tell me why it's better.
Well, if you phrase it that way, it's a little different. The original description only had the "otherwise" clause, right? The "replacement" is still simpler, because you wouldn't actually write the "otherwise ..." clause; you just say it requires EqualityComparable. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
on Fri Oct 07 2011, "Peter Dimov" <pdimov-AT-pdimov.com> wrote: ...
The current specification (ignoring the conversion issue for now, as it's separate) says:
- when T is EqualityComparable, returns the first element that equals value; - otherwise, returns the first element *i for which *i == value.
Your preferred replacement says:
- when T is EqualityComparable, returns the first element that equals value; - otherwise, the behavior is undefined.
Do tell me why it's better.
Well, if you phrase it that way, it's a little different. The original description only had the "otherwise" clause, right?
Yes. The "otherwise" clause implies the preceding clause when T is EqualityComparable. I wrote it that way to highlight the fact that the two specifications only differ in the "otherwise" part.
The "replacement" is still simpler, because you wouldn't actually write the "otherwise ..." clause; you just say it requires EqualityComparable.
It may be simpler, but why is it better? People who operate on properly EqualityComparable types see no difference between the two, whereas people whose op== isn't, for some unfathomable reason, an equivalence relation still get the expected result from the first and undefined behavior from the second.

on Fri Oct 07 2011, "Peter Dimov" <pdimov-AT-pdimov.com> wrote:
Dave Abrahams wrote:
on Fri Oct 07 2011, "Peter Dimov" <pdimov-AT-pdimov.com> wrote: ...
The current specification (ignoring the conversion issue for now, as
it's separate) says:
- when T is EqualityComparable, returns the first element that equals value; - otherwise, returns the first element *i for which *i == value.
Your preferred replacement says:
- when T is EqualityComparable, returns the first element that equals value; - otherwise, the behavior is undefined.
Do tell me why it's better.
Well, if you phrase it that way, it's a little different. The original description only had the "otherwise" clause, right?
Yes. The "otherwise" clause implies the preceding clause when T is EqualityComparable. I wrote it that way to highlight the fact that the two specifications only differ in the "otherwise" part.
Yes, they differ in *consequence* only in the "otherwise" part. But you would never write the first one that way since the first part is logically redundant and thus needlessly complicating, and you would never write the second one that way for reasons already stated. It's a question of emphasis. The first one emphasizes implementation details and the second one emphasizes the abstraction represented by the algorithm. [All that said, I see your point; I'm not rigid about this... but, do you see mine?]
The "replacement" is still simpler, because you wouldn't actually write the "otherwise ..." clause; you just say it requires EqualityComparable.
It may be simpler, but why is it better?
Err,... because it's simpler? ;-) If you take this to an extreme, you end up exposing the entire implementation of the algorithm *as* its specification, and telling the user, "if you want to know what this does on a set of models of the usual concepts, figure it out yourself."
People who operate on properly EqualityComparable types see no difference between the two, whereas people whose op== isn't, for some unfathomable reason, an equivalence relation still get the expected result from the first and undefined behavior from the second.
I agree that the use cases being described are interesting, intuitive, and useful, and I have a hard time with the idea that we shouldn't support them. Maybe it makes sense to start writing things in the way you have above, so that the pure concepts are redundantly handled in the specification. But I'm still very uneasy about any specification that looks like it's exposing the implementation details. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

on Fri Oct 07 2011, "Peter Dimov" <pdimov-AT-pdimov.com> wrote:
Dave Abrahams wrote:
Bottom line, you'll add a conversion and a bunch of unnecessary
requirements and end up with the exact same specification as before, namely, returns the first i for which *i == value.
I claim that "finds the first element in [first, last) that equals value" is a better way to say it.
I know, and I question this claim. Why is it better, when it says the exact same thing, while needing to impose unnecessary requirements to arrive at it?
The current specification (ignoring the conversion issue for now, as it's separate) says:
- when T is EqualityComparable, returns the first element that equals value; - otherwise, returns the first element *i for which *i == value.
Your preferred replacement says:
- when T is EqualityComparable, returns the first element that equals value; - otherwise, the behavior is undefined.
Do tell me why it's better.
Here's another reason why the first formulation might not be such a hot idea: it rules out some obvious implementations that really ought to be OK. For example,

template <class InputIterator, class T>
InputIterator find(InputIterator i, InputIterator j, T value)
{
    while (i != j && !(*i == value))
        ++i;
    return i;
}

(this is essentially equivalent to the implementation of find in the original STL). -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
Here's another reason why the first formulation might not be such a hot idea: it rules out some obvious implementations that really ought to be OK. For example,
template <class InputIterator, class T> InputIterator find(InputIterator i, InputIterator j, T value) { while (i != j && !(*i == value)) ++i; return i; }
Yes, but this formulation doesn't need *i and value to be of the same type or operator== to be an equivalence relation. It just needs operator== to return something bool-ish. I've no problem with this requirement.

on Sat Oct 29 2011, "Peter Dimov" <pdimov-AT-pdimov.com> wrote:
Dave Abrahams wrote:
Here's another reason why the first formulation might not be such a hot idea: it rules out some obvious implementations that really ought to be OK. For example,
template <class InputIterator, class T> InputIterator find(InputIterator i, InputIterator j, T value) { while (i != j && !(*i == value)) ++i; return i; }
Yes, but this formulation doesn't need *i and value to be of the same type or operator== to be an equivalence relation. It just needs operator== to return something bool-ish.
Actually it needs a bit more than that: applying ! to your bool-ish thing needs to be unambiguously &&-able with the result of i != j, which might not itself be bool. These are the kinds of problems that crop up when you start checking constraints on valid expressions. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
Actually it needs a bit more than that: applying ! to your bool-ish thing needs to be unambiguously &&-able with the result of i != j, which might not itself be bool.
That's part of what bool-ish means. For two bool-ish values x and y, x&&y should be bool-ish and (x&&y?true:false) == (x?true:false)&&(y?true:false) and similarly for ! and ||. (Note that x and y need not be of the same type.) I don't mind requiring a bool return for op== though, to avoid these gymnastics. :-)

on Sun Oct 30 2011, "Peter Dimov" <pdimov-AT-pdimov.com> wrote:
Dave Abrahams wrote:
Actually it needs a bit more than that: applying ! to your bool-ish thing needs to be unambiguously &&-able with the result of i != j, which might not itself be bool.
That's part of what bool-ish means. For two bool-ish values x and y, x&&y should be bool-ish and
(x&&y?true:false) == (x?true:false)&&(y?true:false)
I think you're missing the point, which is: When the specification for an algorithm exposes the actual expressions used in a given implementation of an algorithm, it tends to lock down the implementation in ways that don't necessarily make sense (and can prevent optimizations). Tiny details that one usually doesn't consider, like whether part of the expression is an lvalue or an rvalue, const or non-const, become encoded into the specification of the algorithm. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
I think you're missing the point, which is:
When the specification for an algorithm exposes the actual expressions used in a given implementation of an algorithm, it tends to lock down the implementation in ways that don't necessarily make sense (and can prevent optimizations). Tiny details that one usually doesn't consider, like whether part of the expression is an lvalue or an rvalue, const or non-const, become encoded into the specification of the algorithm.
None of this has anything to do with the fact that you want to require operator== to be an equivalence relation, and it doesn't have to be.

None of this has anything to do with the fact that you want to require operator== to be an equivalence relation, and it doesn't have to be.
Realistically though, when is == not (at least) an equivalence relation? Should we expect people to start overloading == to mean <? Seriously, what's the real harm in more strongly connecting syntax to semantics?

Andrew Sutton wrote:
None of this has anything to do with the fact that you want to require operator== to be an equivalence relation, and it doesn't have to be.
Realistically though, when is == not (at least) an equivalence relation?
Realistically, and in this context, when the two sides are not of the same type.

Peter Dimov wrote:
Andrew Sutton wrote:
None of this has anything to do with the fact that you want to require operator== to be an equivalence relation, and it doesn't have to be.
Realistically though, when is == not (at least) an equivalence relation?
Realistically, and in this context, when the two sides are not of the same type.
In most of the cases that I care about, there is some underlying type like "string" or "integer", on which an equivalence relation exists in some formal sense. Then there are some concrete C++ types like std::basic_string<char,allocator_1> or std::basic_string<char,allocator_2> or int32_t or int64_t. operator== is defined on pairs of these concrete types in some way that approximates to the equivalence relation on the underlying formal type, but with some inevitable flakiness at the edges, such as comparison between two char*s or comparison between integers with different numbers of bits. The difficulty, which is perhaps what Dave is grappling with, is how to specify an algorithm that does the "right thing" if-and-only-if the types do the "right thing", while remaining vague about what the "right thing" means to allow the implementation flexibility. Regards, Phil.

Realistically though, when is == not (at least) an equivalence relation?
Realistically, and in this context, when the two sides are not of the same type.
In most of the cases that I care about, there is some underlying type like "string" or "integer", on which an equivalence relation exists in some formal sense. Then there are some concrete C++ types like std::basic_string<char,allocator_1> or std::basic_string<char,allocator_2> or int32_t or int64_t. operator== is defined on pairs of these concrete types in some way that approximates to the equivalence relation on the underlying formal type, but with some inevitable flakiness at the edges, such as comparison between two char*s or comparison between integers with different numbers of bits.
Sure, but these types are inherently related. It should be reasonable to write the semantics of comparisons on related types in terms of the underlying type (or more abstract type, I guess). I can convince myself that those operations on those types have precise semantics: they define an equivalence relation. We attach meaning to symbols. I doubt that many people on this list would read the expression "a == b" as "a and b are operated on by some function with the name == that has some result". I tend to read it as "a is equal to b". If I read it that way, I happen to know something extra about the operator; it's an equivalence relation. I also happen to know something about the values: an expression involving a will yield the same result as the same expression involving b (with some notable exceptions). This additional knowledge seems like a good thing. I don't see how attaching specific meaning to operations impacts implementation flexibility. It certainly impacts design choices and choices about algorithm usage, but I hope in a positive way.

On 1 November 2011 09:43, Andrew Sutton <asutton.list@gmail.com> wrote:
We attach meaning to symbols. I doubt that many people on this list would read the expression "a == b" as "a and b are operated on by some function with the name == that has some result".
That really depends on the underlying types.
I tend to read it as "a is equal to b".
I don't read: bind(&foo, _1) == 2 as "the object which bind returns is equal to 2"; operator== on lambda-type objects is not an equivalence relation but instead returns "some result" (that happens to be a proxy for possibly evaluating an equivalence relation later). -- Nevin ":-)" Liber <mailto:nevin@eviloverlord.com> (847) 691-1404

We attach meaning to symbols. I doubt that many people on this list would read the expression "a == b" as "a and b are operated on by some function with the name == that has some result".
That really depends on the underlying types.
It should not. Equality should mean equality. I don't see why this should be so distasteful.
bind(&foo, _1) == 2
as "the object which bind returns is equal to 2"; operator== on lambda-type objects is not an equivalence relation but instead returns "some result" (that happens to be a proxy for possibly evaluating an equivalence relation later).
I think if you read this using std::bind, it's actually interpreted as it's written: "the bound unary function is equal to 2". Not only is that meaningless, it should be a compiler error. Refusing to legitimize meaningless or ambiguously interpreted syntax is, I tend to think, a good thing. Obviously, you can hijack the syntax to change the meaning to whatever you want, but that's not generic, and it's not specific to the algorithms being discussed. That's a specific interpretation of syntax that you happen to be using to generate a function.

Andrew Sutton wrote:
It should not. Equality should mean equality. I don't see why this should be so distasteful.
First, unnecessarily requiring both sides to be of the same type causes problems with (for instance) signed char c1 = -1; unsigned char c2 = 0xFF; where c1 != c2, but they compare equal if either is converted to the type of the other. So this requirement leads to much extra complexity with common_type and so on. Second, "test" == std::string( "test" ) and std::string( "test" ) == "test", but "test" may well not compare equal to "test". Third, because of the preceding two points, std::find does not, in practice, require same type or equivalence. This leads people to use it with types of the form struct X { int key_; ... value_; bool operator==( int key ) const { return key_ == key; } }; Now... this is where you legitimately may claim that these people are wrong. But I don't see why we need to prohibit this code. It makes no intuitive sense to me for std::find( first, last, value ) to differ from std::find_if( first, last, _1 == value ).
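A sketch of that usage, with a hypothetical record type standing in for X (the elided "... value_" member is filled in here purely for illustration):

#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type: lookup by key with a heterogeneous record == int.
struct record
{
    int key_;
    std::string value_;
    bool operator==( int key ) const { return key_ == key; }
};

int main()
{
    std::vector<record> v;
    record r1 = { 1, "one" }; v.push_back(r1);
    record r2 = { 2, "two" }; v.push_back(r2);

    // Works because the algorithm only evaluates *i == value:
    std::vector<record>::iterator it = std::find(v.begin(), v.end(), 2);
    assert(it != v.end() && it->value_ == "two");
}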

It should not. Equality should mean equality. I don't see why this should be so distasteful.
First, unnecessarily requiring both sides to be of the same type causes problems with (for instance)
signed char c1 = -1; unsigned char c2 = 0xFF;
where c1 != c2, but they compare equal if either is converted to the type of the other. So this requirement leads to much extra complexity with common_type and so on.
But I'm not requiring both sides to be the same -- only related. Also, this behavior is well defined and preserves the semantics of equality. The common type of c1 and c2 is int -- it doesn't have (and often isn't) one type or the other. int(-1) and int(255) are certainly not equal. Admittedly, you run into overflow issues with large enough integer types, but c'est la vie. There are always implementation limits that confound our best efforts to formulate accurate and meaningful semantics.
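A quick check of the promotion behaviour described above (a sketch only, assuming 8-bit, two's-complement char as on mainstream platforms):

#include <cassert>

int main()
{
    signed char   c1 = -1;
    unsigned char c2 = 0xFF;

    // Usual arithmetic conversions promote both operands to int before ==,
    // i.e. the comparison happens in the common type: int(-1) vs int(255).
    assert(!(c1 == c2));

    // Converting either operand to the other's type makes them "equal":
    assert(static_cast<unsigned char>(c1) == c2);
    assert(c1 == static_cast<signed char>(c2));
}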
"test" == std::string( "test" ) == "test"
but "test" may well not compare equal to "test".
Do you mean to say that the character pointed to by the first "test" may not be the same as the character pointed to by the second? So: "test" == "test" might be false? That could be the case, but that's an issue with the language's resolution of an operator for ==. Comparing pointers is not the same as comparing strings. If the type of "test" was something other than "const char*", say, constant_string, you could expect different behavior.
Third, because of the preceding two points, std::find does not, in practice, require same type or equivalence.
But it does, in practice, require equality comparison, which I am suggesting should have a more precise definition.
This leads people to use it with types of the form
struct X { int key_; ... value_;
bool operator==( int key ) const { return key_ == key; } };
Now... this is where you legitimately may claim that these people are wrong.
Generally yes, but I think I see how to generalize the relationship between X and int that would satisfy the semantics of an equivalence relation. Actually, I think I see how to do this for all types using this comparison. I'll have to think on this.

Andrew Sutton wrote:
But I'm not requiring both sides to be the same -- only related.
I'm not sure how you're doing that. I thought that this was about using the EqualityComparable standard requirement, which does require the two sides of == to be of the same type, and == to be an equivalence relation.

But I'm not requiring both sides to be the same -- only related.
I'm not sure how you're doing that. I thought that this was about using the EqualityComparable standard requirement, which does require the two sides of == to be of the same type, and == to be an equivalence relation.
The standard does not say that about EqualityComparable. 17.6.3.1 [utility.arg.requirements] only says that a == b is convertible to bool and is an equivalence relation.

Andrew Sutton wrote:
But I'm not requiring both sides to be the same -- only related.
I'm not sure how you're doing that. I thought that this was about using the EqualityComparable standard requirement, which does require the two sides of == to be of the same type, and == to be an equivalence relation.
The standard does not say that about EqualityComparable. 17.6.3.1 [utility.arg.requirements] only says that a == b is convertible to bool and is an equivalence relation.
17.6.3.1? It's in 20.1.1 [lib.equalitycomparable] in C++03 and in 20.2.1 [utility.arg.requirements] in N3225. In C++03, the preceding text states: In Table 28, T is a type to be supplied by a C++ program instantiating a template, a, b and c are values of type T. In N3225, the text states that In these tables, T is an object or reference type to be supplied by a C++ program instantiating a template; a, b, and c are values of type (possibly const) T; s and t are modifiable lvalues of type T; u denotes an identifier; rv is an rvalue of type T; and v is an lvalue of type (possibly const) T or an rvalue of type const T. It makes sense, too; the table describes what it means for T to be EqualityComparable. What would be EqualityComparable if a, b and c were arbitrary?

17.6.3.1? It's in 20.1.1 [lib.equalitycomparable] in C++03 and in 20.2.1 [utility.arg.requirements] in N3225. In C++03, the preceding text states:
In Table 28, T is a type to be supplied by a C++ program instantiating a template, a, b and c are values of type T.
Hmm.. the n3290 seems to place them in 17. I see the wording that establishes the types now.
It makes sense, too; the table describes what it means for T to be EqualityComparable. What would be EqualityComparable if a, b and c were arbitrary?
It should not be defined for arbitrary types. I've never argued that it should -- quite the opposite, in fact. What I've said is that the definition generalizes to a well defined subset of types. I look at the problem like this: if you don't assign types, then the question is, for which set of types of a and b can the expression a == b define an equivalence relation? I know that the behavior is correct for char and int or string and const char*. Why? What's special about those types that make them behave as if I was comparing objects of a single type?

On 2 November 2011 08:20, Andrew Sutton <asutton.list@gmail.com> wrote:
I look at the problem like this: if you don't assign types, then the question is, for which set of types of a and b can the expression a == b define an equivalence relation? I know that the behavior is correct for char and int or string and const char*. Why? What's special about those types that make them behave as if I was comparing objects of a single type?
They are special because we give them higher level semantics which define what that equivalence relation is and under what conditions it holds. We tend to be imprecise with (or just plain leave out) those preconditions. Take your example of string and const char*; the following asserts will either sometimes fire or invoke undefined behavior for a given std::string s, const char* p, or string literal l: assert(s == s.c_str()); assert(std::string(p) == p); assert(std::string(l, sizeof(l) - 1) == l); Now, I can easily state the preconditions, but those requirements are no longer on the types themselves but upon the values they represent. -- Nevin ":-)" Liber <mailto:nevin@eviloverlord.com> (847) 691-1404

I look at the problem like this: if you don't assign types, then the question is, for which set of types of a and b can the expression a == b define an equivalence relation? I know that the behavior is correct for char and int or string and const char*. Why? What's special about those types that make them behave as if I was comparing objects of a single type?
They are special because we give them higher level semantics which define what that equivalence relation is and under what conditions it holds.
Yes, and that mapping of semantics can be generalized in a consistent and meaningful way.
We tend to be imprecise with (or just plain leave out) those preconditions. Take your example of string and const char*; the following asserts will either sometimes fire or invoke undefined behavior for a given std::string s, const char* p, or string literal l:
assert(s == s.c_str()); assert(std::string(p) == p); assert(std::string(l, sizeof(l) - 1) == l);
Now, I can easily state the preconditions, but those requirements are no longer on the types themselves but upon the values they represent.
Sure, there are always limitations. Do those limitations or exceptions imply that strings aren't equal to strings, char*'s or string literals? There's a difference between preconditions and the semantics of e.g. an equivalence relation. The specification of the equivalence relation is inherently (sometimes explicitly) defined on well-formed values. Preconditions rule out the values or expressions that are not well-formed.

On Wed, Nov 2, 2011 at 4:48 PM, Nevin Liber <nevin@eviloverlord.com> wrote:
We tend to be imprecise with (or just plain leave out) those preconditions. Take your example of string and const char*; the following asserts will either sometimes fire or invoke undefined behavior for a given std::string s, const char* p, or string literal l:
assert(s == s.c_str()); assert(std::string(p) == p); assert(std::string(l, sizeof(l) - 1) == l);
For what values does this fail? Isn't the operator== with std::string and const char* well defined? Olaf

On 2 November 2011 12:54, Olaf van der Spek <ml@vdspek.org> wrote:
assert(s == s.c_str()); assert(std::string(p) == p); assert(std::string(l, sizeof(l) - 1) == l);
For what values does this fail?
assert(s == s.c_str()); fails when s contains an embedded 0. assert(std::string(p) == p); invokes undefined behavior when p is 0 or p points to something that isn't 0-terminated. assert(std::string(l, sizeof(l) - 1) == l) fails when l contains an embedded 0.
Isn't the operator== with std::string and const char* well defined?
It is well defined given the precondition that p points to something that is 0-terminated. However, that precondition alone is not sufficient to make it an equivalence relation. In these cases the equivalence relation is not a property of the types but of the values they represent, no matter how much handwaving goes on around here about how those cases aren't important. They can be a property of the types involved, but they just aren't in this circumstance, as const char* can mean too many different things. -- Nevin ":-)" Liber <mailto:nevin@eviloverlord.com> (847) 691-1404
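A minimal sketch of the first failing case, the embedded 0:

#include <cassert>
#include <string>

int main()
{
    // A std::string with an embedded 0: two characters, 'a' and '\0'.
    std::string s("a\0", 2);
    assert(s.size() == 2);

    // s.c_str() points at "a\0\0"; comparing a string against a const char*
    // treats the right-hand side as a string of length 1, so they differ.
    assert(!(s == s.c_str()));
}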

In these cases the equivalence relation is not a property of the types but of the values they represent, no matter how much handwaving goes on around here about how those cases aren't important. They can be a property of the types involved, but they just aren't in this circumstance, as const char* can mean too many different things.
From this argument, I might be led to believe that exceptional cases are viable arguments against generality. If there isn't sufficient memory to compute a + b for strings, then + must not be associative? If I can't divide by 0, then division by any number doesn't have meaning? A C-string with an embedded \0 invalidates some equality comparisons, so the rest can't be trusted to compare equal?
They're preconditions. They state where an operation is undefined for a specific set of values. They aren't contradictions.

On Wed, Nov 2, 2011 at 6:20 AM, Andrew Sutton <asutton.list@gmail.com> wrote:
17.6.3.1? It's in 20.1.1 [lib.equalitycomparable] in C++03 and in 20.2.1 [utility.arg.requirements] in N3225. In C++03, the preceding text states:
In Table 28, T is a type to be supplied by a C++ program instantiating a template, a, b and c are values of type T.
Hmm.. the n3290 seems to place them in 17. I see the wording that establishes the types now.
It makes sense, too; the table describes what it means for T to be EqualityComparable. What would be EqualityComparable if a, b and c were arbitrary?
It should not be defined for arbitrary types. I've never argued that it should -- quite the opposite, in fact. What I've said is that the definition generalizes to a well defined subset of types.
I don't understand at all the benefits of defining EqualityComparable formally in this case. What's wrong with providing an "as if" specification of the algorithm, that is, show one example of its implementation and require that alternative implementations are equivalent? As long as that works with separately defined EqualityComparable entities, there is no need to couple the algorithm with that definition. Practically speaking, the possibility of abuse that the other approach attempts to protect against isn't a big deal, as long as people know what works and what doesn't. What C++ feature isn't prone to abuse anyway? Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode

on Mon Oct 31 2011, "Peter Dimov" <pdimov-AT-pdimov.com> wrote:
Dave Abrahams wrote:
I think you're missing the point, which is:
When the specification for an algorithm exposes the actual expressions used in a given implementation of an algorithm, it tends to lock down the implementation in ways that don't necessarily make sense (and can prevent optimizations). Tiny details that one usually doesn't consider, like whether part of the expression is an lvalue or an rvalue, const or non-const, become encoded into the specification of the algorithm.
None of this has anything to do with the fact that you want to require operator== to be an equivalence relation, and it doesn't have to be.
Again, missing my point. I long ago stipulated that this algorithm could be useful even in places where op== is not an equivalence relation, and that it was desirable to support those uses. My instinct to require equivalence was part of a reaction against writing the algorithm's specification in terms of valid expressions. I think such specifications need to be given more scrutiny than they have been, historically. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Dave Abrahams wrote:
on Thu Oct 06 2011, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:
Dave Abrahams wrote:
I'd like to see somebody write the description of the algorithm, including the concept requirements.
Well we can start by copying what the experts wrote for std::find, can't we?
Who are "the experts?" The specification that we have in the standard library is not current best practice when it comes to Generic Programming. I'm not sure who crafted the language, but it's not what we should do today.
Well perhaps we need to step back and decide what we're trying to do here. Are we trying to create stuff that is so "best practice" that it's even better than the not-yet-ratified new version of the language? Are we making that requirement an absolute one that trumps useful functionality? If so, why? It seems unnecessarily puritan to me.
For the non-predicate version it just says "Returns: The first iterator i in the range [first,last) for which *i==value holds" (n3242).
Exactly, a pile of syntax.
Back in the "clamp" discussion, I tried to argue that for trivial operations I would actually prefer to have a "pile of syntax" rather than a free-standing spec. My argument was that for trivial things we actually have the implementation in our minds first and then find ourselves trying to think of some other way of expressing the same thing in order to write down a spec; in the process, we're as likely to make a mistake in the spec as we are to make a mistake in the implementation. Examples: clamp(val,lo,hi) = val<lo ? lo : val>hi : hi : val; vs. clamp(val,lo,hi) returns the middle value when (val,lo,hi) are ordered according to operator< Regards, Phil.

on Thu Oct 06 2011, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:
Dave Abrahams wrote:
on Thu Oct 06 2011, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:
Dave Abrahams wrote:
I'd like to see somebody write the description of the algorithm, including the concept requirements.
Well we can start by copying what the experts wrote for std::find, can't we?
Who are "the experts?" The specification that we have in the standard library is not current best practice when it comes to Generic Programming. I'm not sure who crafted the language, but it's not what we should do today.
Well perhaps we need to step back and decide what we're trying to do here. Are we trying to create stuff that is so "best practice" that it's even better than the not-yet-ratified new version of the language?
I guess you missed the news. C++11 is ratified. But, yeah, why shouldn't we do better if we can?
Are we making that requirement an absolute one that trumps useful functionality? If so, why? It seems unnecessarily puritan to me.
Just to be completely honest here: I've been wondering the same thing... maybe I've been poisoned by Stepanovian purity.
For the non-predicate version it just says "Returns: The first iterator i in the range [first,last) for which *i==value holds" (n3242).
Exactly, a pile of syntax.
Back in the "clamp" discussion, I tried to argue that for trivial operations I would actually prefer to have a "pile of syntax" rather than a free-standing spec.
It's an interesting idea that trivial things are better described in terms of their implementations. *Very* interesting.
My argument was that for trivial things we actually have the implementation in our minds first and then find ourselves trying to think of some other way of expressing the same thing in order to write down a spec; in the process, we're as likely to make a mistake in the spec as we are to make a mistake in the implementation. Examples:
clamp(val,lo,hi) = val<lo ? lo : val>hi ? hi : val;
vs.
clamp(val,lo,hi) returns the middle value when (val,lo,hi) are ordered according to operator<
Wow, that certainly is stark. When I try to describe it better in words, it reads just like your implementation. But I'm not sure what that proves. Swap is a trivial algorithm. Isn't it best described as "exchanges the values of its arguments?" -- Dave Abrahams BoostPro Computing http://www.boostpro.com

[Phil Endecott]
My argument was that for trivial things we actually have the implementation in our minds first and then find ourselves trying to think of some other way of expressing the same thing in order to write down a spec; in the process, we're as likely to make a mistake in the spec as we are to make a mistake in the implementation. Examples: clamp(val,lo,hi) = val<lo ? lo : val>hi ? hi : val; vs. clamp(val,lo,hi) returns the middle value when (val,lo,hi) are ordered according to operator<
[Dave Abrahams]
Wow, that certainly is stark. When I try to describe it better in words, it reads just like your implementation. But I'm not sure what that proves.
Amusingly (to me, at least), the implementation is incorrect. It says val > hi instead of hi < val. STL
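For reference, a sketch of the expression rewritten so that the only comparison used is operator<, in line with that correction (illustrative only, not a proposed interface):

#include <cassert>

// clamp written so that the only comparison used is operator< on T,
// per the remark that the test should be hi < val rather than val > hi.
template <class T>
T const& clamp(T const& val, T const& lo, T const& hi)
{
    return val < lo ? lo : (hi < val ? hi : val);
}

int main()
{
    assert(clamp(5, 1, 10) == 5);
    assert(clamp(-3, 1, 10) == 1);
    assert(clamp(42, 1, 10) == 10);
}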

Dave Abrahams wrote:
on Thu Oct 06 2011, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:
clamp(val,lo,hi) = val<lo ? lo : val>hi ? hi : val;
vs.
clamp(val,lo,hi) returns the middle value when (val,lo,hi) are ordered according to operator<
Wow, that certainly is stark. When I try to describe it better in words, it reads just like your implementation. But I'm not sure what that proves. Swap is a trivial algorithm. Isn't it best described as "exchanges the values of its arguments?"
Swap is a good example. Saying "exchanges the values of its arguments" relies on the semantics of English, which is problematic; what does "exchange" mean? If it's a synonym for "swap" then we're saying "swap: it does what it says it does". But maybe "exchange" means something different; if I go into a shop and say "I bought these two shirts yesterday but they're the wrong size; may I please exchange them?", I do not expect the sales assistant to just 'swap' them and give them back to me (though if someone ever did do that, I'd give them a programming job).

On the other hand, defining swap in terms of its implementation is not ideal either because of the temporary:

tmp = a; a = b; b = tmp;

The need to use a temporary is not inherent in the algorithm; if I'm swapping ints my processor might be able to do that with a single instruction and no (apparent) temporary. And then there are classic nasty tricks like

a -= b; b += a; a = b-a;

So at the very least it is necessary to say "has the same externally-observable effects as ...". Or it could be written as a postcondition:

swap(a,b) postcondition: a == b' && b == a'

(where I'm using ' to mean "the previous value of".) But of course that requires that == is defined. Yet surely it is perfectly acceptable to swap objects for which operator== is not defined.

In the spec, I can see the "exchanges" language used in 20.2.2 and this in 17.6.3.2:

the object referred to by t has the value originally held by u and [vice-versa]

which just leads to the question, what does to "have the value" mean, if not operator== ?

Really there is no good solution for things at this level. My view is that (in C++) we can only usefully specify algorithms that are rather more complex than these things (I suggest "sort" and above). For those more complex things we can "brush under the carpet" the nasty details at the bottom level and yet still provide some useful information to the user. For trivial algorithms, nothing is left.

Regards, Phil.
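A sketch of that postcondition written as a checked wrapper (hypothetical helper, illustration only); note that it needs CopyConstructible and == on T, which is exactly the point at issue:

#include <algorithm>
#include <cassert>
#include <utility>

// Check the postcondition a == b' && b == a' using copies of the previous values.
template <class T>
void checked_swap(T& a, T& b)
{
    T const old_a = a;   // a'
    T const old_b = b;   // b'
    using std::swap;
    swap(a, b);
    assert(a == old_b && b == old_a);
}

int main()
{
    int x = 1, y = 2;
    checked_swap(x, y);
    assert(x == 2 && y == 1);
}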

Den 07-10-2011 18:11, Phil Endecott skrev:
The need to use a temporary is not inherent in the algorithm; if I'm swapping ints my processor might be able to do that with a single instruction and no (apparent) temporary. And then there are classic nasty tricks like
a -= b; b += a; a = b-a;
Or the swap-by-XORing trick. -Thorsten
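For completeness, a sketch of the XOR trick for unsigned integers; like the arithmetic version it silently breaks when both references alias the same object, which is part of why describing swap by any particular expression sequence is risky:

#include <cassert>

// XOR swap for unsigned integers. Without the guard, a ^= a zeroes the value
// when both references alias the same object.
void xor_swap(unsigned& a, unsigned& b)
{
    if (&a == &b) return;  // self-swap would otherwise zero the value
    a ^= b;
    b ^= a;
    a ^= b;
}

int main()
{
    unsigned x = 3, y = 7;
    xor_swap(x, y);
    assert(x == 7 && y == 3);

    xor_swap(x, x);  // harmless thanks to the guard
    assert(x == 7);
}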

on Fri Oct 07 2011, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:
Dave Abrahams wrote:
on Thu Oct 06 2011, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:
clamp(val,lo,hi) = val<lo ? lo : val>hi ? hi : val;
vs.
clamp(val,lo,hi) returns the middle value when (val,lo,hi) are ordered according to operator<
Wow, that certainly is stark. When I try to describe it better in words, it reads just like your implementation. But I'm not sure what that proves. Swap is a trivial algorithm. Isn't it best described as "exchanges the values of its arguments?"
Swap is a good example. Saying "exchanges the values of its arguments" relies on the semantics of English, which is problematic;
Not unless you're prepared to rewrite the whole standard, it's not. The whole standard relies on the semantics of English.
what does "exchange" mean? If it's a synonym for "swap" then we're saying "swap: it does what it says it does".
You are going to penalize my description of the algorithm because it's well-named? C'mon, man. Let's just imagine the name of that algorithm was "befungle."
But maybe "exchange" means something different; if I go into a shop and say "I bought these two shirts yesterday but they're the wrong size; may I please exchange them?", I do not expect the sales assistant to just 'swap' them and give them back to me (though if someone ever did do that, I'd give them a programming job).
But that wouldn't be a reasonable interpretation of my spec, because it would be incomplete. It would leave everything in an unspecified state.
On the other hand, defining swap in terms of its implementation is not ideal either because of the temporary:
tmp = a; a = b; b = tmp;
The need to use a temporary is not inherent in the algorithm; if I'm swapping ints my processor might be able to do that with a single instruction and no (apparent) temporary. And then there are classic nasty tricks like
a -= b; b += a; a = b-a;
So at the very least it is necessary to say "has the same externally-observable effects as ...".
Or it could be written as a postcondition:
swap(a,b) postcondition: a == b' && b == a'
(where I'm using ' to mean "the previous value of".) But of course that requires that == is defined. Yet surely it is perfectly acceptable to swap objects for which operator== is not defined.
Well, no, not surely. Many people (e.g. Stepanov, Lakos,... and I guess me) think that having an operator== is essential to value semantics and that your copy constructor, assignment operator, and swap lack verifiable correctness without it.
In the spec, I can see the "exchanges" language used in 20.2.2 and this in 17.6.3.2:
the object referred to by t has the value originally held by u and [vice-versa]
which just leads to the question, what does to "have the value" mean, if not operator== ?
Exactly. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Stewart, Robert wrote:
Christian Holmquist wrote:
The never ending reference to std::string vs const char* performance thing.
In the "meow" example above, forcing the construction of a std::string from the string literal implies a free store allocation which, in turn, generally implies a global lock. That means it can be a drag on MT performance. A library function shouldn't impose that if preventing or avoiding it is practicable.
Furthermore, constructing the temporary is O(N) in the length of the string, while the comparison will often only need to look at the first element. Regards, Phil.

on Wed Oct 05 2011, Marshall Clow <mclow.lists-AT-gmail.com> wrote:
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val ) { for ( ; first != last; ++first ) if ( val == *first ) return false; return true; }
or
template<typename InputIterator, > bool none_of_equal ( InputIterator first, InputIterator last, iterator_traits<InputIterator>::value_type const &val ) { for ( ; first != last; ++first ) if ( val == *first ) return false; return true; }
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
I prefer the second one.
- It's the one that matches the mental model: I don't know what the first algorithm /means/.
- An exact specification for the first one is more complicated.
- It's more efficient in the usual case when a conversion would be needed.
Of course, the (mythical?) smart compiler would do the conversion once and save the result for reuse.
I think you know the compiler would have to be able to verify that the conversion had no side-effects; not something we're likely to see for some time. -- Dave Abrahams BoostPro Computing http://www.boostpro.com
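One way to see what "doing the conversion once" would amount to is a hypothetical variant that hoists the conversion by hand; note that this changes the meaning, since the loop then compares value_type with value_type rather than performing the heterogeneous comparison of option 1:

#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical sketch only: convert the value once, up front, then compare
// homogeneous values in the loop. This is roughly what a compiler would have
// to prove safe in order to hoist the conversion itself.
template <typename InputIterator, typename V>
bool none_of_equal_hoisted(InputIterator first, InputIterator last, V const& val)
{
    typename std::iterator_traits<InputIterator>::value_type const v(val);
    for (; first != last; ++first)
        if (v == *first)
            return false;
    return true;
}

int main()
{
    std::vector<std::string> dogs(3, "woof");
    assert(none_of_equal_hoisted(dogs.begin(), dogs.end(), "meow")); // one conversion, up front
}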

Marshall Clow-2 wrote:
So, what do people prefer (and why?):
template<typename InputIterator, typename V> bool none_of_equal ( InputIterator first, InputIterator last, V const &val ) { for ( ; first != last; ++first ) if ( val == *first ) return false; return true; }
or
template<typename InputIterator, > bool none_of_equal ( InputIterator first, InputIterator last, iterator_traits<InputIterator>::value_type const &val ) { for ( ; first != last; ++first ) if ( val == *first ) return false; return true; }
In the first case, I think there's (possible) conversion from V to iterator_traits<InputIterator>::value_type each time through the loop. In the second case, there is not.
Couldn't you provide both using SFINAE? Best, Vicente -- View this message in context: http://boost.2283326.n4.nabble.com/Boost-Algorithm-design-question-tp3876424... Sent from the Boost - Dev mailing list archive at Nabble.com.
participants (17)
- Andrew Sutton
- Christian Holmquist
- Dave Abrahams
- Emil Dotchevski
- Eric Niebler
- Jeffrey Lee Hellrung, Jr.
- lcaminiti
- Marshall Clow
- Nathan Ridge
- Nevin Liber
- Olaf van der Spek
- Peter Dimov
- Phil Endecott
- Stephan T. Lavavej
- Stewart, Robert
- Thorsten Ottosen
- Vicente Botet