Re: [boost] [optional] operator<(optional<T>, T) -- is it wrong?

David Stone wrote:
I have been bitten in the past by operator<. I was converting code that used unsigned with a 'magic value' to represent nullness to boost::optional<std::uint32_t>. The magic value it used was std::numeric_limits<std::uint32_t>::max(). A simple find-and-replace of this magic constant (which was referred to everywhere as DWORD_NULL) with boost::none, followed by fixing compile errors, led to some erroneous code. We had code that assumed an 'uninitialized' std::uint32_t was greater than any valid value.
The concept of "value not present" is not inherently less-than or greater-than any value, but because of operator<, code compiled that shouldn't have. I suppose this is really an argument against any operator< overload, not just the mixed-type comparison. I don't expect this to be a major problem for new code, but there is a lot of code out there that uses a magic value, and that is what optional is supposed to replace.
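The failure mode described above can be sketched as follows; std::optional stands in for boost::optional, and the function names (old_is_greater, new_is_greater) are illustrative only:

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Old scheme: a magic value meant "null" and compared greater than
// every valid value.
constexpr std::uint32_t DWORD_NULL = std::numeric_limits<std::uint32_t>::max();

bool old_is_greater(std::uint32_t lhs, std::uint32_t rhs) {
    return lhs > rhs;  // DWORD_NULL > any valid value -> true
}

// New scheme: after the mechanical find-and-replace, "null" silently
// compares *less* than every value, inverting the old behaviour.
bool new_is_greater(std::optional<std::uint32_t> lhs, std::uint32_t rhs) {
    return lhs > rhs;  // nullopt > anything -> false
}
```

Both versions compile cleanly, which is exactly the problem: the meaning of the comparison flips without a single diagnostic.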
I was irritated by it at first, too, and if you think about it the comparison breaks down to a philosophical problem: If I have a pair of pants and you don’t, are my pants smaller than yours? Hard to say, but to me the most logical answer is “no”. My pants aren’t smaller than yours, and my pants aren’t larger than yours. Also, your and my logic would apply to both operator<(optional<T>, T) and operator<(optional<T>, optional<T>), but I think we’re only talking about the latter. --- Felix Uhl

On Wed, Nov 26, 2014 at 10:54 AM, Felix Uhl <felix.uhl@outlook.de> wrote:
I was irritated by it at first, too, and if you think about it the comparison breaks
down to a philosophical problem:
If I have a pair of pants and you don’t, are my pants smaller than yours?
Hard to say, but to me the most logical solution is “no”.
My pants aren’t smaller than yours, and my pants aren’t larger than yours.
Also, your and my logic would apply to both the operator<(optional<T>, T)
and operator<(optional<T>, optional<T>), but I think we’re only talking about
the latter.
I had the same philosophical concerns. My initial reaction was that optional shouldn't have op<, because you don't know whether I want "none" to be the bottom or the top, or whether it makes sense for my T or my use case at all.

A practical example is a list of cars for sale, ordered by price - do you want to see cars with no price listed first or last? Should optional decide that?

Obviously optional<T> _can_ be ordered if T is ordered, which is why specializing "std::order" - which is currently spelled "std::less" - makes sense, and is worthwhile for containers. But my initial reaction was to not like op<. Furthermore, I originally only accepted op< for the sake of containers, before realizing std::less was probably the better way to go.

tl;dr: I'd support dropping op< and only having ==, !=, and std::order/std::less.

Tony
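The cars-for-sale example can be made concrete with explicit, application-chosen comparators; Car and the comparator names below are hypothetical, and std::optional stands in for boost::optional:

```cpp
#include <optional>

// The application, not the library, decides where "no price" sorts.
struct Car {
    std::optional<int> price;  // none == "price on application"
};

// Unpriced cars first (matches optional's default: none < anything).
bool unpriced_first(Car const& a, Car const& b) {
    return a.price < b.price;
}

// Unpriced cars last: flip only the "none" cases.
bool unpriced_last(Car const& a, Car const& b) {
    if (a.price.has_value() != b.price.has_value())
        return a.price.has_value();  // an engaged price sorts before none
    return a.price < b.price;
}
```

Either comparator can be handed to std::sort or a std::map; the point is that the choice is visible at the call site rather than baked into the library.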

On 27/11/2014 11:03, Gottlob Frege wrote:
tl;dr: I'd support dropping op< and only having ==, !=, and std::order/std::less.
TBH, other than the "existing code" issue I'm not sure I like the idea of std::less working but op< not -- and in particular if you don't agree with the idea that "none is less than any other value" then I don't know why you'd agree with the idea that it should sort that way in a map, which is what you appear to be suggesting.

Perhaps std::less should not support it either, but std::hash should, and if you want to use optional keys then you have to use std::unordered_map instead?

But why do you want to have optional keys anyway? When does it make sense to have a single value attached to an empty key? It's rare for a dictionary type to permit null keys in any language.

OTOH, I don't have any personal problems with op<(optional<T>,optional<T>) existing and with its current behaviour, other than the apparent disagreement about where "none" should sort (and I don't have any issues with the library author having defined this as "less", though apparently others do).

I'm less convinced of the merit of op<(optional<T>,T), as the reality is that in *real life code* this is most likely to be a coding error rather than something intentional. Perhaps this is just a failure of imagination, but other than toy examples I cannot imagine a case where someone would use this intentionally. I'm not sure I'd go as far as advocating that they be explicitly removed or poisoned, but I suspect that this would improve overall code quality without unduly burdening people who do want to make this comparison.

I am absolutely convinced of the merit of implicit optional<T>(T) construction and assignment, and op== working for both, and would regard it as a bug if these were removed.

2014-11-26 17:49 GMT-06:00 Gavin Lambert <gavinl@compacsort.com>:
On 27/11/2014 11:03, Gottlob Frege wrote:
tl;dr: I'd support dropping op< and only having ==, !=, and std::order/std::less.
TBH, other than the "existing code" issue I'm not sure I like the idea of std::less working but op< not -- and in particular if you don't agree with the idea that "none is less than any other value" then I don't know why you'd agree with the idea that it should sort that way in a map, which is what you appear to be suggesting.
I tend to agree with this reasoning: op< and std::less should go together. In this sense I consider the case of pointers a bug. I would even go as far as to risk the following hypothesis: operator< in C++ is something different from, and more than, comparing magnitudes in the mathematical sense: it should represent a default total order (not necessarily the natural total order). In the spirit of this reasoning, I would dare to claim that op< could be added to complex<double> with the semantics of lexicographical order.
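The lexicographical order suggested here for complex<double> might look like this sketch; complex_less is a hypothetical name, not an existing or proposed operator:

```cpp
#include <complex>

// Lexicographical order on (real, imag): compare real parts first,
// fall back to imaginary parts on a tie. This is a total order, though
// not a mathematically meaningful "magnitude" comparison.
bool complex_less(std::complex<double> const& a, std::complex<double> const& b) {
    if (a.real() != b.real()) return a.real() < b.real();
    return a.imag() < b.imag();
}
```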
Perhaps std::less should not support it either, but std::hash should, and if you want to use optional keys then you have to use std::unordered_map instead?
But why do you want to have optional keys anyway? When does it make sense to have a single value attached to an empty key? It's rare for a dictionary type to permit null keys in any language.
http://www.boost.org/doc/libs/1_57_0/libs/optional/doc/html/boost_optional/q...
OTOH, I don't have any personal problems with op<(optional<T>,optional<T>) existing and with its current behaviour, other than the apparent disagreement about where "none" should sort (and I don't have any issues with the library author having defined this as "less", though apparently others do).
I'm less convinced of the merit of op<(optional<T>,T), as the reality is that in *real life code* this is most likely to be a coding error rather than something intentional. Perhaps this is just a failure of imagination, but other than toy examples I cannot imagine a case where someone would use this intentionally. I'm not sure I'd go as far as advocating that they be explicitly removed or poisoned, but I suspect that this would improve overall code quality without unduly burdening people who do want to make this comparison.
Agreed. Technically, the behavior of op<(optional<T>,T) is well-defined and consistent with the conceptual model of optional. However, every practical use case I have seen so far is a programmer bug.

On 11/28/2014 12:22 AM, Andrzej Krzemienski wrote:
... Technically, the behavior of op<(optional<T>,T) is well-defined and consistent with the conceptual model of optional. However, every practical use case I have seen so far is a programmer bug.
I personally have to disagree with the statement... I find your readiness to immediately promote T to optional<T> unreasonable. If you have that functionality, it does not mean you apply it left and right and think it's the right thing to do. The behavior you actually described as "well-defined" is that of op<(optional<T>, optional<T>)... the behavior of which is indeed well-defined... even though where to put the "nothing" value is arguable in its own right.

2014-11-27 21:35 GMT+01:00 Vladimir Batov <Vladimir.Batov@constrainttec.com> :
On 11/28/2014 12:22 AM, Andrzej Krzemienski wrote:
... Technically, the behavior of op<(optional<T>,T) is well-defined and consistent with the conceptual model of optional. However, every practical use case I have seen so far is a programmer bug.
I personally have to disagree with the statement... I find your readiness to immediately promote T to optional<T> unreasonable.
It is not *my* (human's) readiness to promote T to optional<T> everywhere. This is simply a consequence of providing the converting constructor. The message sent by declaring the converting constructor (the way I understand converting ctors) is that T can be used *everywhere* where optional<T> is required. Even in the places where someone may not like that.
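A minimal illustration of this point about the converting constructor, shown with std::optional; describe() is a hypothetical function:

```cpp
#include <optional>

// Because optional<T> has a (non-explicit) converting constructor from T,
// any function taking optional<T> also accepts a plain T.
int describe(std::optional<int> o) {
    return o ? *o : -1;
}

// describe(42) compiles and returns 42: the T was promoted to
// optional<T> automatically, whether or not the caller intended it.
// The mixed op< is just one more place where this promotion kicks in.
```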
If you have that functionality, it does not mean you apply it left and right and think it's the right thing to do.
If there was no converting constructor but only some make_optional(), I would agree. But because we have the converting constructor, "the functionality" is automatically applied everywhere. (At least this is my understanding of the purpose of converting ctors.) Optional is abusing the converting syntax a bit. The conversion is loss-less, but with a slight semantic change.

The behavior you actually described as "well-defined" is of
op<(optional<T>, optional<T>)... the behavior of which is indeed well-defined... even though where to put "nothing" value is arguable in its own right.

On 11/28/2014 07:52 AM, Andrzej Krzemienski wrote:
2014-11-27 21:35 GMT+01:00 Vladimir Batov <Vladimir.Batov@constrainttec.com> :
On 11/28/2014 12:22 AM, Andrzej Krzemienski wrote:
... Technically, the behavior of op<(optional<T>,T) is well-defined and consistent with the conceptual model of optional. However, every practical use case I have seen so far is a programmer bug.
I personally have to disagree with the statement... I find your readiness to immediately promote T to optional<T> unreasonable.
It is not *my* (human's) readiness to promote T to optional<T> everywhere. This is simply a consequence of providing the converting constructor. The message sent by declaring the converting constructor (the way I understand converting ctors) is that T can be used *everywhere* where optional<T> is required. Even in the places where someone may not like that.
OK, it's not your "human readiness" :-) to promote T to optional<T>, but it's your (as it appears to me) position that, if we allowed the implicit constructor, then it's free to jump in anywhere "it" wants to. Given that clearly there are places where that T to optional<T> promotion is questionable, you seem to favor banning it altogether. In reality (or in human society), if 90% benefit from a particular rule, law, etc., then it's allowed. Then, difficulties and bans are created for the remaining 10% bending and mis-using the law.
If you have that functionality, it does not mean you apply it it left and right and think it's the right thing to do. If there was no converting constructor but only some make_optional(), I would agree. But because we have the converting constructor, "the functionality" is automatically applied everywhere.
... And I am only advocating to disallow "the functionality to automatically apply everywhere". Namely, where it's deemed not to be beneficial to do so... without forcing the 90% to suffer for the design/concept purity... but you seemingly resist that selective approach. I am sensing that's a fairly strong and seemingly non-negotiable position of yours, so it's wise to simply acknowledge differences in our views. No drama.

On Wed, Nov 26, 2014 at 4:49 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
But why do you want to have optional keys anyway? When does it make sense to have a single value attached to an empty key? It's rare for a dictionary type to permit null keys in any language.
I think this is a good point. Has anyone reading this list ever actually used std::map<boost::optional<T>, U>?

On Thu, Nov 27, 2014 at 5:08 PM, Marcel Raad <raad@teamviewer.com> wrote:
David Stone <david <at> doublewise.net> writes:
Has anyone reading this list ever actually used std::map<boost::optional<T>, U>?
Yes, I frequently use that.
Could you describe how you're using it? Some code perhaps? -- Olaf

Olaf van der Spek <ml <at> vdspek.org> writes:
On Thu, Nov 27, 2014 at 5:08 PM, Marcel Raad <raad <at> teamviewer.com> wrote:
Has anyone reading this list ever actually used std::map<boost::optional<T>, U>?
Yes, I frequently use that.
Could you describe how you're using it?
I use it to represent "none" in statistics code similar to the example in the Boost.Optional documentation (http://www.boost.org/doc/libs/master/libs/optional/doc/html/boost_optional/quick_start/storage_in_containers.html). And I (perhaps mis-?)use it in configuration code to represent a default value so that I don't have to have an extra variable for each map:

auto it = map.find(whatever);
if (it != map.end()) return *it;
else return map[boost::none];

On 28/11/2014 11:12, Marcel Raad wrote:
I use it to represent "none" in statistics code similar to the example in the Boost.Optional documentation (http://www.boost.org/doc/libs/master/libs/optional/doc/html/boost_optional/quick_start/storage_in_containers.html).
And I (perhaps mis-?)use it in configuration code to represent a default value so that I don't have to have an extra variable for each map:

auto it = map.find(whatever);
if (it != map.end()) return *it;
else return map[boost::none];
Do either of those cases actually require the keys to be sorted? I'm not sure about the former, but I doubt the latter does. If sorting is not a requirement, you could use unordered_map instead. I don't think anyone has an issue with std::hash being defined for optional types.
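A sketch of the unordered_map alternative, relying on std::hash<std::optional<T>> (standardized in C++17) and operator== instead of any ordering; Config, lookup_or_default, and make_config are hypothetical names:

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Optional keys without any ordering at all: std::nullopt is just one
// more key, hashed and compared for equality like any other.
using Config = std::unordered_map<std::optional<std::string>, int>;

int lookup_or_default(Config const& map, std::string const& key) {
    if (auto it = map.find(key); it != map.end())
        return it->second;            // note ->second, not *it
    return map.at(std::nullopt);      // the none key holds the default
}

Config make_config() {
    Config map;
    map[std::nullopt] = -1;               // default value lives under none
    map[std::string("threads")] = 8;
    return map;
}
```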

On Thu, Nov 27, 2014 at 11:12 PM, Marcel Raad <raad@teamviewer.com> wrote:
And I (perhaps mis-?)use it in configuration code to represent a default value so that I don't have to have an extra variable for each map:
Hmm, an extra var seems simpler than optional keys.
auto it = map.find(whatever); if(it != map.end()) return *it;
return it->second?
else return map[boost::none];
-- Olaf

On November 27, 2014 9:14:37 AM EST, David Stone <david@doublewise.net> wrote:
On Wed, Nov 26, 2014 at 4:49 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
But why do you want to have optional keys anyway? When does it make sense to have a single value attached to an empty key? It's rare for a dictionary type to permit null keys in any language.
I think this is a good point. Has anyone reading this list ever actually used std::map<boost::optional<T>, U>?
I haven't, but I think the better question to ask is for legitimate use cases for that operator. Given such use cases, one can judge whether they warrant retention of the operator. If none are produced, removing the operator is far easier to justify. ___ Rob (Sent from my portable computation engine)

On Wed, Nov 26, 2014 at 6:49 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
On 27/11/2014 11:03, Gottlob Frege wrote:
tl;dr: I'd support dropping op< and only having ==, !=, and std::order/std::less.
TBH, other than the "existing code" issue I'm not sure I like the idea of std::less working but op< not -- and in particular if you don't agree with the idea that "none is less than any other value" then I don't know why you'd agree with the idea that it should sort that way in a map, which is what you appear to be suggesting.
I don't like that std::less isn't the same as op< either, but it is the best we currently have. I hope to propose a fix for that - something like std::order, which map and set would use instead of std::less (and std::order would default to std::less for backwards compatibility).

A good example of where std::order shouldn't be the same as op< would be std::complex. There isn't a sensible/natural mathematical way of answering whether one complex number is larger than the other (while at the same time getting the other properties we commonly expect from a less-than).

So std::complex shouldn't have op<. But it would be nice to use them in maps and other algorithms. (Sure, we now have unordered_map, but map/set still has uses.) And std::complex, looked at as a pair<Number, Number>, does have a natural (lexicographical) ordering.

So it would be nice if std::complex specialized std::order, but didn't have op<. If we don't get std::order, complex will probably specialize std::less in the near future, because it is what works today.
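Since nothing named std::order exists today, a user-side sketch of the idea might look like this; my::order is a hypothetical name standing in for the proposed facility:

```cpp
#include <complex>
#include <functional>
#include <map>

namespace my {
    // Default: the ordering falls back to std::less, i.e. to op<.
    template <class T> struct order : std::less<T> {};

    // complex gets a lexicographic ordering without gaining an op<.
    template <class T> struct order<std::complex<T>> {
        bool operator()(std::complex<T> const& a,
                        std::complex<T> const& b) const {
            if (a.real() != b.real()) return a.real() < b.real();
            return a.imag() < b.imag();
        }
    };
}

// A map then opts into the ordering explicitly, the way the proposed
// std::map would do by default:
using complex_map = std::map<std::complex<double>, int,
                             my::order<std::complex<double>>>;
```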
Perhaps std::less should not support it either, but std::hash should, and if you want to use optional keys then you have to use std::unordered_map instead?
But why do you want to have optional keys anyway? When does it make sense to have a single value attached to an empty key? It's rare for a dictionary type to permit null keys in any language.
optional<T> is Regular if T is Regular. The more that optional<T> "just works" like int works, the better. Particularly for generic code.
OTOH, I don't have any personal problems with op<(optional<T>,optional<T>) existing and with its current behaviour, other than the apparent disagreement about where "none" should sort (and I don't have any issues with the library author having defined this as "less", though apparently others do).
I'm less convinced of the merit of op<(optional<T>,T), as the reality is that in *real life code* this is most likely to be a coding error rather than something intentional. Perhaps this is just a failure of imagination, but other than toy examples I cannot imagine a case where someone would use this intentionally. I'm not sure I'd go as far as advocating that they be explicitly removed or poisoned, but I suspect that this would improve overall code quality without unduly burdening people who do want to make this comparison.
I am absolutely convinced of the merit of implicit optional<T>(T) construction and assignment, and op== working for both, and would regard it as a bug if these were removed.
I agree with the implicit constructor.

For less-than we have these competing points (that each of us may weigh differently):

0 - std::less should be the same as op< (I think we all agree here, actually, which is why we should have std::order)
1 - op<(optional<T>, optional<T>) is "sensible" but somewhat arbitrary
2 - op<(optional<T>, T) is questionable, often (usually?) wrong
3 - in general, func(optional<T>, optional<T>) should work if passed T's (ie implicit conversion is important)
4 - op<(optional<T>, optional<T>) *is* of the form func(optional<T>, optional<T>) - ie why should it be different?

2 and 3, to me, are the strongest, but directly opposed. How do you solve that? Either say

A1. "well operators are special, not typical functions" so we can poison them while keeping general functions implicit
A2. 1 is on shaky ground, so let's remove both 1 and 2, but keep std::less/std::order, which was the main reason for having op<.

I can live with either answer. They both sound reasonable to me.

Tony
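Option A1 can be sketched on a toy type: keep the homogeneous comparison and the implicit constructor, but delete the mixed overloads so that `opt < 5` fails to compile. toy_optional is purely illustrative, not how boost::optional is implemented:

```cpp
template <class T>
struct toy_optional {
    bool engaged = false;
    T value{};
    toy_optional() = default;
    toy_optional(T v) : engaged(true), value(v) {}  // implicit, kept
};

// Homogeneous comparison: none sorts before any engaged value.
template <class T>
bool operator<(toy_optional<T> const& a, toy_optional<T> const& b) {
    if (!a.engaged || !b.engaged) return b.engaged;
    return a.value < b.value;
}

// Poison the mixed forms: these are exact matches, so they beat the
// overload that would go through the converting constructor, and
// `opt < 5` or `5 < opt` becomes a compile error.
template <class T> bool operator<(toy_optional<T> const&, T const&) = delete;
template <class T> bool operator<(T const&, toy_optional<T> const&) = delete;
```

Generic func(optional, optional) calls still accept plain T's via the implicit constructor; only the operator is singled out.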

On 28/11/2014 05:12, Gottlob Frege wrote:
On Wed, Nov 26, 2014 at 6:49 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
TBH, other than the "existing code" issue I'm not sure I like the idea of std::less working but op< not -- and in particular if you don't agree with the idea that "none is less than any other value" then I don't know why you'd agree with the idea that it should sort that way in a map, which is what you appear to be suggesting.
I don't like that std::less isn't the same as op< either, but it is the best we currently have. I hope to propose a fix for that - something like std::order, which map and set would use instead of std::less (and std::order would default to std::less for backwards compatibility).
I'm not sure I understand the meaning of having an order that isn't implied by op<. If op< is omitted because comparison is meaningless or otherwise confusing, how could it be well-defined how to sort them? If there is not a single well-defined sort order, then the library should not just pick one arbitrarily, and if there is, then why would it not be appropriate as an op<?
A good example of where std::order shouldn't be the same as op< would be std::complex. There isn't a sensible/natural mathematical way of answering whether one complex number is larger than the other (while at the same time getting the other properties we commonly expect from a less-than).
So std::complex shouldn't have op<. But it would be nice to use them in maps and other algorithms. (Sure, we now have unordered_map, but map/set still has uses.)
And std::complex, looked at as a pair<Number, Number>, does have a natural (lexicographical) ordering.
So it would be nice if std::complex specialized std::order, but didn't have op<. If we don't get std::order, complex will probably specialize std::less in the near future, because it is what works today.
I don't agree with any of that. If you want to use it as a key for something map/set-like, use unordered_map/set. I can think of multiple possible orderings of std::complex depending on application need, and I don't think any one of them is inherently "better" than any other. So this should be left to the application to define the sort order, in the rare case that this would actually be needed.
But why do you want to have optional keys anyway? When does it make sense to have a single value attached to an empty key? It's rare for a dictionary type to permit null keys in any language.
optional<T> is Regular if T is Regular. The more that optional<T> "just works" like int works, the better. Particularly for generic code.
The first part sounds like a definition I am unfamiliar with. I agree with the second part, but I don't think it's relevant to this discussion. Nobody is talking about removing op<(optional<T>,optional<T>), which is what generic code would use.

Any generic code that *could* use both T and optional<T> must by definition be aware of the fact that it's using optional, so will either know that op<(optional<T>,T) is not sensible and avoid using it, or will explicitly convert as needed so that it invokes op<(optional<T>,optional<T>) instead. Alternatively it must be aware that it is using different types, so op< may not be sane between them. Either way I don't think this would be a problem in real code.

However possibly there might need to be a type trait specialization that explicitly denies op<(optional<T>,T) and the reverse, to avoid confusing code that attempts to detect when op< does work between distinct types.
For less-than we have these competing points, (that each of us may weigh differently)
0 - std::less should be same as op< (I think we all agree here, actually, which is why we should have std::order)
1 - op<(optional<T>, optional<T>) is "sensible" but somewhat arbitrary
2 - op<(optional<T>, T) is questionable, often (usually?) wrong
3 - in general, func(optional<T>, optional<T>) should work if passed T's (ie implicit conversion is important)
4 - op<(optional<T>, optional<T>) *is* of the form func(optional<T>, optional<T>) - ie why should it be different?
All true.
2 and 3, to me, are the strongest, but directly opposed. How do you solve that? Either say
A1. "well operators are special, not typical functions" so we can poison them while keeping general functions implicit
A2. 1 is on shaky ground, so let's remove both 1 and 2, but keep std::less/std::order, which was the main reason for having op<.
Of these, I prefer A1. I think it really is a special case. But I can understand why the present behaviour is as it is. I suspect those who dislike "none" having a defined order would go for A2 though. I would be ok with that as well, but I don't have a problem with this definition.

But you're leaving out A3: remove 1+2 AND std::less/std::order. This means that optional types can't be used as map keys, only in unordered_map. I actually think that this is the best option, except that it has a greater potential to break existing code. Perhaps a predicate could be defined specific to optional giving the current ordering, so that users who do want that ordering or want to continue using a map could just specify that predicate as a minimal fix. But I don't think that the predicate should be spelled "std::less".

(When I define custom ordering for reference types in C#, which are nullable, I always specify that 'null' is smaller than any non-null value as well. The library Nullable type behaves similarly, although as I noted in another thread it does *not* support op< locally; you have to explicitly call a static comparison method to compare two Nullable values, or a Nullable and an implicitly promoted base value. This aligns best with A3+predicate.)
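The A3-plus-predicate idea might look like this sketch: the ordering still exists, but only under an explicit name that users must opt into; optional_none_first and PriceMap are hypothetical names, with std::optional standing in for boost::optional:

```cpp
#include <map>
#include <optional>
#include <string>

// The current ordering, but under an explicit name rather than op< or
// std::less: none sorts before any engaged value.
struct optional_none_first {
    template <class T>
    bool operator()(std::optional<T> const& a,
                    std::optional<T> const& b) const {
        if (!a || !b) return bool(b);
        return *a < *b;
    }
};

// Users who want optional keys in an ordered map opt in visibly:
using PriceMap = std::map<std::optional<int>, std::string, optional_none_first>;
```

The choice of where none sorts is then spelled out at each use site, which is the point of not calling it std::less.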

On 28/11/2014 05:12, Gottlob Frege wrote:
On Wed, Nov 26, 2014 at 6:49 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
TBH, other than the "existing code" issue I'm not sure I like the idea of std::less working but op< not -- and in particular if you don't agree with the idea that "none is less than any other value" then I don't know why you'd agree with the idea that it should sort that way in a map, which is what you appear to be suggesting.
I don't like that std::less isn't the same as op< either, but it is the best we currently have. I hope to propose a fix for that - something like std::order, which map and set would use instead of std::less (and std::order would default to std::less for backwards compatibility).
I'm not sure I understand the meaning of having an order that isn't implied by op<. If op< is omitted because comparison is meaningless or otherwise confusing, how could it be well-defined how to sort them? If there is not a single well-defined sort order, then the library should not just pick one arbitrarily, and if there is, then why would it not be appropriate as an op<?

There are a lot of kinds of orders (http://en.wikipedia.org/wiki/Total_order). Some of us would like T < T to mean the same kind of order independently of T. As we already have int < int representing the natural order, it follows that T < T should then mean the natural order. We can then have a lot of other orders that don't use this specific syntax. The order needed by the ordered containers is not the natural order but a strict weak order (http://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings).
A good example of where std::order shouldn't be the same as op< would be std::complex. There isn't a sensible/natural mathematical way of answering whether one complex number is larger than the other (while at the same time getting the other properties we commonly expect from a less-than).
So std::complex shouldn't have op<. But it would be nice to use them in maps and other algorithms. (Sure, we now have unordered_map, but map/set still has uses.)
And std::complex, looked at as a pair<Number, Number>, does have a natural (lexicographical) ordering.
So it would be nice if std::complex specialized std::order, but didn't have op<. If we don't get std::order, complex will probably specialize std::less in the near future, because it is what works today.
I don't agree with any of that. If you want to use it as a key for something map/set-like, use unordered_map/set. I can think of multiple possible orderings of std::complex depending on application need, and I don't think any one of them is inherently "better" than any other. So this should be left to the application to define the sort order, in the rare case that this would actually be needed.

I think you are agreeing here, aren't you?
But why do you want to have optional keys anyway? When does it make sense to have a single value attached to an empty key? It's rare for a dictionary type to permit null keys in any language.
optional<T> is Regular if T is Regular. The more that optional<T> "just works" like int works, the better. Particularly for generic code.
The first part sounds like a definition I am unfamiliar with. I agree with the second part, but I don't think it's relevant to this discussion. Nobody is talking about removing op<(optional<T>,optional<T>), which is what generic code would use.
On 27/11/14 23:44, Gavin Lambert wrote:

optional<T> is the sum type of none and T. If T has a natural order, the natural order of the sum can be defined. So optional<T> should define operator< only if T defines it.
Any generic code that *could* use both T and optional<T> must by definition be aware of the fact that it's using optional, so will either know that op<(optional<T>,T) is not sensible and avoid using it or will explicitly convert as needed so that it invokes op<(optional<T>,optional<T>) instead. Alternatively it must be aware that it is using different types, so op< may not be sane between them. Either way I don't think this would be a problem in real code.

The problem is not the operator <, but the implicit conversions.
However possibly there might need to be a type trait specialization that explicitly denies op<(optional<T>,T) and the reverse, to avoid confusing code that attempts to detect when op< does work between distinct types.

I don't think this follows "make simple things simple".
For less-than we have these competing points, (that each of us may weigh differently)
0 - std::less should be the same as op< (I think we all agree here, actually, which is why we should have std::order)

Only if op< is defined.
1 - op<(optional<T>, optional<T>) is "sensible" but somewhat arbitrary

I don't agree here. We don't need anything arbitrary. op<(optional<T>, optional<T>) must be defined only if op<(T, T) is defined.

2 - op<(optional<T>, T) is questionable, often (usually?) wrong

Agreed.

3 - in general, func(optional<T>, optional<T>) should work if passed T's (ie implicit conversion is important)

I have my doubts here. Implicit conversions often (usually?) produce unexpected behavior.

4 - op<(optional<T>, optional<T>) *is* of the form func(optional<T>, optional<T>) - ie why should it be different?

Nothing to say here ;-)
All true.
2 and 3, to me, are the strongest, but directly opposed. How do you solve that? Either say
A1. "well operators are special, not typical functions" so we can poison them while keeping general functions implicit A2. 1 is on shaky ground, so let's remove both 1 and 2, but keep std::less/std::order, which was the main reason for having op<.
Of these, I prefer A1. I think it really is a special case. But I can understand why the present behaviour is as it is.
I suspect those who dislike "none" having a defined order would go for A2 though. I would be ok with that as well, but I don't have a problem with this definition.
But you're leaving out A3: remove 1+2 AND std::less/std::order. This means that optional types can't be used in map keys, only in unordered_map. I actually think that this is the best option, except that it has a greater potential to break existing code. Perhaps a predicate could be defined specific to optional giving the current ordering, so that users who do want that ordering or to continue using a map could just specify that predicate as a minimal fix. But I don't think that the predicate should be spelled "std::less".
Currently we have std::less, not std::order, and the STL ordered containers use std::less as their default comparator. So let's define std::less<optional<T>> using std::less<T>.

Vicente

P.S. For some mathematical foundation of sum types see http://en.wikipedia.org/wiki/Tagged_union For some examples of option types in other languages, see http://en.wikipedia.org/wiki/Option_type

On 28/11/2014 14:08, Vicente J. Botet Escriba wrote:
On 27/11/14 23:44, Gavin Lambert wrote:
I'm not sure I understand the meaning of having an order that isn't implied by op<. If op< is omitted because comparison is meaningless or otherwise confusing, how could it be well-defined how to sort them? If there is not a single well-defined sort order, then the library should not just pick one arbitrarily, and if there is, then why would it not be appropriate as an op<?
There are a lot of kinds of orders (http://en.wikipedia.org/wiki/Total_order). Some of us would like T < T to mean the same kind of order independently of T. As we already have int < int representing the natural order, it follows that T < T should mean the natural order. We can then have a lot of other orders that don't use this specific syntax. The order needed by the ordered containers is not the natural order but a strict weak order ( http://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings ).
That doesn't address what I was referring to. My point was that if there is disagreement on where "none" should sort, or how "complex" should sort, then the library should not define these as default sorts (via op< or std::less). Instead it should either provide standard sorting predicates for the common options or just leave it up to the application to decide how it wants to sort things.
So it would be nice if std::complex specialized std::order, but didn't have op<. If we don't get std::order, complex will probably specialize std::less in the near future, because it is what works today.
I don't agree with any of that. If you want to use it as a key for something map/set-like, use unordered_map/set. I can think of multiple possible orderings of std::complex depending on application need, and I don't think any one of them is inherently "better" than any other. So this should be left to the application to define the sort order, in the rare case that this would actually be needed.
I think you are agreeing here, aren't you?
I am not agreeing that optional or complex should specialise either of std::less or std::order. I am ok with both defining *type-specific* sorting predicates, which the application code can choose one of to link to the generic predicate if it wishes to, or just use explicitly, or define its own.
optional<T> is Regular if T is Regular. The more that optional<T> "just works" like int works, the better. Particularly for generic code.
The first part sounds like a definition I am unfamiliar with. I agree with the second part, but I don't think it's relevant to this discussion. Nobody is talking about removing op<(optional<T>,optional<T>), which is what generic code would use.
optional<T> is the sum type of none and T. If T has a natural order, the natural order of the sum can be defined. So optional<T> should define operator< only if T defines it.
Yes, but that does not imply the reverse. If T does not define op< then optional<T> must not, but if T does define op< then this merely means that optional<T> *could*, but not whether it *should*. Another example of a sum type is a variant (tagged union). I think it's fair to say that using op< on variants with different internal types (eg. variant holding an int vs. variant holding a string) is not a reasonable comparison, despite both internal types having a natural order. (You can try artificially defining one such as lexicographical order if converted to string, but now you just have two problems.) The same should apply to optional<T> -- trying to order a value with the subtype "none" vs. one with the subtype "T" should not be meaningful. Which kind of sounds like I'm coming around to the side that says op<(optional<T>,optional<T>) shouldn't exist either.
Any generic code that *could* use both T and optional<T> must by definition be aware of the fact that it's using optional, so will either know that op<(optional<T>,T) is not sensible and avoid using it or will explicitly convert as needed so that it invokes op<(optional<T>,optional<T>) instead. The problem is not the operator, but the implicit conversions.
Of course, in absence of a poisoning method, if op<(optional<T>, optional<T>) and an implicit constructor both exist, then op<(optional<T>,T) will automatically exist even if not explicitly defined. Since the discussion related to explicit poisoning that didn't seem worth mentioning.
However possibly there might need to be a type trait specialization that explicitly denies op<(optional<T>,T) and the reverse, to avoid confusing code that attempts to detect when op< does work between distinct types. I don't think this follows "make simple things simple".
I'm coming to agree with that, which suggests op< should simply not exist at all.
But you're leaving out A3: remove 1+2 AND std::less/std::order. This means that optional types can't be used in map keys, only in unordered_map. I actually think that this is the best option, except that it has a greater potential to break existing code. Perhaps a predicate could be defined specific to optional giving the current ordering, so that users who do want that ordering or to continue using a map could just specify that predicate as a minimal fix. But I don't think that the predicate should be spelled "std::less".
Currently we have std::less, not std::order, and the STL ordered containers use std::less as their default comparator. So let's define std::less<optional<T>> using std::less<T>.
I realise that this is the current design. But we were discussing an ideal "alternate" design. Defining op< for containers was a necessary evil while std::map was the only standard associative container, and it happened to require ordered keys. Now that unsorted associative containers exist (std::unordered_map and its Boost equivalent for pre-C++11), I don't think its existence can be justified any more.

On Thu, Nov 27, 2014 at 9:15 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
On 28/11/2014 14:08, Vicente J. Botet Escriba wrote:
Le 27/11/14 23:44, Gavin Lambert a écrit :
I'm not sure I understand the meaning of having an order that isn't implied by op<. If op< is omitted because comparison is meaningless or otherwise confusing, how could it be well-defined how to sort them? If there is not a single well-defined sort order, then the library should not just pick one arbitrarily, and if there is, then why would it not be appropriate as an op<?
There are a lot of kinds of orders (http://en.wikipedia.org/wiki/Total_order). Some of us would like T < T to mean the same kind of order independently of T. As we already have int < int representing the natural order, it follows that T < T should mean the natural order. We can then have a lot of other orders that don't use this specific syntax. The order needed by the ordered containers is not the natural order but a strict weak order ( http://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings ).
That doesn't address what I was referring to. My point was that if there is disagreement on where "none" should sort, or how "complex" should sort, then the library should not define these as default sorts (via op< or std::less). Instead it should either provide standard sorting predicates for the common options or just leave it up to the application to decide how it wants to sort things.
Why? I've heard this stated before regarding ordering and frankly I think it's nonsense. Whenever you design a library you're going to make choices that don't have an objective correct or incorrect option, even separate from the issue of ordering. This can be anywhere from simple things like naming or parameter order to the higher-level semantics of your types. The library developer frequently makes subjective decisions and that's perfectly fine -- I'd even say it's a good thing. The fact that there are multiple valid solutions does not at all imply that you should avoid choosing any of them, especially in this case where we are talking about a default that doesn't even have to be used. If, in the case of a default ordering, that default isn't what's desired for a specific situation, then the user can just be explicit there. Do you have a problem with the fact that tuples have a default ordering? What about strings? These are not problematic in practice, nor, I'd argue, are they problematic in theory.

Just to make things a little more grounded with respect to optional and variant, consider what happens when you provide a default ordering: First, there are many cases where someone just wants a default ordering, regardless of what that ordering may be (e.g. to use the type in a set), as long as the default isn't doing something like creating unnecessary equivalence classes. In this case, it doesn't matter that the library designer chose the default. Any of these orderings are acceptable for the user.

Alright, so what if, in a particular case, the default ordering isn't what a user personally wants? In that case, the programmer would just use an explicit ordering when using, e.g., a set. Note that this case is no different from what the user would have to do if no default ordering were provided. Finally, what if someone sees the default ordering used in code? 
Because it might not be immediately clear to the user, they'd simply look up what the library specifies, assuming they need to or want to know. I don't see how any of these situations are problematic. Okay, so what happens if you reverse the situation and don't provide a default ordering: First, people can no longer use the type in datastructures like sets unless they provide an explicit ordering, even if they don't care what the ordering is. So here the user needs to do more to accomplish the same end result. What exactly does the user gain? Is it just that if someone else sees a default ordering used, they might have to investigate what that default ordering is assuming they need to rely on it? If so, how is this at all different from the user seeing any function used that they don't immediately know the semantics of? In either case, all they'd do is look it up. Just because the function happens to be called std::order/std::less/operator< shouldn't make a difference. As well, when you specify a default ordering, users can often take advantage of it in a way that makes it applicable to their situation and they retain the benefits of there being default. As an example of this, on one occasion I used a variant to store the rank of a 5-card poker hand. In other words I had the following: variant<high_card, pair, two_pair, three_of_a_kind, straight, flush, full_house, four_of_a_kind, straight_flush> rank; In case it's not clear, each of those types has their own state, since, for instance, not all three_of_a_kind are equal (the same is true for all of the other types as well). Effectively, to order them appropriately, they are ranked by their type first, and then by a homogeneous comparison if they are the same type. Because the variant template I used had specified that the default ordering implies earlier-parameters are less than later-parameters, the default behavior does exactly what I want. In fact, that's exactly why I specified the parameters in that manner. 
I was able to take advantage of the default because it was specified with the library. Okay, so what if the library were designed in a way that used the reverse order (after all, either choice is a valid ordering)? Is this "wrong?" No. In that case, all I would have done was reversed the order of the arguments that I used with the variant so that the default order made sense for my use-case. In either situation I get what I want with very minimal work. Obviously there are other, more contrived orderings, but these are the two obvious ones. The point is, yeah, there is no "correct" choice for the default ordering, but that's true of most decisions that are made when designing a library. That in no way implies that the library should not make any choice at all. Now, flipping things around, what would happen if the library designer said "I don't want to choose a default"? As a user of the library, the only effect is that I now need to do more work to accomplish my same goal for any place that can take advantage of a default ordering. What exactly is the tangible benefit here? Of course, in absence of a poisoning method, if op<(optional<T>,
optional<T>) and an implicit constructor both exist, then op<(optional<T>,T) will automatically exist even if not explicitly defined. Since the discussion related to explicit poisoning that didn't seem worth mentioning.
That's not true. Assuming that you are defining the operator as a template, and not, for instance, as an inline friend of the class template, then the implicit conversion there will not take place unless the template argument is explicitly provided.
Defining op< for containers was a necessary evil while std::map was the only standard associative container, and it happened to require ordered keys. Now that unsorted associative containers exist (std::unordered_map and its Boost equivalent for pre-C++11), I don't think its existence can be justified any more.
WHAT!? The choice of whether to use an ordered or unordered associative container should have nothing to do with this. Period. They are entirely different data structures with different complexities and the issue of defaults is entirely orthogonal. However, while we're on the subject, I think it's really important to point out that the unordered containers are also taking advantage of a default that by your, and others', logic, shouldn't exist -- the default hash. Just like with ordering, there is any number of equally valid hashes that can be associated with a given type. Why is one better than any other? I don't see why anyone would say that it doesn't make sense to have a default ordering but yet it does make sense to have, and rely on, a default hash. -- -Matt Calabrese

On 28/11/2014 21:37, Matt Calabrese wrote:
Why? I've heard this stated before regarding ordering and frankly I think it's nonsense. Whenever you design a library you're going to make choices that don't have an objective correct or incorrect option, even separate from the issue of ordering. This can be anywhere from simple things like naming or parameter order to the higher-level semantics of your types. The library developer frequently makes subjective decisions and that's perfectly fine -- I'd even say it's a good thing. The fact that there are multiple valid solutions does not at all imply that you should avoid choosing any of them, especially in this case where we are talking about a default that doesn't even have to be used. If, in the case of a default ordering, that default isn't what's desired for a specific situation, then the user can just be explicit there. Do you have a problem with the fact that tuples have a default ordering? What about strings? These are not problematic in practice, nor, I'd argue, are they problematic in theory.
Strings are ok -- ordinal ordering is a reasonable default, though admittedly sometimes other orders need to be selected by the application. But this is fine as is. If tuples have a default ordering (I'm not sure, because I haven't used them), then yes, I would regard that as a bug.
Okay, so what happens if you reverse the situation and don't provide a default ordering: First, people can no longer use the type in datastructures like sets unless they provide an explicit ordering, even if they don't care what the ordering is. So here the user needs to do more to accomplish the same end result. What exactly does the user gain?
This is because they are using the wrong data structure. It *should* fail because there is no sensible default ordering for tuples or optionals or variants. In which case they should take the hint that they should be using an unordered container instead, such as unordered_map or unordered_set.
Now, flipping things around, what would happen if the library designer said "I don't want to choose a default"? As a user of the library, the only effect is that I now need to do more work to accomplish my same goal for any place that can take advantage of a default ordering. What exactly is the tangible benefit here?
For one thing, it makes your code immune to a change in the default ordering as a result of a library upgrade or switching to a different type (eg. when switching from boost to std, if these made different choices). For another, it makes you actually think about what you're doing, and whether you actually *want* ordering.
Defining op< for containers was a necessary evil while std::map was the only standard associative container, and it happened to require ordered keys. Now that unsorted associative containers exist (std::unordered_map and its Boost equivalent for pre-C++11), I don't think its existence can be justified any more.
WHAT!? The choice of whether to use an ordered or unordered associative container should have nothing to do with this. Period. They are entirely different data structures with different complexities and the issue of defaults is entirely orthogonal. However, while we're on the subject, I think it's really important to point out that the unordered containers are also taking advantage of a default that by your, and others', logic, shouldn't exist -- the default hash. Just like with ordering, there is any number of equally valid hashes that can be associated with a given type. Why is one better than any other? I don't see why anyone would say that it doesn't make sense to have a default ordering but yet it does make sense to have, and rely on, a default hash.
Hashes are an equality relation, not an ordering relation. It's a completely different thing. I have no problem with (and in fact strongly advocate) having equality work for tuples, optionals, and variants. And nowhere did I say that the application author *couldn't* use an ordered container if they wish. I'm just saying that those particular types, as key types, should *not* have a default order relation (as there is no meaningful definition of one -- anything is entirely arbitrary), and so while authors should *prefer* to use unordered containers, if desired they can use an ordered container by defining the desired ordering explicitly themselves. (Which might mean writing an ordering function themselves or it might mean simply specifying which ordering function provided by the library should be used. But that function shouldn't be "std::less".)

On Sun, Nov 30, 2014 at 3:22 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
Strings are ok -- ordinal ordering is a reasonable default, though admittedly sometimes other orders need to be selected by the application. But this is fine as is.
Why? What makes the string default ordering "reasonable" and complex or variant not? There are a lot of ways to order strings that might even be more sensible in more cases than the current default (e.g. having different cases be adjacent in the ordering, such as "a" and "A"). It's only a default and you can't expect that default to apply to all situations. The same goes for complex or variant or any other type. The existence of multiple orderings should not imply that a default ordering should be left out. Quite literally all types that have more than one value have multiple orderings. This is fine and does not conflict with there being a default. It's perfectly acceptable to need an ordering and yet not know or care what that ordering is. If tuples have a default ordering (I'm not sure, because I haven't used
them), then yes, I would regard that as a bug.
They do have a default ordering, and while I agree that there is a mistake in the internals of the tuple definition of its order, I think that a default ordering existing for them is, in fact, a good thing (the problem I have with tuple ordering is that it defines all of the operators around < rather than the corresponding operators of the underlying types, and it also does not specialize std::less and family IIRC. Tony's suggestion for optional does things correctly and is what tuple should have done.). Anyway, why specifically do you consider a tuple having a default ordering as a bug? All it is is a lexicographical order, just like std::string and just like the other containers of the standard library. It just happens to be heterogeneous and of a fixed size. Why do you consider it not "reasonable" to have a default ordering as a lexicographical ordering for tuples and yet consider it "reasonable" to have a default ordering as a lexicographical ordering for std::string? The distinction that is being made seems entirely arbitrary unless you or someone else can (1) specify unambiguously what that supposed difference actually is including how you would generalize that distinction, and (2) explain why such a difference is actually important with respect to defining a default ordering via std::less (or a proposed std::order). I don't buy the distinction being worthwhile either in practice or in theory. As far as I can tell it is arbitrary, inconsistent, and without any objective rationale. Okay, so what happens if you reverse the situation and don't provide a
default ordering: First, people can no longer use the type in datastructures like sets unless they provide an explicit ordering, even if they don't care what the ordering is. So here the user needs to do more to accomplish the same end result. What exactly does the user gain?
This is because they are using the wrong data structure. It *should* fail because there is no sensible default ordering for tuples or optionals or variants. In which case they should take the hint that they should be using an unordered container instead, such as unordered_map or unordered_set.
I'm sorry, but that is simply bad programming advice. Putting a tuple in a set is not using the "wrong" datastructure. A tuple can be correctly ordered just like any other type and the same is true of optionals or variants. There not being a single default order that everyone would expect without referencing documentation is completely separate from the choice of whether a programmer should use a set or an unordered set. Claiming that this should be a reason to choose one datastructure over another is horrible.
Now, flipping things around, what would happen if the library designer said
"I don't want to choose a default"? As a user of the library, the only effect is that I now need to do more work to accomplish my same goal for any place that can take advantage of a default ordering. What exactly is the tangible benefit here?
For one thing, it makes your code immune to a change in the default ordering as a result of a library upgrade or switching to a different type (eg. when switching from boost to std, if these made different choices).
Let's be clear. By having a default I'm not saying that it's up to the vendor to choose a default that can change between implementations, I'm saying that it should be specified what that default is. If there is some breaking change between standards, then it's no different from any other potentially breaking change. This is what we already do for the types with ordering in the standard (i.e. the containers). For another, it makes you actually think about what you're doing, and
whether you actually *want* ordering.
What's with the "want" here? Why would you not want an order? I can see not needing it for a certain application and not minding if it's there, but you make it sound like it's somehow a bad thing for it to even exist. If you personally don't need it then you don't have to use it. I'm just saying that those particular types, as key types, should *not*
have a default order relation (as there is no meaningful definition of one -- anything is entirely arbitrary), and so while authors should *prefer* to use unordered containers, if desired they can use an ordered container by defining the desired ordering explicitly themselves.
No. Just no. Please stop saying that people should "prefer" unordered containers simply because there's no default ordering that everyone would agree on. This reasoning is baseless and is a horrible recommendation for when to use one kind of associative container over another. There being or not being a default should not influence such a decision. If it does, then IMO we've failed in the design. -- -Matt Calabrese

On Mon, Dec 1, 2014 at 1:58 AM, Matt Calabrese <rivorus@gmail.com> wrote:
Why? What makes the string default ordering "reasonable" and complex or variant not? There are a lot of ways to order strings that might even be more sensible in more cases than the current default (e.g. having different
I think this kind of reasoning isn't very fruitful. Optional is not a string, string is not an optional. One having operator< does not imply the other should have it too.

On Sun, Nov 30, 2014 at 5:19 PM, Olaf van der Spek <ml@vdspek.org> wrote:
On Mon, Dec 1, 2014 at 1:58 AM, Matt Calabrese <rivorus@gmail.com> wrote:
Why? What makes the string default ordering "reasonable" and complex or variant not? There are a lot of ways to order strings that might even be more sensible in more cases than the current default (e.g. having different
I think this kind of reasoning isn't very fruitful. Optional is not a string, string is not an optional. One having operator< does not imply the other should have it too.
Hmm? What isn't reasonable is arbitrarily deciding what should and should not have a default ordering (not talking about operator< since that's a much more controversial matter). The notion of default ordering is useful because many datastructures and algorithms can make use of it, particularly those in the standard. There is absolutely nothing wrong with ordering tuples or optionals or variants, or any type at all. The only thing you accomplish by not having a default ordering is that you arbitrarily make the type difficult to use with these datastructures and algorithms. The recommendation of "prefer unordered set" is already an example of the type of poor programming advice that comes from this as a side effect. -- -Matt Calabrese

On Mon, Dec 1, 2014 at 2:31 AM, Matt Calabrese <rivorus@gmail.com> wrote:
I think this kind of reasoning isn't very fruitful. Optional is not a string, string is not an optional. One having operator< does not imply the other should have it too.
Hmm? What isn't reasonable is arbitrarily deciding what should and should not have a default ordering (not talking about operator< since that's a
True, but it's NOT an arbitrary decision, is it?
much more controversial matter). The notion of default ordering is useful because many datastructures and algorithms can make use of it, particularly
What std datastructures use it?
those in the standard. There is absolutely nothing wrong with ordering tuples or optionals or variants, or any type at all.
That's what you keep saying but you don't back it up.
The only thing you accomplish by not having a default ordering is that you arbitrarily make the type difficult to use with these datastructures and algorithms. The
Having to explicitly pass in a comparator is not difficult, is it?
recommendation of "prefer unordered set" is already an example of the type of poor programming advice that comes from this as a side effect.
-- Olaf

On Mon, Dec 1, 2014 at 3:22 AM, Olaf van der Spek <ml@vdspek.org> wrote:
On Mon, Dec 1, 2014 at 2:31 AM, Matt Calabrese <rivorus@gmail.com> wrote:
I think this kind of reasoning isn't very fruitful. Optional is not a string, string is not an optional. One having operator< does not imply the other should have it too.
Hmm? What isn't reasonable is arbitrarily deciding what should and should not have a default ordering (not talking about operator< since that's a
True, but it's NOT an arbitrary decision, is it?
It is arbitrary until and unless people actually explain what they feel is the difference between cases that they consider "reasonable" and "unreasonable." This is particularly true when one container that uses a lexicographical compare is considered "reasonable" while another container that uses a lexicographical compare is considered "unreasonable." Why you want a default order is generally analogous in both situations and denying one while not the other without any objective rationale is a very weak decision. All types can be ordered and all types that have more than one value can be ordered in multiple ways. No particular ordering is applicable in all situations for any type -- even integers. That does not mean that a default is unimportant since frequently you require ordering for a datastructure or algorithm to work, but the actual ordering may be unimportant to the user or to the implementation. This is the whole point, which I mentioned in every single reply.
much more controversial matter). The notion of default ordering is useful because many datastructures and algorithms can make use of it, particularly
What std datastructures use it?
The ones that we've been talking about for the entire thread in almost every reply. I'm not going to repeat again.
those in the standard. There is absolutely nothing wrong with ordering
tuples or optionals or variants, or any type at all.
That's what you keep saying but you don't back it up.
Again, I have done so multiple times in this thread, both with concrete examples and with reference to why it is important in more generic code. You can choose to put your fingers in your ears if you want, but that changes nothing. Frankly, it's sort of unfortunate that the usefulness of ordering for /any/ type needs to be re-explained in the context of C++, since it's such a fundamental concept that has been talked about many times before, not just by myself in this thread but by key C++ people, e.g. Stepanov. Especially when you are creating a generic composite type that you expect people to use, you should be conscious of ordering.
The only thing you
accomplish by not having a default ordering is that you arbitrarily make the type difficult to use with these datastructures and algorithms. The
Having to explicitly pass in a comparator is not difficult, is it?
As explained, it is both needless and can also be an arbitrary decision even in non-generic code, let alone in generic code where all you usually care about is that an ordering exists rather than what that particular ordering is. Ideally you always forward the comparator along when writing generic code that needs ordering, using a default of std::less just as the standard does for its generic containers, and yet the default is still useful. In fact, default comparators are so useful in the standard library that plenty of C++ programmers don't even know that the comparators are template arguments at all, yet they get along fine. I doubt that explicitly passing the comparator is "not difficult" to them. Regardless, stating that explicitly passing an argument where a default can be used is "not difficult" is a cop-out since you can always say the same thing regarding any default in any context. In this particular context, that hypothetical lack of a default comparator for the types in question has already led to the recommendation of preferring entirely different associative datastructures over passing in an explicit comparator in this very thread (ironically, the other containers that were recommended also depend on a subjective default, only it's the default hashing function instead of the default ordering function). You should not be afraid to have your types be ordered when they can be ordered. Most decisions in development have a subjective aspect to them and precisely which default ordering comparator you provide is just one of those many decisions. The fact that a default is subjective (and it always is, even for arithmetic types), does not mean that a default is not useful. Why do we use less as a default for arithmetic types with respect to a set? Wouldn't greater be just as sensible? What about the other orderings? 
We only accept that it is "reasonable" now to have less be the default ordering because that was the decision that was made decades ago and people are used to it. It. Is. Just. A. Default. -- -Matt Calabrese

On Mon, Dec 1, 2014 at 5:29 PM, Matt Calabrese <rivorus@gmail.com> wrote:
It is arbitrary until and unless people actually explain what they feel is the difference between cases that they consider "reasonable" and "unreasonable." This is particularly true when one container that uses a lexicographical compare is considered "reasonable" while another container that uses a lexicographical compare is considered "unreasonable." Why you want a default order is generally analogous in both situations and denying one while not the other without any objective rationale is a very weak decision. All types can be ordered and all types that have more than one value can be ordered in multiple ways. No particular ordering is applicable in all situations for any type -- even integers. That does not mean that a default is unimportant since frequently you require ordering for a datastructure or algorithm to work, but the actual ordering may be unimportant to the user or to the implementation. This is the whole point, which I mentioned in every single reply.
IMO integers have a natural ordering, optionals don't. In most cases the natural ordering for integers is fine, whether the proposed ordering for optionals is fine isn't clear.
much more controversial matter). The notion of default ordering is useful because many datastructures and algorithms can make use of it, particularly
What std datastructures use it?
The ones that we've been talking about for the entire thread in almost every reply. I'm not going to repeat again.
vector? Doesn't use std::less for operator<, does it? map/set? Sure, maybe, most of the time
Again, I have done so multiple times in this thread, both with concrete examples and with reference to why it is important in more generic code. You can choose to put your fingers in your ears if you want, but that changes nothing. Frankly, it's sort of unfortunate that the usefulness of ordering for /any/ type needs to be re-explained in the context of C++, since it's such a fundamental concept that has been talked about many times before, not just by myself in this thread but by key C++ people, e.g. Stepanov. Especially when you are creating a generic composite type that you expect people to use, you should be conscious of ordering.
Lots of types survive just fine without (default) ordering.
The only thing you
accomplish by not having a default ordering is that you arbitrarily make the type difficult to use with these datastructures and algorithms. The
Having to explicitly pass in a comparator is not difficult, is it?
As explained, it is both needless and can also be an arbitrary decision even in non-generic code, let alone in generic code where all you usually care about is that an ordering exists rather than what that particular ordering is. Ideally you always forward the comparator along when writing generic code that needs ordering, using a default of std::less just as the standard does for its generic containers, and yet the default is still useful. In fact, default comparators are so useful in the standard library that plenty of C++ programmers don't even know that the comparators are template arguments at all, yet they get along fine. I doubt that explicitly passing the comparator is "not difficult" to them. Regardless, stating that explicitly passing an argument where a default can be used is "not difficult" is a cop-out since you can always say the same thing regarding any default in any context. In this particular context, that hypothetical
I think I'm still thinking about operator< Other than map, do you have a reference for all these algorithms, etc., that would benefit from less<optional<>>?
lack of a default comparator for the types in question has already led to the recommendation of preferring entirely different associative datastructures over passing in an explicit comparator in this very thread (ironically, the other containers that were recommended also depend on a subjective default, only it's the default hashing function instead of the default ordering function).
How is the hash function subjective w.r.t. semantics?
You should not be afraid to have your types be ordered when they can be ordered. Most decisions in development have a subjective aspect to them and precisely which default ordering comparator you provide is just one of those many decisions. The fact that a default is subjective (and it always is, even for arithmetic types), does not mean that a default is not useful. Why do we use less as a default for arithmetic types with respect to a set? Wouldn't greater be just as sensible?
No, IMO, though for a lot of uses it probably wouldn't matter.
What about the other orderings? We only accept that it is "reasonable" now to have less be the default ordering because that was the decision that was made decades ago and people are used to it. It. Is. Just. A. Default.
-- -Matt Calabrese
-- Olaf

On Mon, Dec 1, 2014 at 10:58 AM, Olaf van der Spek <ml@vdspek.org> wrote:
IMO integers have a natural ordering, optionals don't.
Yes, you've said that.
The ones that we've been talking about for the entire thread in almost
every reply. I'm not going to repeat again.
vector? Doesn't use std::less for operator<, does it? map/set?
Not vector. I was referring to set and map, whose use frequently doesn't depend on what the ordering actually is, but often only that there simply is an ordering. While on the topic though, the standard containers such as std::vector /do/ use std::less for their operator< (even though the requirements of the operation are [inconsistently] specified on the value type's operator< rather than their default ordering function). This was what I was alluding to when I said that the standard messes up requirement specification for its comparisons. To twist the knife in the standard committee's chest a bit, these types of requirement specification issues would not exist at all had we not abandoned C++0x-style concepts (analogues of which are proving to work pretty well in Swift...). Checking of constrained templates via archetypes would catch this particular class of incorrectly constrained templates at compile-time, which is a feature that concepts-lite advocates decry as unimportant. The fact that the standard itself messes up requirement specification in this manner, despite the scrutiny it undergoes, should be reason enough to reconsider...
Lots of types survive just fine without (default) ordering.
Simply surviving isn't all that one should strive for with respect to general, composite types that are defined in boost or the standard library.
I think I'm still thinking about operator<
Other than map, do you have a reference for all these algorithms, etc., that would benefit from less<optional<>>?
Any algorithm that takes a comparator and has a default (there are plenty in the standard library). This is no different from the set and map cases where you often don't care about the specific order, but rather, only that an order exists. Depending on your specific use, this can be true with respect to algorithms such as std::sort and std::binary_search, for example.
lack of a default comparator for the types in question has already led to
the recommendation of preferring entirely different associative datastructures over passing in an explicit comparator in this very thread (ironically, the other containers that were recommended also depend on a subjective default, only it's the default hashing function instead of the default ordering function).
How is the hash function subjective w.r.t. semantics?
Not all hash functions are created equal and there are reasons to explicitly specify one as opposed to using the default, just as there are reasons to explicitly specify an ordering function.
You should not be afraid to have your types be ordered when they can be ordered. Most decisions in development have a subjective aspect to them and precisely which default ordering comparator you provide is just one of those many decisions. The fact that a default is subjective (and it always is, even for arithmetic types), does not mean that a default is not useful. Why do we use less as a default for arithmetic types with respect to a set? Wouldn't greater be just as sensible?
No, IMO, though for a lot of uses it probably wouldn't matter.
Why no? Why is "less" as a default objectively better than "greater" as a default? It's simply not; we just use it by convention. The same is true for why we frequently define other operators in terms of operator<. We could just as easily have decided that operator> was the "base" operation to be used as a default and around which we built other operators. < is nothing special and not intrinsically better. I understand that you personally intuitively like < best, but you're only kidding yourself if you think that there is something deeper. -- -Matt Calabrese

On Mon, Dec 1, 2014 at 8:44 PM, Matt Calabrese <rivorus@gmail.com> wrote:
Not vector. I was referring to set and map, whose use frequently doesn't
Just two? That's not what I'd call many datastructures.
depend on what the ordering actually is, but often only that there simply is an ordering. While on the topic though, the standard containers such as std::vector /do/ use std::less for their operator< (even though the requirements of the operation are [inconsistently] specified on the value type's operator< rather than their default ordering function). This was
I'm not sure what you're saying. Requirements specify operator< but they're using std::less? I'm confused.
I think I'm still thinking about operator<
Other than map, do you have a reference for all these algorithms, etc., that would benefit from less<optional<>>?
Any algorithm that takes a comparator and has a default (there are plenty in the standard library). This is no different from the set and map cases where you often don't care about the specific order, but rather, only that an order exists. Depending on your specific use, this can be true with respect to algorithms such as std::sort and std::binary_search, for example.
True
lack of a default comparator for the types in question has already led to
the recommendation of preferring entirely different associative datastructures over passing in an explicit comparator in this very thread (ironically, the other containers that were recommended also depend on a subjective default, only it's the default hashing function instead of the default ordering function).
How is the hash function subjective w.r.t. semantics?
Not all hash functions are created equal and there are reasons to explicitly specify one as opposed to using the default, just as there are reasons to explicitly specify an ordering function.
True, but not an answer to the question.
You should not be afraid to have your types be ordered when they can be ordered. Most decisions in development have a subjective aspect to them and precisely which default ordering comparator you provide is just one of those many decisions. The fact that a default is subjective (and it always is, even for arithmetic types), does not mean that a default is not useful. Why do we use less as a default for arithmetic types with respect to a set? Wouldn't greater be just as sensible?
No, IMO, though for a lot of uses it probably wouldn't matter.
Why no? Why is "less" as a default objectively better than "greater" as a default?
Because in the majority of cases people order stuff in ascending order.
It's simply not, we just use it by convention. The same is true for why we frequently define other operators in terms of operator<. We could just as easily have decided that operator> was the "base" operation to be used as a default and around which we built other operators. < is nothing special and not intrinsically better. I understand that you personally intuitively like < best, but you're only kidding yourself if you think that there is something deeper.
True, operator> could've been used. But then default order would probably still have been ascending... -- Olaf

On Tue, Dec 2, 2014 at 5:10 AM, Olaf van der Spek <ml@vdspek.org> wrote:
On Mon, Dec 1, 2014 at 8:44 PM, Matt Calabrese <rivorus@gmail.com> wrote:
Not vector. I was referring to set and map, whose use frequently doesn't
Just two? That's not what I'd call many datastructures.
Well, technically 4 (multi-versions), but keep in mind that that is 100% of the ordered datastructures. This is also just in the standard. Other library developers can and do use std::less appropriately.
depend on what the ordering actually is, but often only that there simply is an ordering. While on the topic though, the standard containers such as std::vector /do/ use std::less for their operator< (even though the requirements of the operation are [inconsistently] specified on the value type's operator< rather than their default ordering function). This was
I'm not sure what you're saying. Requirements specify operator< but they're using std::less? I'm confused.
Yes, that is what I'm saying. In the C++ standard, when library components are specified, it describes concept requirements and contracts on types and functions involved. Sometimes it goes further and specifies more specific implementation details. In the case of the comparison operators, the standard states that the requirements are that operator< of the value type of the container must be defined and that it forms a total order. However, when it specifies what the operators should do, that specification is in terms of std::lexicographical_compare, using the default ordering (std::less). These two specifications do not match as std::less and operator< do not necessarily relate (they differ with respect to certain fundamental types and library components, not to mention that they can differ in user code). Either the requirements or the implementation are incorrectly specified. My jab regarding concepts was that the (good) C++0x concepts proposal could have potentially caught this kind of mistake when a vendor implemented the library because the definitions would have been constrained. However, this type of checking is not a part of concepts-lite and its design essentially precludes that checking from getting there in future revisions. Concepts-lite deems that kind of checking "unimportant" even though the standard could have benefited from the feature (not to mention other library developers).
Not all hash functions are created equal and there are reasons to explicitly specify one as opposed to using the default, just as there are reasons to explicitly specify an ordering function.
True, but not an answer to the question.
I'm not sure what you mean with respect to semantics in this context. Regarding ordering, the semantics that matter don't differ either.
Why no? Why is "less" as a default objectively better than "greater" as a
default?
Because in the majority of cases people order stuff in ascending order.
First, I'm not even sure that that's true, and you certainly don't know that it is either (beyond that, it could be entirely incidental). Even if it were true, it also has no bearing on the discussion. It could be 99% of the time and that would make no difference. -- -Matt Calabrese

On Tue, Dec 2, 2014 at 1:46 PM, Matt Calabrese <rivorus@gmail.com> wrote:
However, when it specifies what the operators should do, that specification is in terms of std::lexicographical_compare, using the default ordering (std::less). These two specifications do not match as std::less and operator< do not necessarily relate (they differ with respect to certain fundamental types and library components, not to mention that they can differ in user code). Either the requirements or the implementation are incorrectly specified.
My mistake, I was wrong here -- lexicographical_compare does use < by default and not std::less. -- -Matt Calabrese

On 1/12/2014 13:58, Matt Calabrese wrote:
Anyway, why specifically do you consider a tuple having a default ordering as a bug? All it is, is a lexicographical order, just like std::string and just like the other containers of the standard library. It just happens to be heterogeneous and of a fixed size. Why do you consider it not "reasonable" to have a default ordering as a lexicographical ordering for tuples and yet consider it "reasonable" to have a default ordering as a lexicographical ordering for std::string? The distinction that is being made seems entirely arbitrary unless you or someone else can (1) specify unambiguously what that supposed difference actually is including how you would generalize that distinction, and (2) explain why such a difference is actually important with respect to defining a default ordering via std::less (or a proposed std::order). I don't buy the distinction being worthwhile either in practice or in theory. As far as I can tell it is arbitrary, inconsistent, and without any objective rationale.
Most containers do not implement ordering relations of the container itself. String is an exception because people generally think of it as a value type rather than as a container of chars. Why should a std::pair<iterator, bool> have an order? In what code would you ever try to sort by such a thing? Generically sortable tuples and variants are similarly meaningless. It's easy to add order comparisons in where required. It's hard to remove them once present. This tells me that they should not be included unless there is a clear advantage in doing so, and I am not convinced that this is the case.
I'm sorry, but that is simply bad programming advice. Putting a tuple in a set is not using the "wrong" datastructure. A tuple can be correctly ordered just like any other type and the same is true of optionals or variants. There not being a single default order that everyone would expect without referencing documentation is completely separate from the choice of whether a programmer should use a set or an unordered set. Claiming that this should be a reason to choose one datastructure over another is horrible.
Define "correctly ordered" for an arbitrary tuple. If this definition can vary from application to application or from context to context, then the definition should not be baked in to the tuple itself, but instead supplied only where actually needed. (It sounds like you are arguing that in this case the programmer should specify the tuple type specifically with its library-mandated ordering characteristics in mind, instead of whatever other logical field order they prefer. Especially if different orderings are required in different contexts, the compile-time type seems like the wrong place to be defining this.) If the programmer does not require ordering, then why use a container that requires it, instead of one that does not? Unless there is some other deficiency of unordered_set vs. set (other than relative newness and unfamiliarity) that I am unaware of, this seems obvious to me, and I'm not sure why you seem to oppose it so strongly. (To me, a "set" should use equality only and should never have been ordered in the first place, but that ship has long sailed and it's pointless to discuss it further.)
What's with the "want" here? Why would you not want an order? I can see not needing it for a certain application and not minding if it's there, but you make it sound like it's somehow a bad thing for it to even exist. If you personally don't need it then you don't have to use it.
Except that where this whole discussion began was where someone was using it unintentionally. If it wasn't there (unless specifically requested), that wouldn't be an issue. Perhaps your domain is different, but in my experience wanting to sort things (particularly in a single way) is the exception rather than the rule. Requiring types to be internally sortable therefore seems like an unnecessary burden. This is starting to get quite far afield from the original issue though. The fundamental point was that using op< on distinct types (in this case optional<T> and T) is more likely to be a bug than it is to be intentional code. My generalisation was that even op< on two optional<T>s is suspicious as some people don't seem to agree where "none" sorts, or even what "none" means in some cases -- and because others were arguing that it was inconsistent to define one op< without the other. My further generalisation was that it's probably also a bug to try to order two variant<X,Y> where one was X and the other Y. (Sure, there are occasions where it is useful to be able to do so. But the programmer should do that explicitly rather than using op<, due to the principle of least surprise, and because the required sort order may be different in different contexts.) And again, when I say "order" above I am referring only to op< and op> and friends. I am *not* referring to op== and op!=, which are much more sensible for all cases.

On Sun, Nov 30, 2014 at 5:49 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
Most containers do not implement ordering relations of the container itself.
Yes they do. All of the standard containers provide lexicographical ordering.
String is an exception because people generally think of it as a value type rather than as a container of chars.
Why should a std::pair<iterator, bool> have an order? In what code would you ever try to sort by such a thing? Generically sortable tuples and variants are similarly meaningless.
Why are you so quick to declare something as meaningless simply because you personally don't immediately see the use? Putting those objects in a set is a meaningful use of ordering, and you've even agreed that it's fine to do so (though you have some kind of arbitrary preference for unordered containers for these constructs). You also said in this very thread that you don't use tuples, so why then do you feel as though you are so qualified as to make an authoritative claim that ordering them is meaningless? The fact is that all types can be ordered and there is nothing wrong with using that ordering in generic (or non-generic) code. All a pair or a tuple is, is a container, much in the same way as the standard containers but with a separate concept. The difference is just that tuples and pairs are compile-time sized and heterogeneous while the standard containers are (mostly) run-time sized and homogeneous. As for why one might take advantage of such an ordering? I already gave you an example in this thread where I personally used the ordering of variants, though you mysteriously ignored it and again make the claim that ordering is simply "meaningless." I have more examples if you want, since I actually use variants and tuples pretty much every day. So is ordering meaningful? Of course it is. It is in my concrete example and it is in more abstract, generic code where tuples and variants tend to manifest themselves either internally for storage or when producing results. As someone who actually uses these types in practice and who uses ordering of them, I am flat out telling you that ordering them is, in fact, useful, and it does, in fact, make sense.
It's easy to add order comparisons in where required. It's hard to remove them once present.
Prefaced by the reiteration that I'm not talking about the operators, only std::less/std::order, just so we don't get side-tracked about the specific meaning of <. I don't really much care if < and the like are defined. I consider that separate from having a default ordering. That said, why do you think that it would be hard to remove the ordering once present? If order here were /actually/ meaningless as you repeatedly claim, then it would be trivial to remove, since nobody would be using it in meaningful code.
Define "correctly ordered" for an arbitrary tuple. If this definition can vary from application to application or from context to context, then the definition should not be baked in to the tuple itself, but instead supplied only where actually needed.
"Correctly" meaning that it forms a mathematical order, as in it's not an erroneous definition and the order is well specified. Also, the ordering isn't "baked in," it's just a default. You /can/ explicitly specify an ordering when you need one that's different from the default. In generic code, having a default is particularly useful because frequently you do not care what the order is, only that it /is/ ordered so that you can do something like put it in a set, or sort data and binary search, etc.
(It sounds like you are arguing that in this case the programmer should specify the tuple type specifically with its library-mandated ordering characteristics in mind, instead of whatever other logical field order they prefer. Especially if different orderings are required in different contexts, the compile-time type seems like the wrong place to be defining this.)
No, that's not what I'm saying. A user /can/ do that if they choose, and there's nothing wrong with that (my variant example is a reasonable example of this). For all types, including built-in arithmetic types, the default order is not always what is desired. So what? It is a default. That's all. To reiterate again, code frequently requires order and users don't necessarily care what that order is. Pointers are a great example of that. Having a default is useful and it doesn't cause harm.
If the programmer does not require ordering, then why use a container that requires it, instead of one that does not? Unless there is some other deficiency of unordered_set vs. set (other than relative newness and unfamiliarity) that I am unaware of, this seems obvious to me, and I'm not sure why you seem to oppose it so strongly.
They are entirely different datastructures with different complexities and practical implementation implications both in terms of storage and run-time. The unordered containers are not "deficient," they are just different datastructures entirely and you should not base your decision on which container to use just because someone chose there to not be a default ordering for a given type. Not only should you not "just" make the choice based on the lack of a default, but it shouldn't ever be a part of the consideration at all whether there is a default or not. Also, it's not simply a matter of choosing between two containers where one has more strict requirements than another as you seem to have just implied by "If the programmer does not require ordering, then why use a container that requires it, instead of one that does not?". The unordered containers don't have a subset of the requirements of the ordered ones, the requirements are just different. Specifically, unordered containers require a hashing function while ordered containers require an ordering function. I could very easily just say "why would someone use a container that requires a hashing function instead of one that does not?" but I wouldn't because that would be just as flawed reasoning. Anyway, even if we really were dealing with a superset of requirements, I don't buy that you should prefer the container that has fewer requirements on principle. In practice, you usually want to do the opposite and favor the implementation with more-strict requirements, since that usually allows for a more specialized implementation. This is because the implementation can take advantage of more functionality and/or more specific semantics. A good example of that for an algorithm is std::advance, and for datastructures, see any datastructure that can take advantage of triviality.
The thing is, absolutely none of this talk of requirements really applies here, though, since as I pointed out, the ordered and unordered containers are completely different datastructures and so comparison of their requirements is misleading. In the case at hand, the types /can/ be ordered, so whether or not those types have a /default/ order should have no bearing on whether you use an ordered or unordered set. Both of those choices are perfectly valid and the decision to choose one or the other has nothing to do with defaults.
Except that where this whole discussion began was where someone was using it unintentionally. If it wasn't there (unless specifically requested), that wouldn't be an issue.
No, people were misusing /operators/, not std::less -- particularly the mixed-type operators. We are not talking about operators here, as I've tried to restate in several replies just to be clear. We are talking about default ordering for a type (i.e., specifying a std::less specialization or some proposed std::order). Since we're on the topic now, though, at some point a claim was made that the mixed-type comparisons can come about due to there being implicit conversions, and it derailed into removing the homogeneous operators because of that. As I mentioned earlier, this conversion issue isn't actually true: when you define your operator as a template and you do not explicitly specify a template argument when invoking it (as few people would do, unless perhaps they want such a conversion), you will /not/ get the implicit conversion. That overload will not be picked. If people were seeing such conversions, I assume that the operators were defined as friend functions inline to the class. If that is the only issue that people have, then simply changing the operators to be templates and defining them outside of the class would fix that issue.
Perhaps your domain is different, but in my experience wanting to sort things (particularly in a single way) is the exception rather than the rule. Requiring types to be internally sortable therefore seems like an unnecessary burden.
No, there is nothing different here and I agree entirely. You should always minimize requirements. Perhaps what you're missing is that we're /not/ requiring anything at the container level (i.e., tuple or optional or variant). The requirements are only on the ordering function. Your types do not need to have a default order to put them in a std::vector or a std::tuple or an optional or a variant. The ordering function is what has the requirement, not the container itself. This has always been true in the standard library with respect to the standard containers (it's even true with respect to equality, not just ordering). If your contained types don't have a well-defined < that provides a total order then the container type doesn't have a well-defined order either and that's fine. For instance, you can put stuff in a std::vector if it doesn't have a well-defined == or <. If you choose to use the operators, that's another story, but that set of requirements is separate from the container requirements.
This is starting to get quite far afield from the original issue though. The fundamental point was that using op< on distinct types (in this case optional<T> and T) is more likely to be a bug than it is to be intentional code.
I agree that use is generally suspect.
My generalisation was that even op< on two optional<T>s is suspicious as some people don't seem to agree where "none" sorts, or even what "none" means in some cases -- and because others were arguing that it was inconsistent to define one op< without the other.
Here's where I tend to disagree (I'll go back to talking about order in general, though, so that we don't get derailed regarding operators that should or should not exist). Specifying ordering for any type at all is always a subjective decision. If someone sees an order comparison with an optional, no matter how that function is spelled, should that programmer A) make an assumption about ordering and not look it up or B) look up the ordering? Whether the function is spelled operator< or std::less or std::order seems unimportant to me in that particular respect. In every one of those cases the programmer would have to look up what the order is if they wanted to use it in a specific way. Because of that, I don't think it's a valid rationale to exclude the function. You always need to know what any function does when you use it. A default ordering function isn't particularly special.
My further generalisation was that it's probably also a bug to try to order two variant<X,Y> where one was X and the other Y. (Sure, there are occasions where it is useful to be able to do so. But the programmer should do that explicitly rather than using op<, due to the principle of least surprise, and because the required sort order may be different in different contexts.)
Again, the required sort order being different in different contexts is always true. This is not special to optional or variant or tuple or anything else. It's true for integers just the same. So what? If the default order is sensible for your use then use it. If it's not then don't use it. -- -Matt Calabrese

On 1/12/2014 16:57, Matt Calabrese wrote:
On Sun, Nov 30, 2014 at 5:49 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
Most containers do not implement ordering relations of the container itself.
Yes they do. All of the standard containers provide lexicographical ordering.
I stand corrected. I must admit to being unaware that these existed; I think I missed them in the C++ library docs I was looking at as they listed non-member functions in a completely separate location. Perhaps that does undermine some of my arguments. :)
As for why one might take advantage of such an ordering? I already gave you an example in this thread where I personally used the ordering of variants, though you mysteriously ignored it and again make the claim that ordering is simply "meaningless." I have more examples if you want, since I actually use variants and tuples pretty much every day. So is ordering meaningful? Of course it is. It is in my concrete example and it is in more abstract, generic code where tuples and variants tend to manifest themselves either internally for storage or when producing results. As someone who actually uses these types in practice and who uses ordering of them, I am flat out telling you that ordering them is, in fact, useful, and it does, in fact, make sense.
Perhaps "meaningless" was the wrong word. I meant something closer to "unobvious", as in the ordering relationship is less visible to a casual reader of the code, and could be easily misinterpreted or broken by a future programmer. Still, I suppose that's what unit tests are for (when they actually exist).
That said, why do you think that it would be hard to remove the ordering once present? If order here were /actually/ meaningless as you repeatedly claim, then it would be trivial to remove, since nobody would be using it in meaningful code.
If a library has defined operator<(T,T), then that operator exists and there is (as far as I am aware) no way to remove it short of editing the library code. Conversely, if the library has not defined the operator, there is nothing (bar namespace etiquette, perhaps) preventing the application from doing so itself, as long as it doesn't need private data to do so. I suppose that could be resolved by an application that really wants to get rid of the ordering relation defining a new class that delegates to the original and mirrors its interface apart from that group of methods/operators, but this seems like a massive headache for everybody and doesn't really solve unintended mixups.
The thing is, absolutely none of this talk of requirements really applies here, though, since as I pointed out, the ordered and unordered containers are completely different datastructures and so comparison of their requirements is misleading. In the case at hand, the types /can/ be ordered, so whether or not those types have a /default/ order should have no bearing on whether you use an ordered or unordered set. Both of those choices are perfectly valid and the decision to choose one or the other has nothing to do with defaults.
As I said, the idea was that while you *could* define a default sort, that would be surprising in many cases (eg. given a variant<int, double>, 78 would be less than 42.5), so putting it into an ordered container will not necessarily have the expected ordering. I suppose there are still performance benefits from being able to eg. binary search based on some order, even if it's not the final display order, though.
No, people were misusing /operators/ not std::less -- particularly the mixed-type operators. We are not talking about operators here as I've tried to restate in several replies just to be clear. We are talking about default ordering for a type (I.E. specifying a std::less specialization or some proposed std::order).
I'm not sure I want to live in a world where std::less wasn't equivalent to operator<, where the latter exists. (At least at the library layer; if the application wants to mess with it that's their own business.)
My generalisation was that even op< on two optional<T>s is suspicious as some people don't seem to agree where "none" sorts, or even what "none" means in some cases -- and because others were arguing that it was inconsistent to define one op< without the other.
Here's where I tend to disagree (I'll go back to talking about order in general, though, so that we don't get derailed regarding operators that should or should not exist). Specifying ordering for any type at all is always a subjective decision. If someone sees an order comparison with an optional, no matter how that function is spelled, should that programmer A) make an assumption about ordering and not look it up or B) look up the ordering? Whether the function is spelled operator< or std::less or std::order seems unimportant to me in that particular respect. In every one of those cases the programmer would have to look up what the order is if they wanted to use it in a specific way. Because of that, I don't think it's a valid rationale to exclude the function. You always need to know what any function does when you use it. A default ordering function isn't particularly special.
The case in question is where you were using it without realising it; in the stated example perhaps when code that used to have int parameters was changed to optional<int> without amending the operator use accordingly. That's a programmer error, of course, but it would have been nice if it were also a compiler error. (Sometimes it's less obvious, when there are layers of typedefs and metafunctions in the way, especially when these come from external code.) Admittedly the compiler can't catch everything and it's a philosophical question on how much "hand holding" the library should do to help avoid mistakes, but this question did arise in the context of making a "safer" API for optional. Perhaps I ran too far with the ball, but sometimes that's how you learn where the lines in the sand are. (Have I mixed enough metaphors yet?)

On Sun, Nov 30, 2014 at 11:12 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
I'm not sure I want to live in a world where std::less wasn't equivalent to operator<, where the latter exists. (At least at the library layer; if the application wants to mess with it that's their own business.)
I actually tend to agree. I know that as things are now, std::less and operator< have different meaning and I think we need to play by those rules, but that to me has always been a bit wonky, too. If std::less were simply called std::order, that would make things much better.

To be honest, though, even separate from that, I'm also in the camp with whoever it was that said that we should simply accept that operator< isn't necessarily "less" but rather is a generalization of less that is considered to mean "order," and which just so happens to be less when in the domain of the built-in arithmetic types.

Even then there are still problems, though, because floating point types have things like NaNs that behave in a way that means that neither operator< nor std::less provide order, which is a much bigger horror than < not strictly meaning "less," IMO. At least for pointers we have std::less appropriately specialized. As things are now, you can't even use std::sort on a range of floats or doubles with the default order if there is a chance that you have NaNs, otherwise you have undefined behavior (and it's not even benign undefined behavior in common implementations).

With respect to the standard, we have to always be careful to make sure operator< is defined in terms of operator< (similar for other operators) and std::less and family are defined in terms of themselves. Requirement specifications need to accurately reflect this, too. It's unfortunate that even the standard library messes this stuff up when it can on both accounts. It's all unnecessarily confusing and error prone. Tony made very positive progress with respect to optional ordering, but ultimately I think the standard should change at a more basic level as well. There are very few people out there who actually care or know enough to properly define these operations with respect to generic composite types, and the standard is partly to blame for that.
The case in question is where you were using it without realising it; in the stated example perhaps when code that used to have int parameters was changed to optional<int> without amending the operator use accordingly.
That's a programmer error, of course, but it would have been nice if it were also a compiler error. (Sometimes it's less obvious, when there are layers of typedefs and metafunctions in the way, especially when these come from external code.)
I think the specific problem in the original example could actually be remedied with a compiler error, though, while still providing operator< for optional<T>, optional<T> if it is still desired. In the example, the person compared a T with an optional<T>. It was suggested by someone that even if operator overloads for optional<T>, T were removed, the code would still compile because of the implicit conversion, which isn't actually true. The code would not compile when the operator is specified to be a template as opposed to an inline friend. It seems to me that the least controversial option, and the one least likely to break valid code while still making the problem case produce an error, if that is desirable, is to just remove any optional<T>, T overloads and make the optional<T>, optional<T> overloads templates. -- -Matt Calabrese

On Thu, Nov 27, 2014 at 8:08 PM, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
0 - std::less should be same as op< (I think we all agree here, actually, which is why we should have std::order)
Only if op< is defined.
Do you mean only *the same* if op< is defined, or only *exists* if op< is defined? If op< isn't defined (ie complex<> or, technically, pointers) do you still want std::less?
1 - op<(optional<T>, optional<T>) is "sensible" but somewhat arbitrary
I don't agree here. We don't need anything arbitrary. op<(optional<T>, optional<T>) must be defined only if op<(T,T) is defined.
It is arbitrary in the sense of where "none" is ordered. I (strongly!) agree with the "only if op<(T,T) is defined" part. (And famously believe op>=(optional<T>,optional<T>) should be based on op>=(T,T), instead of !op<(T,T).)
2 - op<(optional<T>, T) is questionable, often (usually?) wrong
Agreed.
3 - in general, func(optional<T>, optional<T>) should work if passed T's (ie implicit conversion is important)
I have my doubts here. Implicit conversions often (usually?) provide unexpected behavior.
I think we need a general guideline for the std library for when implicit is OK. ie Is it OK for dumb_ptr? string_view? etc. That's something I'd like to work on. I _think_ there is a general guideline, but maybe it is too case-by-case?
4 - op<(optional<T>, optional<T>) *is* of the form func(optional<T>, optional<T>) - ie why should it be different?
Nothing to say here ;-)
Currently we have std::less, not std::order, and the STL ordered containers use std::less as their default comparator. So let's define std::less<optional<T>> using std::less<T>.
Yes. We currently have that much - std::less<optional<T>> is built with std::less<T>, not op<(T,T). Even if only for the sake of pointers on almost-non-existent hardware (and maybe future hardware). Tony

On Fri, Nov 28, 2014 at 11:22 AM, Gottlob Frege <gottlobfrege@gmail.com> wrote:
I (strongly!) agree with the only if op<(T,T) is defined. (And famously believe op>=(optional<T>,optional<T>) should be based on op>=(T,T), instead of !op<(T,T).)
Just want to say that I think this is important and was really upset when tuple, IMHO, got it wrong. -- -Matt Calabrese

On Fri, Nov 28, 2014 at 3:12 PM, Matt Calabrese <rivorus@gmail.com> wrote:
On Fri, Nov 28, 2014 at 11:22 AM, Gottlob Frege <gottlobfrege@gmail.com> wrote:
I (strongly!) agree with the only if op<(T,T) is defined. (And famously believe op>=(optional<T>,optional<T>) should be based on op>=(T,T), instead of !op<(T,T).)
Just want to say that I think this is important and was really upset when tuple, IMHO, got it wrong.
Well, for tuples of more than one member, there is no nice way of generating all the relations *without making an assumption about how the relations are related*. So I'm fine with (and have tried to argue that) tuple, pair, vector, etc. are a different category from optional, expected, variant, any, ... ie one is a single-value wrapper, the other is a combination of values. One requires lexicographical ordering, which requires assumptions; the other doesn't, and could/should thus use the proper underlying relations. Still working on it. Tony

On 28/11/14 20:22, Gottlob Frege wrote:
On Thu, Nov 27, 2014 at 8:08 PM, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
0 - std::less should be same as op< (I think we all agree here, actually, which is why we should have std::order)
Only if op< is defined.
Do you mean only *the same* if op< is defined, or only *exists* if op< is defined?
op< for optional<T> is defined only if op< for T is defined.
If op< isn't defined (ie complex<> or, technically, pointers) do you still want std::less?
std::less<optional<T>> is defined only if std::less<T> is defined.
1 - op<(optional<T>, optional<T>) is "sensible" but somewhat arbitrary
I don't agree here. We don't need anything arbitrary. op<(optional<T>, optional<T>) must be defined only if op<(T,T) is defined.
It is arbitrary in the sense of where "none" is ordered.
Agreed. We have decided that optional<T> is in some sense none_t + T, and not T + none_t.
I (strongly!) agree with the only if op<(T,T) is defined. (And famously believe op>=(optional<T>,optional<T>) should be based on op>=(T,T), instead of !op<(T,T).)
Agree again.
2 - op<(optional<T>, T) is questionable, often (usually?) wrong
Agreed.
3 - in general, func(optional<T>, optional<T>) should work if passed T's (ie implicit conversion is important)
I have my doubts here. Implicit conversions often (usually?) provide unexpected behavior.
I think we need a general guideline for the std library for when implicit is OK. ie Is it OK for dumb_ptr? string_view? etc. That's something I'd like to work on. I _think_ there is a general guideline, but maybe it is too case-by-case?
We have worked a lot with implicit conversions because C++98 didn't have explicit ones. I would say that conversions should be explicit by default. Implicit conversion should be allowed only when there is a sub-type relationship <: between the types. This sub-type relationship should satisfy:

Antisymmetry: If we have types R, S such that R <: S with an implicit conversion from R to S, the conversion from S to R cannot be implicit (no implicit cycles); however, there should be an explicit conversion from S to R (coercion).

Transitivity: If we have types R, S, T such that R <: S <: T with implicit conversions from R to S and from S to T, there should also be an implicit conversion from R to T. That is, R <: T. Note that C++ conversions are not transitive.

Identity: R ∘ S = identity. If we have types R, S such that R <: S, then for any r : R, R(S(r)) == r. That is, there is no loss of information.

Efficiency: The cost of the implicit conversion from the subtype to the supertype is almost null. Conversions that imply a high conversion cost must be explicit.

Refinement: The conversions from/to the supertype must be defined on the subtype (this is in line with OO sub-typing: we can refine, but not generalize). Defining implicit conversion on the supertype side would invalidate valid programs. If we had

class S;
void f(S);

and now we define a subtype R of S, R <: S, with an implicit conversion from R to S:

R r;
f(r); // well formed

Now if we define a supertype T of S, S <: T, that defines implicit conversions from S to T (and, by transitivity, from R to T), and add a generalization of f:

void f(T);

The previous program would no longer be well formed:

R r;
f(r); // ambiguous conversion: f(S) and f(T)

That is, super-typing is not allowed. This point should be the most controversial, but I think it is the most important :(
Currently we have std::less, not std::order and the STL ordered containers are using as default comparator std::less. So let define std::less<optional<T>> using std::less<T>.
Yes. We currently have that much - std::less<optional<T>> is built with std::less<T>, not op<(T,T). Even if only for the sake of pointers. On almost-non-existent hardware, (and maybe future hardware).
I'm missing the wording for the definition of std::less<optional<T>> in function of std::less<T>. Could you point me where this is described? Vicente

On Sat, Nov 29, 2014 at 5:58 AM, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
We have worked a lot with implicit conversions because C++98 didn't have explicit ones. I would say that conversions should be explicit by default. Implicit conversion should be allowed only when there is a sub-type relationship <: between the types. This sub-type relationship should satisfy:
Antisymmetry: If we have types R, S such that R <: S with an implicit conversion from R to S, the conversion from S to R cannot be implicit (no implicit cycles); however, there should be an explicit conversion from S to R (coercion).
Does the coercion need to be via explicit constructor, or can it be a function? ie shared_ptr::get() ? As "test cases", I think shared_ptr and unique_ptr need explicit from-ptr constructors (for safety), but (IMO) dumb_ptr does not. Do your rules agree?
Transitivity: If we have types R, S, T such that R <: S <: T with implicit conversions from R to S and from S to T, there should also be an implicit conversion from R to T. That is, R <: T. Note that C++ conversions are not transitive.
Identity: R ∘ S = identity. If we have types R, S such that R <: S, then for any r : R, R(S(r)) == r.
That is, there is no loss of information.
Efficiency: The cost of the implicit conversion from the subtype to the supertype is almost null. Conversions that imply a high conversion cost must be explicit.
Refinement: The conversions from/to the supertype must be defined on the subtype (this is in line with OO sub-typing: we can refine, but not generalize). Defining implicit conversion on the supertype side would invalidate valid programs. If we had
class S void f(S);
and now we define a subtype R of S, R <: S, with an implicit conversion from R to S.
R r; f(r); // well formed
Now if we define a supertype T of S, S <: T, that defines implicit conversions from S to T (and, by transitivity, from R to T), and add a generalization of f
void f(T);
The previous program would no longer be well formed
R r; f(r); // ambiguous conversion f(S) and f(T)
That is, super typing is not allowed. This point should be the most controversial, but I think it is the most important :(
Currently we have std::less, not std::order and the STL ordered containers are using as default comparator std::less. So let define std::less<optional<T>> using std::less<T>.
Yes. We currently have that much - std::less<optional<T>> is built with std::less<T>, not op<(T,T). Even if only for the sake of pointers. On almost-non-existent hardware, (and maybe future hardware).
I'm missing the wording for the definition of std::less<optional<T>> in function of std::less<T>. Could you point me where this is described?
Grrrrrrrrrrrrr. I give up. It was there when I last argued for it. Not sure when it got removed. :-( Tony Van Eerd

On Sat, Nov 29, 2014 at 5:58 AM, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
We have worked a lot with implicit conversion as C++98 did have explicit ones. I would say that the conversion should be explicit by default. Implicit conversion should be allowed only when there is a sub-type relationships <: between the types. This sub-type relationship should satisfy:
Anti-symetric: If we have types R, S such that R <: S with implicit conversions from R to S , the conversion from S to R can not be implicit (no implicit cycles), however there should be an explicit conversion from S to R (coercion).
On 29/11/14 22:13, Gottlob Frege wrote:
Does the coercion need to be via explicit constructor, or can it be a function? ie shared_ptr::get() ?
I have no problem with the name of the coercion function. However, when things are uniform we can add generic checkers. The case of smart pointers is a weird one, as the stored value is a pointer to the value :( As if we had SmartPtr<T*>. Note that you also have the explicit pointer conversion. The problem is the syntax, as the wrapped type is T*:
typedef int* int_raw_ptr;
int* p = new int(24);
assert(int_raw_ptr(unique_ptr<int>(p)) == p);
As "test cases", I think shared_ptr and unique_ptr need explicit from-ptr constructors (for safety),
Tony, maybe you could turn your safety concern into a new rule/guideline ;-)
but (IMO) dumb_ptr does not.
No, observer_ptr (the old dumb_ptr) has an explicit constructor from the pointer :(
Do your rules agree?
No. No smart pointer can be considered as a sub-type of the pointed-to type. If we want to allow implicit conversion between wrapped types and wrapper types, my initial rules should be extended. But I don't think we should do it. I think we need an explicit conversion in this case. As you may know, I'm all for the convert function [1], so that you state explicitly that you are doing a conversion to the target type.
void f(dumb_ptr<int>);
int* p = new int(24);
f(p); // compile fails
f(convert(p)); // just works, without any mention of an explicit dumb_ptr<int> conversion
I would even prefer a variation of [2]:
f( explicit {p} );
where here explicit is considered as a replacement of the type parameter, in this case dumb_ptr<int>.
[1] N3521 - About convert() utility function http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3521.html
[2] N4074 - Let return {expr} Be Explicit, Revision 2 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4074.pdf
Currently we have std::less, not std::order and the STL ordered containers are using as default comparator std::less. So let define std::less<optional<T>> using std::less<T>.
Yes. We currently have that much - std::less<optional<T>> is built with std::less<T>, not op<(T,T). Even if only for the sake of pointers. On almost-non-existent hardware, (and maybe future hardware).
I'm missing the wording for the definition of std::less<optional<T>> in function of std::less<T>. Could you point me where this is described?
Grrrrrrrrrrrrr. I give up. It was there when I last argued for it. Not sure when it got removed. :-(
I have already signaled it to std-proposals ML. I hope this will be an editorial issue. Best, Vicente

On Thu, Nov 27, 2014 at 5:44 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
I'm not sure I understand the meaning of having an order that isn't implied by op<. If op< is omitted because comparison is meaningless or otherwise confusing, how could it be well-defined how to sort them? If there is not a single well-defined sort order, then the library should not just pick one arbitrarily, and if there is, then why would it not be appropriate as an op<?
...
I don't agree with any of that. If you want to use it as a key for something map/set-like, use unordered_map/set. I can think of multiple possible orderings of std::complex depending on application need, and I don't think any one of them is inherently "better" than any other. So this should be left to the application to define the sort order, in the rare case that this would actually be needed.
Maybe unordered_map has lessened the need for "representational ordering"[*], but it hasn't eliminated it. People like Sean Parent and Stepanov suggest specializing std::less with a representational ordering when a "meaningful" op< doesn't exist. There are still reasons to use std::map over unordered_map. Lack of a cryptographically safe hash is one of them. There are others (that I've forgotten, but I've asked the same question of committee members before, and there were a few reasons that sounded valid to me). Basically, people still want complex (for example) as a key in std::map. Maybe that is becoming rare enough that they can just pass in an explicit comparison when needed, but that's not always easy in generic code (more on that below).
[*] Like representational equality; see Stepanov's Elements of Programming.
optional<T> is Regular if T is Regular. The more that optional<T> "just works" like int works, the better. Particularly for generic code.
The first part sounds like a definition I am unfamiliar with.
Regular is a concept. See Stepanov's Elements of Programming. Or just think "int". ie assignable, copyable, copies are disjoint, etc. Stepanov includes ordering as part of Regular (IIRC).
I agree with the second part, but I don't think it's relevant to this discussion. Nobody is talking about removing op<(optional<T>,optional<T>),
Well, I am including it as a suggestion, retaining only std::less.
which is what generic code would use.
Generic code (like STL) uses std::less. And if T might be a pointer, you should too, technically.
1 - op<(optional<T>, optional<T>) is "sensible" but somewhat arbitrary 2 - op<(optional<T>, T) is questionable, often (usually?) wrong 3 - in general, func(optional<T>, optional<T>) should work if passed T's (ie implicit conversion is important) 4 - op<(optional<T>, optional<T>) *is* of the form func(optional<T>, optional<T>) - ie why should it be different?
All true.
2 and 3, to me, are the strongest, but directly opposed. How do you solve that? Either say
A1. "Well, operators are special, not typical functions", so we can poison them while keeping general functions implicit.
A2. 1 is on shaky ground, so let's remove both 1 and 2, but keep std::less/std::order, which was the main reason for having op<.
Of these, I prefer A1. I think it really is a special case. But I can understand why the present behaviour is as it is.
I suspect those who dislike "none" having a defined order would go for A2 though. I would be ok with that as well, but I don't have a problem with this definition.
But you're leaving out A3: remove 1+2 AND std::less/std::order. This means that optional types can't be used in map keys, only in unordered_map. I actually think that this is the best option, except that it has a greater potential to break existing code. Perhaps a predicate could be defined specific to optional giving the current ordering, so that users who do want that ordering or to continue using a map could just specify that predicate as a minimal fix. But I don't think that the predicate should be spelled "std::less".
I agree A3 is the other option, and I should have listed it. However, I think it is a non-starter. Even though we have unordered_map, the committee strongly felt that optional should have some form of std::less (either through op< or directly). Maybe that is just "old school thinking" because we are too used to using std::map instead of unordered_map, but, as I mentioned above, there were other reasons given, and, IIRC, the committee was pretty set on having std::less work somehow. Tony

Regular is a concept. See Stepanov's Elements of Programming. Or just think "int". ie assignable, copyable, copies are disjoint, etc. Stepanov includes ordering as part of Regular (IIRC).
P.S. One of the important parts of Regular: it is what the STL typically assumes. In Stepanov's world (almost) all types are Regular. (Not that he thinks everything is Regular, but that all the good things are :-) ie he dislikes and avoids anything that isn't. Non-Regular should be the rare exception; you should strive to make your class Regular.)

On 29/11/2014 08:01, Gottlob Frege wrote:
There are still reasons to use std::map over unordered_map. Lack of a cryptographically safe hash is one of them. There are others (that I've forgotten, but I've asked the same question to committee members before, and there were a few reasons that sounded valid to me.)
Why should the lack of a cryptographically safe hash matter when you are not doing cryptography? It doesn't really matter what hash is used in internal data structures. (Although obviously there are desirable properties such as having uniform spread to minimise bucket collisions and improve lookup speed.) In fact in most cases the best hash is the fastest one, not the most secure one.
Basically, people still want complex (for example) as a key in std::map. Maybe that is becoming rare enough that they can just pass in an explicit comparison when needed, but not always easy in generic code (more on that below).
I still question whether these people *really* want this, or whether it's just because unordered_map is new and scary (or longer to type). Obviously I suspect the latter.
optional<T> is Regular if T is Regular. The more that optional<T> "just works" like int works, the better. Particularly for generic code.
The first part sounds like a definition I am unfamiliar with.
Regular is a concept. See Stepanov's Elements of Programming. Or just think "int". ie assignable, copyable, copies are disjoint, etc. Stepanov includes ordering as part of Regular (IIRC).
Thank you for the clarification. I agree that this is a desirable property for "value-like" types, which includes optionals and variants. I don't agree that ordering should be included in that set, however (but equality should). Structures, for example, should ideally be assignable/copyable/equality-testable etc and be treated value-like in the same way. But most of the time defining an ordering for a structure is meaningless. The same is true for variant; I don't think it makes sense to compare a variant that holds type #4 with one that holds type #7 and somehow declare that the first is less based just on the types. Optional is more borderline. I don't really have a problem with comparing two optionals, or even the default "none is less than anything" relation. But when you start mixing comparisons with implicitly-promoted-non-optionals you increase the risk of unintended bugs (eg. opt < 5 is true but the intended comparison was opt && *opt < 5, which is false).

Mere moments ago, quoth I:
Optional is more borderline. I don't really have a problem with comparing two optionals, or even the default "none is less than anything" relation. But when you start mixing comparisons with implicitly-promoted-non-optionals you increase the risk of unintended bugs (eg. opt < 5 is true but the intended comparison was opt && *opt < 5, which is false).
I should probably clarify that I meant that if op< exists, I am most comfortable with "none" sorting below any other value rather than any of the alternatives. But I would be happier if "none" had no ordering and op< did not exist.

On Sun, Nov 30, 2014 at 6:20 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
On 29/11/2014 08:01, Gottlob Frege wrote:
There are still reasons to use std::map over unordered_map. Lack of a cryptographically safe hash is one of them. There are others (that I've forgotten, but I've asked the same question to committee members before, and there were a few reasons that sounded valid to me.)
Why should the lack of a cryptographically safe hash matter when you are not doing cryptography?
It doesn't really matter what hash is used in internal data structures. (Although obviously there are desirable properties such as having uniform spread to minimise bucket collisions and improve lookup speed.)
Denial of Service attack - I carefully force the input data such that the hashes collide and you get worst-case hash-table performance. This is a real attack. Python and a few other languages have already fixed their hashes; we have not. So that still doesn't apply to every app, but it applies to more than you might expect.
In fact in most cases the best hash is the fastest one, not the most secure one.
Basically, people still want complex (for example) as a key in std::map. Maybe that is becoming rare enough that they can just pass in an explicit comparison when needed, but not always easy in generic code (more on that below).
I still question whether these people *really* want this, or whether it's just because unordered_map is new and scary (or longer to type). Obviously I suspect the latter.
As I said, the hash was only one reason. There were others, IIRC. I do agree that unordered_map will become more popular in the future, so the issue is lessened. (Also, map should have been called ordered_map, and unordered_map called 'map'...) Tony

On Mon, Dec 1, 2014 at 7:45 PM, Gottlob Frege <gottlobfrege@gmail.com> wrote:
On Sun, Nov 30, 2014 at 6:20 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
On 29/11/2014 08:01, Gottlob Frege wrote:
There are still reasons to use std::map over unordered_map. Lack of a cryptographically safe hash is one of them. There are others (that I've forgotten, but I've asked the same question to committee members before, and there were a few reasons that sounded valid to me.)
Why should the lack of a cryptographically safe hash matter when you are not doing cryptography?
It doesn't really matter what hash is used in internal data structures. (Although obviously there are desirable properties such as having uniform spread to minimise bucket collisions and improve lookup speed.)
Denial of Service attack - I carefully force the input data such that the hashes collide and you get worst-case hash-table performance. This is a real attack. Python and a few other languages have already fixed their hashes; we have not.
True, but maps suffer from the same problem, don't they? Is a fix even available for maps? -- Olaf

On Mon, Dec 1, 2014 at 1:48 PM, Olaf van der Spek <ml@vdspek.org> wrote:
On Mon, Dec 1, 2014 at 7:45 PM, Gottlob Frege <gottlobfrege@gmail.com> wrote:
On Sun, Nov 30, 2014 at 6:20 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
On 29/11/2014 08:01, Gottlob Frege wrote:
There are still reasons to use std::map over unordered_map. Lack of a cryptographically safe hash is one of them. There are others (that I've forgotten, but I've asked the same question to committee members before, and there were a few reasons that sounded valid to me.)
Why should the lack of a cryptographically safe hash matter when you are not doing cryptography?
It doesn't really matter what hash is used in internal data structures. (Although obviously there are desirable properties such as having uniform spread to minimise bucket collisions and improve lookup speed.)
Denial of Service attack - I carefully force the input data such that the hashes collide and you get worst-case hash-table performance. This is a real attack. Python and a few other languages have already fixed their hashes; we have not.
True, but maps suffer from the same problem, don't they? Is a fix even available for maps?
I'm not an expert here; I thought map was "safe" due to its self-balancing. From Stack Overflow:

map, set, multimap, and multiset: Insertion O(log n), Lookup O(log n), Deletion O(log n)
hash_map, hash_set, hash_multimap, and hash_multiset: Insertion O(1) expected / O(n) worst case, Lookup O(1) expected / O(n) worst case, Deletion O(1) expected / O(n) worst case

i.e. map is still O(log n) even in the worst case.
-- Olaf

On 1 Dec 2014 at 13:56, Gottlob Frege wrote:
Denial of Service attack - I carefully force the input data such that the hashes collide and you get worst-case hash-table performance. This is a real attack. Python and a few other languages have already fixed their hashes; we have not.
True, but maps suffer from the same problem, don't they? Is a fix even available for maps?
I'm not an expert here; I thought map was "safe" due to its self-balancing.
map<> is usually a red-black tree; these suffer from exponential performance loss when written to by multiple threads due to enormous cache line invalidation caused by the rebalance. It is quite trivial to DDoS any network service stupid enough to write to a std::map from more than one thread. Just add any load at all. Note that mostly-read and very-rare-write multi-threaded users of a std::map are just fine, but a std::unordered_map will be that much better again in that scenario. It would be nice to have a bitwise_trie STL container; it would provide almost all the benefits of std::map but be enormously superior in every facet apart from closest-fit finds. In particular, bitwise tries let you do constant-time "close fit" finds, whereas a red-black tree goes all the way and forces closest-fit finds when for some use cases that is overkill. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

On 1 Dec 2014 at 13:45, Gottlob Frege wrote:
Why should the lack of a cryptographically safe hash matter when you are not doing cryptography?
It doesn't really matter what hash is used in internal data structures. (Although obviously there are desirable properties such as having uniform spread to minimise bucket collisions and improve lookup speed.)
Denial of Service attack - I carefully force the input data such that the hashes collide and you get worst-case hash-table performance. This is a real attack. Python and a few other languages have already fixed their hashes; we have not.
Actually we need a new std::safe_hash<> I think, one explicitly prohibited from being a trivial hash. I'd personally like to see that become the default hash for unordered_map et al, and let the programmer choose std::hash where safe. BTW I think MSVC/Dinkumware *has* fixed their std::hash<>; theirs is several orders of magnitude more complex than the trivial hash libstdc++ uses. Indeed this leads to MSVC's apparent poor unordered_map performance when compared to GCC (about a 12-40% performance loss). Stephan can probably confirm or refute this claim of MSVC/Dinkumware having a safe hash though. It may just be that they use a well-distributed hash instead of a trivial one, not a cryptographically safe hash.
As I said,the hash was only one reason. There were others, IIRC. I do agree that unordered_map will become more popular in the future, so that the issue is lessened. (Also, map should have been called ordered_map, and unordered_map called 'map'...)
FYI my proposed concurrent_unordered_map is particularly collision resistant, and therefore much less prone to DDoS attacks. I use a linear table scan of 16 byte items to resolve collision, so it takes quite a while for that to become a serious performance issue. Note I didn't choose this design for DDoS attacks, rather due to the proposed design having an extremely costly rehash(), so much so that you'd put up with a lot of collision before expending the enormous cost of rehashing. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

[Niall Douglas]
BTW I think MSVC/Dinkumware *has* fixed their std::hash<>; theirs is several orders of magnitude more complex than the trivial hash libstdc++ uses. Indeed this leads to MSVC's apparent poor unordered_map performance when compared to GCC (about a 12-40% performance loss). Stephan can probably confirm or refute this claim of MSVC/Dinkumware having a safe hash though. It may just be that they use a well-distributed hash instead of a trivial one, not a cryptographically safe hash.
VC currently uses FNV-1a for everything. This is good at jumbling up bits, and pretty fast (although not as fast as returning an integer unchanged), but it is absolutely not cryptographically secure, nor does it attempt to resist collisions triggered by an intelligent adversary. We reserve the right to change our hash implementation in the future. STL

On 2 Dec 2014 at 20:27, Stephan T. Lavavej wrote:
[Niall Douglas]
BTW I think MSVC/Dinkumware *has* fixed their std::hash<>; theirs is several orders of magnitude more complex than the trivial hash libstdc++ uses. Indeed this leads to MSVC's apparent poor unordered_map performance when compared to GCC (about a 12-40% performance loss). Stephan can probably confirm or refute this claim of MSVC/Dinkumware having a safe hash though. It may just be that they use a well-distributed hash instead of a trivial one, not a cryptographically safe hash.
VC currently uses FNV-1a for everything. This is good at jumbling up bits, and pretty fast (although not as fast as returning an integer unchanged), but it is absolutely not cryptographically secure, nor does it attempt to resist collisions triggered by an intelligent adversary.
We reserve the right to change our hash implementation in the future.
Thanks for the confirm, Stephan. Out of curiosity I went and looked up what Python did to solve this (here's the discussion: http://bugs.python.org/issue13703). In the end, they simply XOR salted the hash with a cryptographically generated random number produced at process launch (source: https://docs.python.org/3/reference/datamodel.html#object.__hash__) and carried on with their previous algorithm. I think std::hash could be similarly adjusted with ease; the hard part is generating a cryptographically secure random number. I don't believe C++11 even has such a secure generator in <random> ... nevertheless, the C++1z standard might consider adding a static function to the STL which sets the std::hash salt value to something. It might be an idea to only permit that function to be called exactly once. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

[Niall Douglas]
the hard part is generating a cryptographically secure random number. I don't believe C++11 even has such a secure generator in <random> ...
The Standard indirectly suggests that random_device should be such a generator, and VC guarantees this. STL

On 2 Dec 2014 at 21:05, Stephan T. Lavavej wrote:
[Niall Douglas]
the hard part is generating a cryptographically secure random number. I don't believe C++11 even has such a secure generator in <random> ...
The Standard indirectly suggests that random_device should be such a generator, and VC guarantees this.
Out of curiosity, does VC thunk through to the Windows Crypto API, and therefore require that the environment variable block contains the right COM initialisation magic (I ran into this "feature" recently, it was confounding)? I am curious because libstdc++ apparently uses plain old RDRAND on x86 CPUs with support for that opcode. That surprised me. I would have thought that /dev/random was plenty, and moreover, considerably safer as kernel random number generators will get hot patched quickly. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

[Niall Douglas]
Out of curiosity, does VC thunk through to the Windows Crypto API, and therefore require that the environment variable block contains the right COM initialisation magic (I ran into this "feature" recently, it was confounding)?
VC's random_device calls the CRT's rand_s which calls Windows' RtlGenRandom. There's no COM interaction, to my knowledge. (rand_s has a different codepath for Windows Store apps). STL

On 3 Dec 2014 at 18:03, Stephan T. Lavavej wrote:
Out of curiosity, does VC thunk through to the Windows Crypto API, and therefore require that the environment variable block contains the right COM initialisation magic (I ran into this "feature" recently, it was confounding)?
VC's random_device calls the CRT's rand_s which calls Windows' RtlGenRandom. There's no COM interaction, to my knowledge. (rand_s has a different codepath for Windows Store apps).
That is very useful to know, and actually solves a problem I have forthcoming at work. Thanks Stephan. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

On Dec 2, 2014, at 3:44 PM, Niall Douglas <s_sourceforge@nedprod.com> wrote:
Out of curiosity I went and looked up what Python did to solve this (here's the discussion: http://bugs.python.org/issue13703). In the end, they simply XOR salted the hash with a cryptographically generated random number produced at process launch (source: https://docs.python.org/3/reference/datamodel.html#object.__hash__) and carried on with their previous algorithm.
Here is a very short python script: https://131002.net/siphash/poc.py which appears to recover those random salts for the process. I tried it out on 2.7.5 and it appears to work really well. This comes from the SipHash website, which also contains short programs for compromising randomly seeded MurmurHash and CityHash. Howard

On 2 Dec 2014 at 17:40, Howard Hinnant wrote:
Out of curiosity I went and looked up what Python did to solve this (here's the discussion: http://bugs.python.org/issue13703). In the end, they simply XOR salted the hash with a cryptographically generated random number produced at process launch (source: https://docs.python.org/3/reference/datamodel.html#object.__hash__) and carried on with their previous algorithm.
Here is a very short python script:
https://131002.net/siphash/poc.py
which appears to recover those random salts for the process. I tried it out on 2.7.5 and it appears to work really well.
This is as expected, as Pythons before 3.3 require a special environment variable to be set to turn on the secure hash. Apparently many Python programs assume the order of the standard hash algorithm, so they couldn't just turn it on without years of warning about the change. I just tried your script on Python 3.4 and got 0 candidate solutions, whereas 2.7 does yield solutions. I don't claim their chosen solution is foolproof, but Python probably sees a lot more untrusted input than C++ does. If there were a gaping security hole there, we would surely have heard about it. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

On Dec 2, 2014, at 7:31 PM, Niall Douglas <s_sourceforge@nedprod.com> wrote:
On 2 Dec 2014 at 17:40, Howard Hinnant wrote:
Out of curiosity I went and looked up what Python did to solve this (here's the discussion: http://bugs.python.org/issue13703). In the end, they simply XOR salted the hash with a cryptographically generated random number produced at process launch (source: https://docs.python.org/3/reference/datamodel.html#object.__hash__) and carried on with their previous algorithm.
Here is a very short python script:
https://131002.net/siphash/poc.py
which appears to recover those random salts for the process. I tried it out on 2.7.5 and it appears to work really well.
This is as expected, as Pythons before 3.3 require a special environment variable to be set to turn on the secure hash. Apparently many Python programs assume the order of the standard hash algorithm, so they couldn't just turn it on without years of warning about the change.
I just tried your script on Python 3.4 and got 0 candidate solutions, whereas 2.7 does yield solutions. I don't claim their chosen solution is foolproof, but Python probably sees a lot more untrusted input than C++ does. If there were a gaping security hole there, we would surely have heard about it.
More info on why this attack failed on Python 3.4:

http://lwn.net/Articles/574761/
https://www.python.org/dev/peps/pep-0456/

These articles claim that Python 3.4 and forward have simply adopted SipHash. For anyone interested, there is a hash_append-compatible implementation of the SipHash24 variant here, in the files siphash.h/siphash.cpp:

https://github.com/HowardHinnant/hash_append

You can experiment with this code without getting into hash_append by simply doing:

#include <iostream>
#include <random>
#include <string>
#include "siphash.h"

int main()
{
    // initialize random seeding
    std::random_device rd;
    std::mt19937_64 eng{rd()};
    // initialize siphash object
    acme::siphash hash(eng(), eng());
    // get something to hash
    std::string s("some data");
    // hash it
    hash(s.data(), s.size());
    auto key = static_cast<std::size_t>(hash);
    // output the hash
    std::cout << key << '\n';
}

Howard

On 3 Dec 2014 at 11:24, Howard Hinnant wrote:
I just tried your script on Python 3.4 and got 0 candidate solutions, whereas 2.7 does yield solutions. I don't claim their chosen solution is foolproof, but Python probably sees a lot more untrusted input than C++ does. If there were a gaping security hole there, we would surely have heard about it.
More info on why this attack failed on Python 3.4:
http://lwn.net/Articles/574761/ https://www.python.org/dev/peps/pep-0456/
These articles claim that Python 3.4 and forward have simply adopted SipHash.
Yes, you are correct. See http://lwn.net/Articles/574761/. Python's adoption of SipHash as their core hash function wins it for me, at least on 64-bit architectures. Apparently it matches FNV-1 for longer runs of bytes too; only on short strings does it cost. Impressive.
For anyone interested, there is a hash_append-compatible implementation of the SipHash24 variant here, in the files siphash.h/siphash.cpp:
I think this sufficiently important to warrant a separate N-paper proposing a std::secure_hash<T> based on SipHash. Howard, you up for that? Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

On Dec 4, 2014, at 8:50 AM, Niall Douglas <s_sourceforge@nedprod.com> wrote:
I think this sufficiently important to warrant a separate N-paper proposing a std::secure_hash<T> based on SipHash. Howard, you up for that?
Thanks, but no, I don’t think I can do that. SipHash isn’t the ultimate answer in hash security. It is the front line in a never ending war. SipHash arrived on the scene two years ago. I didn’t learn of it until a year ago. Many people are just learning of it now. We can’t possibly standardize it for another 3 years. The standardization process is way too slow for this kind of thing. In order to be effective, the standard needs to enable the programmer to choose the latest available weaponry. Because by the time we standardize algorithm X as the algorithm of choice in std::secure_hash<T>, odds are going to be good that it has already been attacked, and made obsolete by a better algorithm. The most I could back is offering SipHash as one of the customizable HashAlgorithms that could be used in hash_append. This is worlds different than std::secure_hash<T>, because when you author hash support for MyClass, std::secure_hash<MyClass> locks you in to a std-supplied algorithm, be it SipHash, or something else. That is bad. But when you instead write your hash support in terms of a hash_append which is templated on HashAlgorithm, then you can use SipHash, or any other hashing algorithm, without ever having to revisit MyClass’s hash support code again. Subsequently this means that when your unordered_set<MyClass> is attacked, you can do something about it (overnight!) without waiting years for a committee to come to your rescue. Howard

On 4 Dec 2014 at 10:35, Howard Hinnant wrote:
I think this sufficiently important to warrant a separate N-paper proposing a std::secure_hash<T> based on SipHash. Howard, you up for that?
Thanks, but no, I don't think I can do that.
SipHash isn't the ultimate answer in hash security. It is the front line in a never ending war. SipHash arrived on the scene two years ago. I didn't learn of it until a year ago. Many people are just learning of it now. We can't possibly standardize it for another 3 years.
The standardization process is way too slow for this kind of thing. In order to be effective, the standard needs to enable the programmer to choose the latest available weaponry. Because by the time we standardize algorithm X as the algorithm of choice in std::secure_hash<T>, odds are going to be good that it has already been attacked, and made obsolete by a better algorithm.
The most I could back is offering SipHash as one of the customizable HashAlgorithms that could be used in hash_append.
This is worlds different than std::secure_hash<T>, because when you author hash support for MyClass, std::secure_hash<MyClass> locks you in to a std-supplied algorithm, be it SipHash, or something else. That is bad. But when you instead write your hash support in terms of a hash_append which is templated on HashAlgorithm, then you can use SipHash, or any other hashing algorithm, without ever having to revisit MyClass's hash support code again.
Subsequently this means that when your unordered_set<MyClass> is attacked, you can do something about it (overnight!) without waiting years for a committee to come to your rescue.
You seem to be assuming that a std::secure_hash would be implemented inline. It need not be so, it could be implemented via an extern implementation in a shared library, and therefore an OS security update could push replacements. I'd assume that wouldn't fly as code depending on the ordering of an unordered map would break, so the committee would say no. So that brings us back to your alternative. Howard are there good reasons why your reference library for N3980 hasn't been submitted to Boost for review and inclusion? With something like my forthcoming BindLib a single code base can be dual use, so both compilable as a Boost module and as a totally independent standalone library. I'd also think that N3645 would swing more weight if already implemented and used within Boost. Long road to implementing that though. And we could do with better map STL container design anyway. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

On Dec 5, 2014, at 8:16 AM, Niall Douglas <s_sourceforge@nedprod.com> wrote:
Howard are there good reasons why your reference library for N3980 hasn't been submitted to Boost for review and inclusion? With something like my forthcoming BindLib a single code base can be dual use, so both compilable as a Boost module and as a totally independent standalone library.
Good question. Mainly lack of time on my part in packaging it. I’ll give that issue some more thought. Howard

On Dec 2, 2014, at 3:11 PM, Niall Douglas <s_sourceforge@nedprod.com> wrote:
Actually we need a new std::safe_hash<> I think, one explicitly prohibited from being a trivial hash. I'd personally like to see that become the default hash for unordered_map et al, and let the programmer choose std::hash where safe.
I’d like to see us enable the programmer to *easily* select hash algorithms we haven’t even heard of yet. E.g. perhaps they are being invented tomorrow. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3980.html https://www.youtube.com/watch?v=Njjp_MJsgt8&feature=youtu.be Howard

On 2 Dec 2014 at 15:38, Howard Hinnant wrote:
Actually we need a new std::safe_hash<> I think, one explicitly prohibited from being a trivial hash. I'd personally like to see that become the default hash for unordered_map et al, and let the programmer choose std::hash where safe.
I'd like to see us enable the programmer to *easily* select hash algorithms we haven't even heard of yet. E.g. perhaps they are being invented tomorrow.
I didn't raise N3980 because I personally think it's orthogonal to the use cases for std::hash (and, for reference, I would be a *huge* supporter of N3980). I'd really like it if it were possible to declare that for all unordered_map<> used in a namespace X the default hash used is some type Y. One could evilly achieve this using a local namespace bind like this:

namespace X {
    template<class Key, class T, class Hash=Y<Key>, ...>
    using unordered_map = std::unordered_map<Key, T, Hash, ...>;
}

... but I'd personally prefer it if every user of unordered_map were *required* to choose their hash via a binding like the above, and the std implementation didn't default the hash. I guess backwards compatibility will probably prevent that being possible sadly, unless we deprecate unordered_map for something better in the STL of course. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

On 2/12/2014 07:45, Gottlob Frege wrote:
On Sun, Nov 30, 2014 at 6:20 PM, Gavin Lambert <gavinl@compacsort.com> wrote:
On 29/11/2014 08:01, Gottlob Frege wrote:
There are still reasons to use std::map over unordered_map. Lack of a cryptographically safe hash is one of them. There are others (that I've forgotten, but I've asked the same question to committee members before, and there were a few reasons that sounded valid to me.)
Why should the lack of a cryptographically safe hash matter when you are not doing cryptography?
It doesn't really matter what hash is used in internal data structures. (Although obviously there are desirable properties such as having uniform spread to minimise bucket collisions and improve lookup speed.)
Denial of Service attack - I carefully force the input data such that the hashes collide and you get worst-case hash-table performance. This is a real attack. Python and a few other languages have already fixed their hashes; we have not.
So that still doesn't apply to every app, but it applies to more than you might expect.
Perhaps I'm just being naive, but I would think that there'd be more non-internet-facing apps than internet-facing ones (or at least the probability of any given application facing attack is fairly slim, apart from the really popularly installed ones), so most applications would prefer high-performance defaults over "secure" ones. Though I fully agree that it'd be nice to have a secure hash as an option, I don't see why e.g. std::hash<size_t> (or any smaller type) doesn't just return the bitwise equivalent of the original value by default, as this has zero collision probability and the highest performance.

[Gavin Lambert]
I don't see why e.g. std::hash<size_t> (or any smaller type) doesn't just return the bitwise equivalent of the original value by default, as this has zero collision probability and the highest performance.
Hashes are typically masked to get bucket indices, so hashing integers with the identity function allows collisions to be trivially generated, either intentionally or unintentionally. For example, 0x1000, 0x2000, 0x3000, 0x4000 are highly likely to collide, if the lower bits are being used to index buckets. STL

On 2 Dec 2014 at 23:26, Stephan T. Lavavej wrote:
I don't see why e.g. std::hash<size_t> (or any smaller type) doesn't just return the bitwise equivalent of the original value by default, as this has zero collision probability and the highest performance.
Hashes are typically masked to get bucket indices, so hashing integers with the identity function allows collisions to be trivially generated, either intentionally or unintentionally. For example, 0x1000, 0x2000, 0x3000, 0x4000 are highly likely to collide, if the lower bits are being used to index buckets.
There are other big problems with trivial hashes too, in that certain unfortunately common choices of bucket size happen to generate resonances with common trivial hashes. I was tempted in concurrent_unordered_map to detect trivial hashes and deliberately subvert known terrible combinations of bucket size and trivial hash by twiddling the bucket count away from what was requested, but then I remembered I am not a statistician. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
participants (14)
- Andrzej Krzemienski
- David Stone
- Felix Uhl
- Gavin Lambert
- Gottlob Frege
- Howard Hinnant
- Marcel Raad
- Matt Calabrese
- Niall Douglas
- Olaf van der Spek
- Rob Stewart
- Stephan T. Lavavej
- Vicente J. Botet Escriba
- Vladimir Batov