[system] Would it be possible to trial a breaking change to Boost.System and see what happens?
Niall Douglas wrote:
3. Make the default constructor constexpr, now possible.
See https://cplusplus.github.io/LWG/issue2992. Although the solution you propose is worth looking into in either case.
4. error_code::message() returns a std::string_view
How do you envisage this being implemented? A string_view into what?
On 01/12/18 20:43, Niall Douglas via Boost wrote:
SG14, the low latency study group, has been looking into how to improve some design decisions in `<system_error>`. The relevant WG21 paper is https://wg21.link/P0824 and recent discussion can be found at https://groups.google.com/a/isocpp.org/forum/#!topic/sg14/j7tQybEjP5s.
What we'd like to do is to test whether some very mildly breaking changes to `<system_error>` are so mild as to affect no users, and on that basis propose that WG21 make those mildly breaking changes to `<system_error>` in the next C++ standard. And to test this, we'd like to modify Boost.System with those mildly breaking changes, ship a release and see how many users complain. If it is zero, we have evidence for WG21 that this is not a consequential change. What we'd like to change is this:
1. Stop treating code zero as always meaning success irrespective of category. This remedies a defect where some custom error code domains cannot be represented by error_code due to using value 0 for an error. In actual code, I've never seen anyone ever use comparison to anything but a default constructed error code, so I think this will be safe.
In my code I'm using contextual conversion to bool extensively, e.g. "if (err)" or "if (!err)". This is currently equivalent to comparing against 0, so I assume it will be broken by this change. So, you can count me right now as the one (loudly) complaining, among many others, I think.
2. Success becomes solely the default constructed error code, which is code zero and a new category of "null_category". This is internally represented by { 0, nullptr }, and thus makes the default constructor trivial which is highly desirable as it eliminates any compiler magic static fencing for the default constructor. Right now the default constructor uses system_category with value 0, and I suspect no code will notice this change either.
This breaks "category()" as it returns a reference now. Also, "message()" has to check for a null category now.
3. Make the default constructor constexpr, now possible. No code should notice a difference, except less runtime code will be generated.
Not sure there will be any difference in the generated code, since the constructor still has to initialize the int and the pointer. Optimizers are already able to propagate constants even if the code is not constexpr. The real benefit, it seems, is the ability to use error_code in constant expressions. I'm not sure how useful this is, I've never had a case where I needed this.
4. error_code::message() returns a std::string_view instead of std::string if running C++ 17 or better. This lets us remove the `<string>` dependency, and thus stop dragging in half the STL with `<system_error>`, which in turn makes `<system_error>` actually useful to the embedded systems crowd. The guess here is that std::string_view has such excellent interoperation with std::string that I think 99% of code will compile and work perfectly without anybody noticing a thing.
This will only work if "error_category::message()" returns a string from static storage. It will not allow relying on strerror_r or similar APIs, for example. This will make migration problematic if users define their own error categories that rely on external APIs like that.
I appreciate that this would undo Peter Dimov's excellent work at having Boost.System alias `<system_error>` under C++ 11 and later. I also appreciate that we are deliberately going to potentially break end user code. But my personal guess is that in practice, the breakage will not be noticed by 99% of code out there right now. It'll just compile against the improved Boost.System and everything will work as before. Boost was originally intended as an incubator for C++ standard changes. This ticks that box perfectly.
Thoughts on feasibility?
Even besides the fact that my code is likely to be broken by this, I think this is a really bad idea. Sure, Boost was conceived as the playground for features that are potentially later included in the standard, but it is also a major C++ library that is widely used in industry. Breaking users' code willy-nilly does no good either to Boost or to its users. So no, no breaking changes just to see how bad it smells afterwards, please.
Andrey Semashev wrote:
In my code I'm using contextual conversion to bool extensively, e.g. "if (err)" or "if (!err)". This is currently equivalent to comparing against 0, so I assume it will be broken by this change.
It will only be broken if you rely on `error_code( 0, some-category )` to contextually-convert to `false`. A default-constructed error_code, or one that's been cleared, will still be `false`. A principled approach would be to ask the category whether a code is `true` or `false`, but that might be complexity and overhead nobody needs.
On 01/12/18 21:32, Peter Dimov via Boost wrote:
Andrey Semashev wrote:
In my code I'm using contextual conversion to bool extensively, e.g. "if (err)" or "if (!err)". This is currently equivalent to comparing against 0, so I assume it will be broken by this change.
It will only be broken if you rely on `error_code( 0, some-category )` to contextually-convert to `false`.
I'm not sure how I can *not* rely on that since I construct `error_code` from enums or ints, all of them having 0 as the "success" value precisely to allow the contextual conversion produce the intended result.
A principled approach would be to ask the category whether a code is `true` or `false`, but that might be complexity and overhead nobody needs.
True.
Andrey Semashev wrote:
It will only be broken if you rely on `error_code( 0, some-category )` to contextually-convert to `false`.
I'm not sure how I can *not* rely on that since I construct `error_code` from enums or ints, all of them having 0 as the "success" value precisely to allow the contextual conversion produce the intended result.
That's exactly what Niall wanted to know, I suppose; would it be possible for you to give specific examples from your code that would be broken?
On 01/12/18 21:46, Peter Dimov via Boost wrote:
Andrey Semashev wrote:
It will only be broken if you rely on `error_code( 0, some-category )` to contextually-convert to `false`.
I'm not sure how I can *not* rely on that since I construct `error_code` from enums or ints, all of them having 0 as the "success" value precisely to allow the contextual conversion produce the intended result.
That's exactly what Niall wanted to know, I suppose; would it be possible for you to give specific examples from your code that would be broken?
I suppose, code like this:

    enum class my_errors { success = 0 };

    void foo(error_code& err)
    {
        // Do stuff...
        err = make_error_code(my_errors::success); // error_code(0, my_category)
    }

    void bar()
    {
        error_code err;
        foo(err);
        if (err)
            complain();
    }

I suppose you could argue that I could avoid `err` initialization in `foo`, but that is a matter of taste and whether you want to rely on the initial state of `err` on entry into `foo`. I'm also not happy to have to convert that initialization to

    err = error_code();

because it loses information about the error category, which may be useful if I want to print `err` to log even if it is a success.
Andrey Semashev wrote:
    void foo(error_code& err)
    {
        // Do stuff...
        err = make_error_code(my_errors::success); // error_code(0, my_category)
    }

... I suppose you could argue that I could avoid `err` initialization in `foo`, but that is a matter of taste and whether you want to rely on the initial state of `err` on entry into `foo`.
No, I'm not going to argue that; it's idiomatic to clear `err` on entry:

    void foo( error_code& err )
    {
        err.clear();
        // do stuff
    }

with the alternative

    err.assign( 0, err.category() );

being used nowadays to avoid the overhead in .clear caused by the "magic static" Niall talks about. This latter "idiom" is obviously broken after Niall's suggested change though.
I'm also not happy to have to convert that initialization to
err = error_code();
because it loses information about the error category, which may be useful if I want to print `err` to log even if it is a success.
Interesting. You rely on there being different success error_codes? This is also in direct opposition to the requested change. How would you suggest we solve the problem of zero being an error in some contexts?
On 01/12/18 22:03, Peter Dimov via Boost wrote:
Andrey Semashev wrote:
    void foo(error_code& err)
    {
        // Do stuff...
        err = make_error_code(my_errors::success); // error_code(0, my_category)
    }

... I suppose you could argue that I could avoid `err` initialization in `foo`, but that is a matter of taste and whether you want to rely on the initial state of `err` on entry into `foo`.
No, I'm not going to argue that; it's idiomatic to clear `err` on entry:
    void foo( error_code& err )
    {
        err.clear();
        // do stuff
    }
with the alternative
err.assign( 0, err.category() );
being used nowadays to avoid the overhead in .clear caused by the "magic static" Niall talks about.
This latter "idiom" is obviously broken after Niall's suggested change though.
I don't think this matches my intent. The `foo` function belongs to the domain that defines error codes described by the `my_errors` enum and it is supposed to return one of those codes. Preserving whatever error category was set in `err` on entry does not achieve that. What I want is that `err` has `my_category` unconditionally on return from `foo`.
I'm also not happy to have to convert that initialization to
err = error_code();
because it loses information about the error category, which may be useful if I want to print `err` to log even if it is a success.
Interesting. You rely on there being different success error_codes?
Yes, mostly for diagnostic purposes. Although it may also be a useful feature to check the domain when comparing two error codes.
How would you suggest we solve the problem of zero being an error in some contexts?
Besides moving the check for the error code into the category? I suppose `make_error_code` could perform some kind of mapping onto another enum that has a success value of 0. It can be as simple as this:

    enum foreign_errors { error = 0, success = 1 };
    enum class internal_errors { error = -1, success = 0 };

    error_code make_error_code(foreign_errors ec)
    {
        return error_code(ec - 1, my_category);
    }

    error_code make_error_code(internal_errors ec)
    {
        return error_code(static_cast<int>(ec), my_category);
    }

Another solution would be to make the error category have a public member, cheap to access, that returns the "success" value:

    // std::error_category
    class error_category
    {
        int success_value;

    public:
        explicit error_category(int sv = 0) : success_value(sv) {}
        int get_success_value() const noexcept { return success_value; }
    };

    class my_category final : public error_category
    {
    public:
        my_category() : error_category(1) {}
    };

    error_code::operator bool() const
    {
        return value() != category().get_success_value();
    }
Andrey Semashev wrote:
I don't think this matches my intent. The `foo` function belongs to the domain that defines error codes described by the `my_errors` enum and it is supposed to return one of those codes. Preserving whatever error category was set in `err` on entry does not achieve that. What I want is that `err` has `my_category` unconditionally on return from `foo`.
In the general case, foo can call bar, which can also fail and return whatever error_code it likes. This is analogous to exceptions; mandating that functions from a specific domain only ever throw exceptions from that same domain quickly becomes burdensome.
On 01/12/18 23:38, Peter Dimov via Boost wrote:
Andrey Semashev wrote:
I don't think this matches my intent. The `foo` function belongs to the domain that defines error codes described by the `my_errors` enum and it is supposed to return one of those codes. Preserving whatever error category was set in `err` on entry does not achieve that. What I want is that `err` has `my_category` unconditionally on return from `foo`.
In the general case, foo can call bar, which can also fail and return whatever error_code it likes. This is analogous to exceptions; mandating that functions from a specific domain only ever throw exceptions from that same domain quickly becomes burdensome.
I wouldn't say so. If I'm writing a library I would make it throw exceptions that belong to it. If it uses e.g. Boost.Filesystem then in the parts where it is used Boost.Filesystem can throw its own exceptions, which may propagate to the user. But in no case would I be throwing Boost.Filesystem exceptions myself from my library. It's the same with error codes. The only difference is that we have to have a success error code as well.
This latter "idiom" is obviously broken after NIall's suggested change though.
Maybe. It depends. If your category defines 0 as success, then `if(err)` will return false, as it should. If your code does:

    err.assign(0, err.category());
    if (err != error_code())
        ...

and if your category is system_category, then yes, now you would be broken, because a default constructed error code would have a null category under my proposal, not system_category as at present. One could dispense with null_category and use system_category, but with an internal nullptr representation. My objection to that is that C++ 11 assumes that on every possible OS out there, code 0 is always success. I find that presumptuous, and I think it should be fixed.
I'm also not happy to have to convert that initialization to
err = error_code();
because it loses information about the error category, which may be useful if I want to print `err` to log even if it is a success.
Interesting. You rely on there being different success error_codes? This is also in direct opposition to the requested change.
No, every code domain must be allowed to dictate what means success to it.
For example, NTSTATUS code have two degrees of success, and two degrees
of failure. The top two bits determine this, leaving 30 bits for values
of each degree. An ntkernel_category would therefore define all codes
with a value >= 0 as being success.
Note that right now, under the present design, we cannot represent this in `<system_error>`.
How would you suggest we solve the problem of zero being an error in some contexts?
I think the same way as with all non-negative values being a success.

Niall
--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
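For reference, the NTSTATUS layout described above reduces to a sign test; the helper below is purely illustrative, not part of any proposed API:

```cpp
#include <cstdint>

// NTSTATUS packs a 2-bit severity into the top bits:
// 00 = success, 01 = informational, 10 = warning, 11 = error.
// Severities 00 and 01 leave the sign bit clear, so in this scheme
// "success" is simply any non-negative value.
constexpr bool ntstatus_is_success(std::int32_t status)
{
    return status >= 0;
}
```

So an informational status such as 0x40000000 tests as success, while an error status such as 0xC0000005 (STATUS_ACCESS_VIOLATION) tests as failure.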
In my code I'm using contextual conversion to bool extensively, e.g. "if (err)" or "if (!err)". This is currently equivalent to comparing against 0, so I assume it will be broken by this change.
It will only be broken if you rely on `error_code( 0, some-category )` to contextually-convert to `false`. A default-constructed error_code, or one that's been cleared, will still be `false`.
A principled approach would be to ask the category whether a code is `true` or `false`, but that might be complexity and overhead nobody needs.
I think it unavoidable myself (and hardly complex to implement). Only an error category knows if some value or other means success or not. There would be not much overhead if the category implementation marks the function final, as I expect it would.
Niall Douglas wrote:
A principled approach would be to ask the category whether a code is `true` or `false`, but that might be complexity and overhead nobody needs.
I think it unavoidable myself (and hardly complex to implement). Only an error category knows if some value or other means success or not.
There would be not much overhead if the category implementation marks the function final, as I expect it would.
I'm not sure it would matter much in the general case whether the category implementation has marked the function final or not; if the compiler doesn't know the dynamic type of error_code::category(), it would need to issue the virtual call.
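Peter's point can be illustrated with a sketch; `failed()` here is a hypothetical hook, not part of std::error_category today:

```cpp
#include <string>
#include <system_error>

// Hypothetical extension: the category decides which values mean failure.
class category_with_failed : public std::error_category {
public:
    const char* name() const noexcept override { return "demo"; }
    std::string message(int) const override { return "demo message"; }
    virtual bool failed(int ev) const noexcept { return ev != 0; }
};

// Marking the class and override final lets a compiler devirtualise calls
// made through a `my_category&`, but error_code::category() only hands back
// a base `std::error_category&`, so that call still goes through the vtable.
class my_category final : public category_with_failed {
public:
    bool failed(int ev) const noexcept final { return ev < 0; }
};
```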
1. Stop treating code zero as always meaning success irrespective of category. This remedies a defect where some custom error code domains cannot be represented by error_code due to using value 0 for an error. In actual code, I've never seen anyone ever use comparison to anything but a default constructed error code, so I think this will be safe.
In my code I'm using contextual conversion to bool extensively, e.g. "if (err)" or "if (!err)". This is currently equivalent to comparing against 0, so I assume it will be broken by this change. So, you can count me right now as the one (loudly) complaining, among many others, I think.
Sorry, I forgot to mention that the category would be asked whether the error code's current value is success or failure (it's Friday, and my current contract is many hours' drive from home. Sorry.) This is unavoidable because system_category (on POSIX and Win32) and generic_category both define value = 0 as success. In some other error code domain, -1 might mean success. Or 9999. Up to the domain. No requirement that 0 be set aside as special, as currently.
2. Success becomes solely the default constructed error code, which is code zero and a new category of "null_category". This is internally represented by { 0, nullptr }, and thus makes the default constructor trivial which is highly desirable as it eliminates any compiler magic static fencing for the default constructor. Right now the default constructor uses system_category with value 0, and I suspect no code will notice this change either.
This breaks "category()" as it returns a reference now.
It would still return a reference, just as now. You would check for internal nullptr, if so return a static null_category.
Also, "message()" has to check for a null category now.
I don't think that's much cost relative to constructing a std::string.
3. Make the default constructor constexpr, now possible. No code should notice a difference, except less runtime code will be generated.
Not sure there will be any difference in the generated code, since the constructor still has to initialize the int and the pointer. Optimizers are already able to propagate constants even if the code is not constexpr.
But it cannot elide the potential call to an extern function, the function which retrieves the error_category&. And therefore must emit code, just in case the system_category might be fetched. The present design of error_code generates code bloat when used by Outcome or Expected. Changing it as I describe stops forcing the compiler to emit code, and thus allows the compiler to fold more code during optimisation, thus emitting less assembler. Tighter, less bloaty code is the result.
The real benefit, it seems, is the ability to use error_code in constant expressions. I'm not sure how useful this is, I've never had a case where I needed this.
This is not a motivator for me at least. My issue is the side effects on code bloat. It annoys me when the fix is within easy reach.
4. error_code::message() returns a std::string_view instead of std::string if running C++ 17 or better. This lets us remove the `<string>` dependency, and thus stop dragging in half the STL with `<system_error>`, which in turn makes `<system_error>` actually useful to the embedded systems crowd. The guess here is that std::string_view has such excellent interoperation with std::string that I think 99% of code will compile and work perfectly without anybody noticing a thing.

This will only work if "error_category::message()" returns a string from static storage. It will not allow relying on strerror_r or similar APIs, for example. This will make migration problematic if users define their own error categories that rely on external APIs like that.
You are correct that change no. 4 is by far the most consequential proposed. I do not believe it affects end user code except for those who inherit from error_code and override message() - surely a very, very tiny number of people - but it sure as hell breaks everybody who implements a custom error code category if you change the category's message() signature.

The number of people worldwide currently implementing their own error categories is likely low, and I personally feel that their mild inconvenience is worth losing <string> from the header dependencies. But if backwards compatibility were felt to be super important, in error_code::message() you could try a dynamic_cast to an error_category2 first in order to utilise an error_category2::message_view() function. If the dynamic_cast fails, you statically cache the strings returned by error_category::message() and return string_views of them.

One could also, of course, simply drop modification no 4 entirely. Modifications 1 to 3 are likely safe for almost all existing C++. Only modification 4 breaks anything.
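The suggested fallback could look roughly like this; `error_category2` and `message_view()` are hypothetical names taken from the mail, not a real API, and the caching here is deliberately naive:

```cpp
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical new-style category interface offering a view-returning message.
class error_category2 : public std::error_category {
public:
    virtual std::string_view message_view(int ev) const noexcept = 0;
};

std::string_view message_of(const std::error_code& ec)
{
    // New-style categories hand back a view directly, no allocation.
    if (auto* cat2 = dynamic_cast<const error_category2*>(&ec.category()))
        return cat2->message_view(ec.value());

    // Fallback for old-style categories: keep the std::string alive so the
    // returned view stays valid. A real implementation would need a sturdier
    // cache; this thread_local is only to keep the sketch self-contained.
    thread_local std::string cached;
    cached = ec.category().message(ec.value());
    return cached;
}
```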
Boost was originally intended as an incubator for C++ standard changes. This ticks that box perfectly.
Thoughts on feasibility?
Even besides the fact that my code is likely to be broken by this, I think this is a really bad idea. Sure, Boost was conceived as the playground for features that are potentially later included in the standard, but it is also a major C++ library that is widely used in industry. Breaking users' code willy-nilly does no good either to Boost or to its users. So no, no breaking changes just to see how bad it smells afterwards, please.
This is an API breaking change being discussed. Code _might_ break beyond custom error categories. But I currently believe that code other than custom error categories _won't_ break. We are changing the API in ways that I would be very, very surprised if much real world code out there even notices.

Besides, there is precedent for standard library features in C++ 11 which came from Boost continuing to be evolved in Boost, whilst retaining API compatibility, with said evolutions being mirrored by the C++ standard later.

All code which compiles now will still compile with this modified Boost.System apart from custom error categories; this is not in doubt. The question is whether all code will still function exactly as before. This is what the trial would discover.

Finally, Boost has broken API much more severely than this in the past as part of natural evolution (not always intentionally). If it's preannounced that we're doing this, and we do it, I don't see a problem here. Plenty of old APIs have been retired before. And besides, this is for testing a proposed change to a future C++ standard. That's a great reason to break API, if you're going to do it.
On 01/13/18 02:58, Niall Douglas via Boost wrote:
1. Stop treating code zero as always meaning success irrespective of category. This remedies a defect where some custom error code domains cannot be represented by error_code due to using value 0 for an error. In actual code, I've never seen anyone ever use comparison to anything but a default constructed error code, so I think this will be safe.
In my code I'm using contextual conversion to bool extensively, e.g. "if (err)" or "if (!err)". This is currently equivalent to comparing against 0, so I assume it will be broken by this change. So, you can count me right now as the one (loudly) complaining, among many others, I think.
Sorry, I forgot to mention that the category would be asked whether the error code's current value is success or failure (it's Friday, and my current contract is many hours' drive from home. Sorry.)
This is unavoidable because system_category (on POSIX and Win32) and generic_category both define value = 0 as success. In some other error code domain, -1 might mean success. Or 9999. Up to the domain. No requirement that 0 be set aside as special, as currently.
I appreciate that the special meaning of 0 may be undesirable, but having a virtual function call overhead whenever one wants to test if the error_code has a success value seems a too high price for that. I don't want to be paying that price in my Linux programs, sorry.
2. Success becomes solely the default constructed error code, which is code zero and a new category of "null_category". This is internally represented by { 0, nullptr }, and thus makes the default constructor trivial which is highly desirable as it eliminates any compiler magic static fencing for the default constructor. Right now the default constructor uses system_category with value 0, and I suspect no code will notice this change either.
This breaks "category()" as it returns a reference now.
It would still return a reference, just as now. You would check for internal nullptr, if so return a static null_category.
Why check in the first place if you could initialize the pointer to the global instance of the category?
Also, "message()" has to check for a null category now.
I don't think that's much cost relative to constructing a std::string.
True, but it looks like an unnecessary cost nevertheless. (Note that a small string construction is very cheap and can fit in a few instructions, including the string contents initialization.)
3. Make the default constructor constexpr, now possible. No code should notice a difference, except less runtime code will be generated.
Not sure there will be any difference in the generated code, since the constructor still has to initialize the int and the pointer. Optimizers are already able to propagate constants even if the code is not constexpr.
But it cannot elide the potential call to an extern function, the function which retrieves the error_category&. And therefore must emit code, just in case the system_category might be fetched.
I don't think the current standard requires an external function call. The default constructor can obtain the pointer to the global category instance directly, if it is exported. `system_category()` can be an inline function as well.
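Andrey's alternative amounts to exporting the category object itself; a minimal sketch with illustrative names, not the real `<system_error>` implementation:

```cpp
// A stand-in for a category type such as system_category's.
struct demo_category {};

// Exported global instance; in a real library this would carry the
// appropriate export/visibility attributes.
inline constexpr demo_category demo_system_category{};

struct demo_error_code {
    int val_ = 0;
    // The default constructor takes the address directly: no function call,
    // no magic static, and it is constexpr-friendly.
    const demo_category* cat_ = &demo_system_category;
};
```

A `constexpr demo_error_code ec;` then compiles down to a plain `{ 0, &demo_system_category }` with nothing to fence.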
The present design of error_code generates code bloat when used by Outcome or Expected. Changing it as I describe stops forcing the compiler to emit code, and thus allows the compiler to fold more code during optimisation, thus emitting less assembler. Tighter, less bloaty code is the result.
I assume you mean the code that calls `system_category()` at runtime? I think this is a question of QoI that can be solved as I described above, without the need to break users' code or add runtime overhead.
4. error_code::message() returns a std::string_view instead of std::string if running C++ 17 or better. This lets us remove the `<string>` dependency, and thus stop dragging in half the STL with `<system_error>`, which in turn makes `<system_error>` actually useful to the embedded systems crowd. The guess here is that std::string_view has such excellent interoperation with std::string that I think 99% of code will compile and work perfectly without anybody noticing a thing.

This will only work if "error_category::message()" returns a string from static storage. It will not allow relying on strerror_r or similar APIs, for example. This will make migration problematic if users define their own error categories that rely on external APIs like that.
You are correct that change no. 4 is by far the most consequential proposed. I do not believe it affects end user code except for those who inherit from error_code and override message() - surely a very, very tiny number of people - but it sure as hell breaks everybody who implements a custom error code category if you change the category's message() signature.
I don't have an estimate on how widespread custom error categories are, but I surely have a few. I know that several Boost libraries define their categories (namely, Boost.Filesystem, Boost.ASIO, Boost.Beast, Boost.Thread come to mind). Searching on GitHub[1] finds a few examples of custom categories (with a lot of noise from the comments though). Judging by that, I wouldn't say that custom categories are rare. [1]: https://github.com/search?l=C%2B%2B&q=error_category&type=Code&utf8=%E2%9C%93
The number of people worldwide currently implementing their own error categories is likely low, and I personally feel that their mild inconvenience is worth losing <string> from the header dependencies. But if backwards compatibility were felt to be super important, in error_code::message() you could try a dynamic_cast to an error_category2 first in order to utilise an error_category2::message_view() function. If the dynamic_cast fails, you statically cache the strings returned by error_category::message() and return string_views of them.
The `dynamic_cast` is yet more expensive than a null check, and it may be more expensive than constructing a small `std::string`. It may also be problematic on the embedded systems, which often have RTTI disabled. Caching the string in a static storage means thread safety issues, which will probably require TLS. This is a lot of complication and runtime overhead just to remove one #include.
This is an API breaking change being discussed. Code _might_ break beyond custom error categories.
But I currently believe that code other than custom error categories _won't_ break. We are changing the API in ways that I would be very, very surprised if much real world code out there even notices.
Code that relies on a specific error category in the default-constructed `error_code` will break.
Finally, Boost has broken API much more severely than this in the past as part of natural evolution (not always intentionally). If it's preannounced that we're doing this, and we do it, I don't see a problem here. Plenty of old APIs have been retired before. And besides, this is for testing a proposed change to a future C++ standard. That's a great reason to break API, if you're going to do it.
I don't think I remember an intentional major breakage of API without a deprecation period and a clear migration path. Such breakages are typically introduced as new libraries that live alongside the old ones. Some parts of your proposal (namely, switching to `string_view`) do not have a clear migration path, especially for C++03 users. Other parts seem to be targeted at a rather niche use case but have high associated overhead that is going to be applied to all uses of `error_code`. And, on top of that, users' code may break, including silently.

Sorry, but to me this doesn't look like a safe change that can be made to a widely used core library. Boost.System2, which would live side by side with the current Boost.System, seems more appropriate. I realize that you won't know how much of the world will be affected by the proposed changes.
On 01/13/18 02:58, Niall Douglas wrote:
1. Stop treating code zero as always meaning success irrespective of
category. This remedies a defect where some custom error code domains cannot be represented by error_code due to using value 0 for an error. In actual code, I've never seen anyone ever use comparison to anything but a default constructed error code, so I think this will be safe.
In my code I'm using contextual conversion to bool extensively, e.g. "if (err)" or "if (!err)". This is currently equivalent to comparing against 0, so I assume it will be broken by this change. So, you can count me right now as the one (loudly) complaining, among many others, I think.
Sorry, I forgot to mention that the category would be asked whether the error code's current value is success or failure <snip>,
This is unavoidable because system_category (on POSIX and Win32) and generic_category both define value = 0 as success. In some other error code domain, -1 might mean success. Or 9999. Up to the domain. No requirement that 0 be set aside as special, as currently.
charleyb123 . wrote: ...
My opinion is that there is a single category of 'std::error_code' values:
(a) something-or-nothing
...whereby an application must interrogate with an 'enum', which is the process of collapsing many-values (possibly across domains) for a boolean result, like one of:
if(ec == BADNESS_MUST_ABORT) ...
if(ec == IS_OK_CAN_PROCEED) ...
`if( ec == errc::success )` is an interesting solution to the NTSTATUS/HRESULT problem.
Oh if we could redo Boost.System with hindsight ... but that's always
the case with any standardised Boost library bar none. A decade or two
always shows up things no human could have possibly predicted.
My colleagues on SG14 are keen to design a
Niall Douglas wrote:
`if( ec == errc::success )` is an interesting solution to the NTSTATUS/HRESULT problem.
Oh if we could redo Boost.System with hindsight ... but that's always the case with any standardised Boost library bar none.
The above actually works within the existing framework, there's no need to redo anything. But it's not perfect. It'd have been better if there were a success bit in the error_code so that we didn't have to ask the category each time.
Ah, I'd have defined operator bool as `*this == std::errc::success` if we could wind back the clock. Problem solved.

Niall
--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
On Sat, Jan 13, 2018 at 6:09 AM, Andrey Semashev via Boost < boost@lists.boost.org> wrote:
I appreciate that the special meaning of 0 may be undesirable, but having a virtual function call overhead whenever one wants to test if the error_code has a success value seems a too high price for that. I don't want to be paying that price in my Linux programs, sorry.
My understanding is that the current
design explicitly enables many-to-many comparison of ("error-")values across error domains (error categories). Specifically, the implication is that there are "many" success-values; or "many" success-values specific to a given application-specific context. We've already had recent threads discussing how to handle the several "HTTP 2xx" return-values as all indicating "success", but none of them are of value "zero". (Answer: Case out the 'enum' values, or compare to a generic 'success' enum-value designed for that purpose.)
So, we can discuss the "trap-value" of zero as meaning something, but it seems real-world algorithms must handle these multiple success-values (whether or not we want to expect a 'zero' as meaning something in particular).
If we concede that many 'success' values exist, the class of 'std::error_code' values becomes:
(a) empty (e.g., default-ctor)
(b) success (many possible values, context-determined)
(c) error (many possible values, context-determined)
I do not view `error_code` as an object that can have an "empty" state.
From my view it is always populated with some code, which happens to be
0 by default. That value also happens to indicate "success" in many
domains, with other values indicating various degrees of failure.
So I believe, only (b) and (c) are valid choices. For (a) you need
`optional< error_code >`.
IMHO, the 'if(ec)...' is a problematic construct if it is to imply some form of "success". It seems more suited to imply "no-value-populated"; but I think that is a very limited test that probably isn't very helpful in the real-world: If the caller receives the value from some implementation, the caller can trust it to represent something and doesn't care about the distinction between "empty" and "not-empty", but only between "success" and "not-success".
The "if (err)" test makes very much sense if you just want to know if things are ok, whatever that means. In terms of HTTP errors, that would mean testing for a 2xx status code. And that usage seems to be the most frequent in my experience. If you want to distinguish between different specific results then you would probably test against the specific status codes, which would require the caller to be familiar with the domain of the error code. That is also ok and `error_code` allows that usage as well, but naturally I always try to isolate the problem domain in the component that reports the error so that the caller can operate on a more abstract level.

The problem is that `error_code` does not allow for multiple "success" values, and adding that support would penalize the cases when there is only one "success" value (which I still think are much more frequent). As much as I would like `error_code` to support multiple success values, I don't think it is worth the performance penalty of a virtual function call.

Personally, I would go one of the following routes:

- Map "success" values (e.g. all 2xx codes in case of HTTP) to 0 when `error_code` is constructed. Use an additional channel (external to `error_code`) to obtain the exact domain-specific error code, if one needs it.
- Do not map success values to 0 but also never use contextual conversion to bool. Always test against the specific error/success codes.
- Use a completely different type to store error codes (probably as simple as an enum) and leave `error_code` alone. Use `error_code` only in domains where its design fits and use other tools where it doesn't.

The last option would probably be my preferred one, although it leaks the problem domain to the caller. As an idea, maybe it would be better to introduce a more generic version of `error_code`, and express `error_code` through that class.
Something along these lines:

template< typename IsSuccessCodeFun >
class basic_error_code
{
    // All error_code implementation here

public:
    explicit operator bool() const
    {
        return IsSuccessCodeFun{}( value() );
    }
};

typedef basic_error_code< is_zero > error_code;
typedef basic_error_code< is_2xx > http_error_code;
I appreciate that the special meaning of 0 may be undesirable
It's not just undesirable. It's incorrect. The present design precludes any third party error code domain which does not treat 0 as being success, or which treats values in addition to 0 as success, from being wrapped into an error code. And that was one of the prime motivations behind the original design: to wrap existing third party error code domains, often from legacy C libraries, into C++.
, but having a virtual function call overhead whenever one wants to test if the error_code has a success value seems a too high price for that. I don't want to be paying that price in my Linux programs, sorry.
You have a completely misguided estimation of relative costs here. This argument has no basis in empirical reality. Also, it's prejudiced as well.

Your use case never sees more than one success value, so you fret about the "extra cost" of an indirect jump for testing that. But in my use case in AFIO I have to insert extra if statements to work around the fact that there are multiple success values possible from calls to the NT kernel. So I'm getting loaded with overhead so you don't have to be, and given the whole original design purpose of Boost.System, which was to wrap C error code domains into better C++, I don't find that fair.

(The reason I quote "extra cost" of an indirect jump is that there is no extra cost on any recent out-of-order CPU, including ARM and Intel. It's literally zero cost in the hot cache case because it's been speculatively dereferenced; there isn't even a pipeline stall. Haswell and later can even make a virtual function call that immediately calls another virtual function zero overhead in the hot cache case; the speculative execution reaches two levels down most of the time in most circumstances.)
It would still return a reference, just as now. You would check for internal nullptr, if so return a static null_category.
Why check in the first place if you could initialize the pointer to the global instance of the category?
It cannot currently be done portably on all current compilers. Peter's LWG defect resolution - as far as I understand it - needs some extra compiler magic in the form of Immortalize. Without that one cannot guarantee a single instance of a category anywhere in the process. And that is a hard requirement by the current C++ standard.
But it cannot elide the potential call to an extern function, the function which retrieves the error_category&. And therefore must emit code, just in case the system_category might be fetched.
I don't think the current standard requires an external function call. The default constructor can obtain the pointer to the global category instance directly, if it is exported. `system_category()` can be an inline function as well.
You are incorrect. The C++ standard imposes a hard requirement that all error categories cannot - ever - have more than one instance in a process. That rules out inline sources of error categories. If you don't implement that, including in custom error categories, bad things happen, specifically error condition testing stops working right.
I don't have an estimate on how widespread custom error categories are, but I surely have a few. I know that several Boost libraries define their categories (namely, Boost.Filesystem, Boost.ASIO, Boost.Beast, Boost.Thread come to mind). Searching on GitHub[1] finds a few examples of custom categories (with a lot of noise from the comments though). Judging by that, I wouldn't say that custom categories are rare.
But they are however rare enough, and moreover, extremely easy to upgrade. My intended proposal for WG21 is that when you compile code with C++ 23 or later, the new string_view signature takes effect. If you compile with C++ 20 or earlier, the string signature remains. Breaking API has been done before in the C++ standard library, hence the need for standard version switches.
[1]: https://github.com/search?l=C%2B%2B&q=error_category&type=Code&utf8=%E2%9C%93
In fairness, more than I expected.
But I currently believe that code other than custom error categories _won't_ break. We are changing the API in ways that I would be very, very surprised if much real world code out there even notices.
Code that relies on a specific error category in the default-constructed `error_code` will break.
I've never seen such code, and even if it is doing that, it should stop doing so. system_category depends exclusively on the host OS. Code which behaves like that would not be portable.
I don't think I remember an intentional major breakage of API without a deprecation period and a clear migration path. Such breakages are typically introduced as new libraries that live alongside with the old ones.
This is hardly a major break. Minor break I can agree with. Remember there will be a macro that lets you switch the old API back on. Remember the string_view stuff only turns on if compiling with C++ 17 or later.
Some parts of your proposal (namely, switching to `string_view`) do not have a clear migration path, especially for C++03 users.
You didn't fully read my earlier email. There is no point in using std::string_view earlier than C++ 17. And boost::string_view doesn't have sufficient interop with std::string to substitute for it. The std::string_view message() is 100% C++ 17 or later only.
Other parts seem to be targeted at a rather niche use case but have high associated overhead that is going to be applied to all uses of `error_code`. And, on top of that, users' code may break, including silently. Sorry, but to me this doesn't look like a safe change that can be done to a widely used core library. Boost.System2, that will live side by side with the current Boost.System, seems more appropriate. I realize that you won't know how much of the world will be affected by the proposed changes.
Ultimately the decision lands with the maintainer of Boost.System, and that's why we're having this debate. I'm not sure if that's still entirely Beman; I've noticed Peter sending some love in the direction of Boost.System recently.

Niall
Niall Douglas wrote:
Peter's LWG defect resolution - as far as I understand it - needs some extra compiler magic in the form of Immortalize.
No, you're reading that incorrectly, it requires no compiler magic, it's implementable today. In the typical case it's just
extern const __system_category_impl __syscat;
constexpr error_category const& system_category() { return __syscat; }
When I proposed that, Billy O'Neal from MS objected that their system_category does extra "immortalization" so constexpr is unimplementable. So I gave a proof of concept of how their "immortalization" can be implemented to still be constexpr.
I should probably just add the relevant constexpr's to Boost.System.
How do you guarantee that only one instance of an error category will ever exist in the process? If it's allowable to constexpr, I don't see how that's possible. I'd love to be corrected due to ignorance.

MSVC's _Immortalize is a directive to the VC runtime to ensure a single instance ever. It really cannot be made constexpr because it uses a runtime lookup table. At least, that's how STL explained it to me. You can see some godbolt on this at https://godbolt.org/g/3TtFsu. Search for std::_Immortalize<std::_System_error_category>.

Niall
Niall Douglas wrote:
How do you guarantee that only one instance of an error category will ever exist in the process?
If it's allowable to constexpr, I don't see how that's possible. I'd love to be corrected due to ignorance.
constexpr in this case doesn't change the number of instances that can exist in the process. The way to guarantee a single instance is to use the DLL/so runtime, and put the instance there.
MSVC's Immortalize is a directive to the VC runtime to ensure a single instance ever.
MS's _Immortalize is there to guarantee that the system_error machinery can be used during and after static destruction. That's why it's called "immortalize" - it makes the instance immortal, that is, makes sure its destructor is never invoked.
How do you guarantee that only one instance of an error category will ever exist in the process?
If it's allowable to constexpr, I don't see how that's possible. I'd love to be corrected due to ignorance.
constexpr in this case doesn't change the number of instances that can exist in the process.
constexpr says that the compiler is allowed to assume that the instance in the current compilation unit is the sole one in the process. That lets it eliminate it. That's rather antithetical to the hard requirements in the standard. This is why I've discounted your LWG solution; without additional changes to permit multiple instances, I don't think your example solution can work.

(My proposed additional change is to stop comparing by address, and compare by category name via strcmp() instead. Yes this is much slower, but as you all may be gathering right now, low latency C++ isn't about minimum overhead, it's about predictable overhead. I don't care about spending a few predictable cycles here and there if I get more predictability.)
The way to guarantee a single instance is to use the DLL/so runtime, and put the instance there.
Agreed. As I mentioned before, error categories are required to have a unique address in the process to work correctly.
MSVC's Immortalize is a directive to the VC runtime to ensure a single instance ever.
MS's _Immortalize is there to guarantee that the system_error machinery can be used during and after static destruction. That's why it's called "immortalize" - it makes the instance immortal, that is, makes sure its destructor is never invoked.
True. But it does more than that. If you link all your DLLs with the
static MSVC runtime so each gets a copy of a system_category
implementation, _Immortalize still ensures that only one instance ever
appears to exist in the process. So each error category always gets its
unique address in the process. At least, this is what I've been told.
You may not be aware, but last year I had an extended conversation with
STL over this because it was slowing Outcome down. STL took the view
that MSVC implements the standard correctly, and libstdc++ does not.
As annoyed as I was at the time about it, upon reflection STL is
correct. Adhering completely to the C++ standard requirements mandates
this behaviour. And yes, the standard is wrong on requiring this,
Niall Douglas wrote:
constexpr says that the compiler is allowed to assume that the instance in the current compilation unit is the sole one in the process. That lets it eliminate it. That's rather antithetical to the hard requirements in the standard. This is why I've discounted your LWG solution, without additional changes to permit multiple instances, I don't think your example solution can work.
`constexpr` can mean different things in different contexts. If you declare a constexpr variable of a literal type, yes, the compiler can eliminate it. That's not quite the case here though.
extern const __system_category_impl __syscat;
constexpr error_category const& system_category() { return __syscat; }
Here `__syscat` is neither constexpr nor is it of a literal type (it has a virtual destructor.)
system_category() is constexpr because it returns a known symbolic address, &__syscat, which can be compared for equality with other known symbolic addresses, and the comparison is constexpr, that is, its result is a compile-time constant expression. If you try to do something at compile time with the actual value to which system_category() returned a reference though, you'll fail; the only compile-time constant part of it is the address.
If you then declare
constexpr error_code ec( 0, system_category() );
now that would be a constexpr variable of a literal type, and the compiler is allowed to assume that it's the same even if it appears twice, and is allowed to eliminate it. But that's another thing entirely, and does not affect whether __syscat is unique.
To make this even more complicated, constexpr has a third meaning; when put on a constructor, it enables static initialization. This is what makes it possible for __syscat to be defined in a manner so that it's initialized before anything else.
I didn't believe you, so I wrote https://godbolt.org/g/E5aMD8. The key is that the extern const variable is not a reference, but a value. Fair enough. You are correct. I have learned something new, and now I need to go adjust some implementation code. Thank you.

Niall
Niall Douglas wrote:
If you link all your DLLs with the static MSVC runtime so each gets a copy of a system_category implementation, _Immortalize still ensures that only one instance ever appears to exist in the process. So each error category always gets its unique address in the process. At least, this is what I've been told.
This was an interesting claim so I looked into it.
No, categories from DLLs using the static runtime do not have the same
address, and _Immortalize doesn't help with that.
C:\Projects\test_system_error>bin\msvc-14.1\debug\runtime-link-static\single_instance_test.exe
e1: 'system:0'; { 0, 560F6B00 }
e2: 'system:0'; { 0, 57E86B00 }
No errors detected.
They do, however, compare equal.
The equality comparison simply makes sure that all system categories compare
equal:
bool operator==(const error_category& _Right) const _NOEXCEPT
{ // compare categories for equality
return (_Addr == _Right._Addr);
}
where
/* constexpr */ error_category() _NOEXCEPT // TRANSITION
{ // default constructor
    _Addr = reinterpret_cast<uintptr_t>(this);
}
On 01/14/18 19:30, Peter Dimov via Boost wrote:
Niall Douglas wrote:
If you link all your DLLs with the static MSVC runtime so each gets a copy of a system_category implementation, _Immortalize still ensures that only one instance ever appears to exist in the process. So each error category always gets its unique address in the process. At least, this is what I've been told.
This was an interesting claim so I looked into it.
No, categories from DLLs using the static runtime do not have the same address, and _Immortalize doesn't help with that.
C:\Projects\test_system_error>bin\msvc-14.1\debug\runtime-link-static\single_instance_test.exe
e1: 'system:0'; { 0, 560F6B00 }
e2: 'system:0'; { 0, 57E86B00 }
No errors detected.
They do, however, compare equal.
The equality comparison simply makes sure that all system categories compare equal:
bool operator==(const error_category& _Right) const _NOEXCEPT
{ // compare categories for equality
    return (_Addr == _Right._Addr);
}
where
/* constexpr */ error_category() _NOEXCEPT // TRANSITION
{ // default constructor
    _Addr = reinterpret_cast<uintptr_t>(this);
}

but

protected:
    uintptr_t _Addr;

    enum : uintptr_t
    { // imaginary addresses for Standard error_category objects
        _Future_addr = 1,
        _Generic_addr = 3,
        _Iostream_addr = 5,
        _System_addr = 7
    };
Perhaps we need to do something similar in Boost.System; the combination of link=static and runtime-link=shared is disallowed at Boost Jamroot level, but it could still help the "header-only in DLLs" case.
Interesting. It seems to contradict the definition of the `error_category::operator==` in the standard ([syserr.errcat.nonvirtuals]/2), which says it should compare `this` and `&rhs`. This is also reinforced by the note in [syserr.errcat.overview]/1. This enforced implementation seems unnecessary, though. We should probably consider changing it, either by marking it a defect or filing a proposal.
I should probably just add the relevant constexpr's to Boost.System.
https://github.com/boostorg/system/commits/feature/constexpr
On 14/01/2018 13:59, Peter Dimov via Boost wrote:
I should probably just add the relevant constexpr's to Boost.System.
https://github.com/boostorg/system/commits/feature/constexpr
Very nice, Peter. Let's hope the committee replicates this into the standard.

Niall
On Mon, Jan 15, 2018 at 8:20 AM, Peter Dimov via Boost < boost@lists.boost.org> wrote:
I should probably just add the relevant constexpr's to Boost.System. https://github.com/boostorg/system/commits/feature/constexpr
The only nontrivial change this entails is changing the member relational operators of error_category:
class error_category { public:
bool operator==(const error_category& rhs) const noexcept; bool operator!=(const error_category& rhs) const noexcept; bool operator<(const error_category& rhs) const noexcept; };
into nonmembers:
constexpr bool operator==(const error_category& lhs, const error_category& rhs) noexcept; constexpr bool operator!=(const error_category& lhs, const error_category& rhs) noexcept; bool operator<(const error_category& lhs, const error_category& rhs) noexcept;
For the first two, g++ complained that members of a non-literal class can't be constexpr (whereas clang had no problem with it.) The third one I moved out for consistency.
Nonmembers seem more idiomatic in either case; can someone think of an argument against having these out of the class?
Not to get into the weeds, but:

(a) I don't care where relational operators are defined, but
(b) Users should not be encouraged to override implementations of relational operators themselves

The design should/does allow for "equality/equivalence" override for user-custom domains, where that single (user-provided) implementation is used by the relational operators, such as through:

*- virtual bool std::error_category::equivalent(int, const std::error_condition&) const noexcept;
*- virtual bool std::error_category::equivalent(const std::error_code&, int) const noexcept;

I realize you're talking about equivalence on the 'std::error_category' itself, so std::-provided relational operators anywhere are fine (the user should never override). If we (someday) wanted user-explicit override for comparing 'std::error_category' instances, we should provide that single function that the user can re-implement (which is then used by the relational operators themselves). Thus, it seems that the relational operators could go anywhere, and nobody should care.

Summary:

(1) I want the 'constexpr'
(2) It's fine with me if relational operators are no longer member functions of 'std::error_category'
(3) Ditto for 'std::error_code' and 'std::error_condition'
(4) I don't think users should provide custom implementations of relational operators (customization is elsewhere)

If we were to venture "into the weeds", the question would become:

*- is comparison *exact-match* or *semantically-equivalent*? (i.e., are two separate categories permitted to "compare-equal"?)

Possible rationalizations:

(1) Two separate 'std::error_category' instances may be semantically the same category, but different instances (and "could" compare equal). This would also "smooth over" the singleton 'std::error_category' ODR issue.
(2) I've seen "enum-hierarchies" for error codes, where the "subset-categories" actually merely represent subsets within the parent category. Those sub-categories could compare equal to the parent-category, because they literally represent windows into the same parent-category.

But of course, the current design is "identity" comparison for 'std::error_category'; and I want 'constexpr' more than I would want semantic equivalence. I have the opposite opinion for 'std::error_code', where I *always* want semantic equivalence, and *never* exact-match testing (where the current design tests for exact-match 'code==code' and semantic compare for 'code==condition').

--charley
Hi Niall,
On 14 Jan 2018, at 11:27 am, Niall Douglas via Boost
wrote: I appreciate that the special meaning of 0 may be undesirable
It's not just undesirable. It's incorrect. The present design precludes any third party error code domain which does not treat 0 as being success, or which treats values in addition to 0 as success, from being wrapped into an error code.
I have only skimmed this thread, but you appear to be operating under a misconception. The error_code class itself deliberately does *not* imbue zero values with the meaning 'success' and non-zero values with the meaning 'failure'. An error_code simply represents an integer and a category, where the category identifies the source of a particular integer value. The specification of the error_code class carefully avoids making any judgement as to whether a particular value represents success or failure.

The construct:

if (ec) ...

does not, in and of itself, mean 'if an error ...'. Instead, operator bool is specified to behave as the ints do, and the above construct should simply be read as 'if non-zero ...'. (N.B. I view the existence of boost::system::errc::success as vestigial. The enumerator does not appear in std::errc.)

Instead, the correspondence of particular error_code values to success or failure is context specific and is defined by an API. The filesystem and networking libraries do define zero to mean success, partly because they are specified in terms of POSIX operations that do likewise.

When defining your own API you are free to define your own notion of success or failure. One way would be to define your own error_condition for this (an intended use case for error_condition), but you may also use some other mechanism entirely (indicate failure via exception, return value, etc.). You might like to consider this approach for your own API that wraps NT kernel calls.

I suspect you may be coming unstuck because (unless I am mistaken) the expected and outcome classes have baked in the assumption that zero means success and non-zero means failure. This isn't the case for error_code itself.

[...]
My intended proposal for WG21 is that when you compile code with C++ 23 or later, the new string_view signature takes effect. If you compile with C++ 20 or earlier, the string signature remains.
The use of string_view is a non-starter. It has unclear ownership semantics and does not cater to error sources that compose their messages on demand (e.g. messages that are obtained from another API function, read from a file-based catalog, or constructed based on an error code value that happens to be a bitmask). Yes it would be nice to find a solution for error messages in freestanding environments that lack std::string, but string_view isn't it. Cheers, Chris
Christopher Kohlhoff wrote:
The error_code class itself deliberately does *not* imbue zero values with the meaning 'success' and non-zero values with the meaning 'failure'. An error_code simply represents an integer and a category, where the category identifies the source of a particular integer value. The specification of the error_code class carefully avoids making any judgement as to whether a particular value represents success or failure. The construct:
if (ec) ...
does not, in and of itself, mean 'if an error ...'. Instead, operator bool is specified to behave as the ints do, and the above construct should simply be read as 'if non-zero ...'. ... Instead, the correspondence of particular error_code values to success or failure is context specific and is defined by an API.
I don't agree. In the general case:

  void f( error_code& ec )
  {
      ec.clear();
      do_f1( ec ); if( ec ) return;
      do_f2( ec ); if( ec ) return;
      do_f3( ec );
  }

`f` should not need to know what errors are being returned by `do_f1`, `do_f2`, `do_f3`, in order to check whether the calls succeeded; the whole point of using error_code is that errors from different domains can be handled in a generic manner.

This is similar to exceptions; `f` need not know what callees can throw. Today, `do_f1` is implemented with backend A, so it throws A-flavored exceptions; tomorrow, it switches to backend B and throws B-flavored exceptions. The logic in `f` doesn't change.

In the same way, whether `do_f1` returns error codes of category A or category B should not matter for the logic in `f`. There must exist a generic way to test for failure. If !ec is not it, well, then !ec shouldn't be used and we need something that is and should be.
Hi Peter,
On 14 Jan 2018, at 4:02 pm, Peter Dimov via Boost
wrote: Christopher Kohlhoff wrote:
Instead, the correspondence of particular error_code values to success or failure is context specific and is defined by an API.
I don't agree. In the general case,
void f( error_code& ec ) { ec.clear();
do_f1( ec ); if( ec ) return;
do_f2( ec ); if( ec ) return;
do_f3( ec ); }
`f` should not need to know what errors are being returned by `do_f1`, `do_f2`, `do_f3`, in order to check whether the calls succeeded; the whole point of using error_code is that errors from different domains can be handled in a generic manner.
No, this is just one approach to error handling, and defining the error handling approach is beyond error_code's scope. The responsibility of error_code is simply to encapsulate the error values that originate from different domains.
This is similar to exceptions;
On the contrary, one important reason for using error_code (I won't say the whole point, because it isn't) is that it enables error handling that is fundamentally different to exceptions. With exceptions, the notion of success or failure is determined by the callee. With error_codes, we also allow success or failure to be determined by the caller. This is existing practice not just with error_code but with the error code systems that it wraps. For example:

  HANDLE h = CreateMutex(0, TRUE, name);
  if (h == NULL)
    // Failure.
  else if (GetLastError() == ERROR_ALREADY_EXISTS)
    // Success or failure depends on the context.
  else
    // Success. (Or is it? Maybe detecting the existing mutex is what was desired.)
There must exist a generic way to test for failure.
Why must this exist? (With the emphasis placed on "generic".) You can achieve this within certain domains and error handling designs, sure. However, testing for failure using only the error code is not possible for all use cases where error codes are applicable or, indeed, already used. Cheers, Chris
Christopher Kohlhoff wrote:
There must exist a generic way to test for failure.
Why must this exist? (With the emphasis placed on "generic".)
For the reason I stated; so that one can write a function whose logic is not broken when a callee switches (or is switched, or is ported) to another implementation and starts returning error codes from a different domain. And similarly, so that one can use several helper functions in one's implementation, each returning codes from a different domain, without the caller being hopelessly confused by it. That's what "encapsulate" means - to achieve a degree of independence.

Having a generic way does not in any way preclude one from being able to (re)interpret success/failure in the concrete case, when the errors returned are of a specific fixed domain. That's a separate use case and it won't go away.

A generic success condition is conceptually not any different from a generic "not found" condition. If you have domain-specific knowledge about what is a success, you won't compare to generic success, and if you have domain-specific knowledge about what wasn't found, you won't compare to generic not found. And consequently, an argument against the former is also an argument against the latter.
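[Editor's note: the generic "not found" condition mentioned above can be made concrete. With a default_error_condition mapping, codes from unrelated categories compare equal to the same generic condition; the backend category and its value 1001 are invented for illustration.]

```cpp
#include <cassert>
#include <string>
#include <system_error>

// A made-up backend whose private code 1001 means "entry not found".
class backend_category_t : public std::error_category {
public:
    const char* name() const noexcept override { return "backend"; }
    std::string message(int ev) const override {
        return "backend error " + std::to_string(ev);
    }
    std::error_condition default_error_condition(int ev) const noexcept override {
        if (ev == 1001) // map the private code onto the generic condition
            return std::errc::no_such_file_or_directory;
        return std::error_condition(ev, *this);
    }
};

const std::error_category& backend_category() {
    static backend_category_t cat;
    return cat;
}
```

A caller can now write `ec == std::errc::no_such_file_or_directory` and get the same answer whether the code came from this backend or from the generic (errno) category -- which is Peter's argument for a generic success condition applied to "not found".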
Hi Peter,
On 15 Jan 2018, at 12:47 am, Peter Dimov via Boost
wrote: Christopher Kohlhoff wrote:
There must exist a generic way to test for failure.
Why must this exist? (With the emphasis placed on "generic".)
For the reason I stated; so that one can write a function whose logic is not broken when a callee switches (or is switched, or is ported) to another implementation and starts returning error codes from a different domain.
This is not logical. Part of being able to switch backends is that both backends adhere to a given specification. In the error handling model in your example, this specification would stipulate that on failure a function produces a non-zero-valued error code. Producing the errors from a different domain doesn't change that fact. However, this property is independent of the error_code itself; it is a property of your specification.
And similarly, so that one can use several helper functions in one's implementation, each returning codes from a different domain, without the caller being hopelessly confused by it. That's what "encapsulate" means - to achieve a degree of independence.
Your example also follows an error handling model where, following a given function call, you test the error code before anything else to determine success or failure. My view is that the error_code class should be (and is) usable wherever one currently uses platform errors such as errno or GetLastError. The system calls that use these platform errors, by and large, require you to test the result first to determine success or failure and in some cases return an error code on success. The widespread use of this approach is why I assert that the requirement is not "generic". I don't intend to pass judgement here on the design merits of one approach or the other, but simply note that both exist and each have their pros and cons.
Having a generic way does not in any way preclude one from being able to (re)interpret success/failure in the concrete case, when the errors returned are of a specific fixed domain. That's a separate use case and it won't go away.
A generic success condition is conceptually not any different from a generic "not found" condition. If you have domain-specific knowledge about what is a success, you won't compare to generic success, and if you have domain-specific knowledge about what wasn't found, you won't compare to generic not found. And consequently, an argument against the former is also an argument against the latter.
Originally you phrased it as a generic way to test for failure, but I think I mostly agree with you if you instead phrase it in terms of success (as you are now doing). Specifically, in a test-the-error-first approach, you can consider a zero-valued error_code as an *unqualified* success, and all error categories must adhere to this to enable this approach. This satisfies the requirements of your example and does not preclude the existence of other error codes that represent *qualified* success and require additional contextual information to assess. This is essentially existing practice, so I think I could support codifying this much in the standard if you think it worthwhile. (The current plan was just to present it as a best practice in a paper I am writing.) Cheers, Chris
Christopher Kohlhoff wrote:
Why must this exist? (With the emphasis placed on "generic".)
For the reason I stated; so that one can write a function whose logic is not broken when a callee switches (or is switched, or is ported) to another implementation and starts returning error codes from a different domain.
This is not logical. Part of being able to switch backends is that both backends adhere to a given specification. In the error handling model in your example, this specification would stipulate that on failure a function produces a non-zero-valued error code. Producing the errors from a different domain doesn't change that fact. However, this property is independent of the error_code itself; it is a property of your specification.
This is circular reasoning. My specification would return success when the function succeeds. How it returns success is determined by the established way to return success. The current de-facto standard way of returning success is zero, so I'd specify my function to return that. Were the standard way to signal success something else, my specification would reflect that something else.
Originally you phrased it as a generic way to test for failure, but I think I mostly agree with you if you instead phrase it in terms of success (as you are now doing).
These two ways to phrase it are the same; generic success and generic failure to succeed are complementary. Today, the logic is expressed in terms of if(ec) and if(!ec); whether the function tests for failure:

  do_f1( ec ); if( ec ) return;
  do_f2( ec );

or for success:

  do_f1( ec ); if( !ec ) do_f2( ec );

makes no difference.
On 16/01/2018 03:53, Peter Dimov wrote:
This is circular reasoning. My specification would return success when the function succeeds. How it returns success is determined by the established way to return success. The current de-facto standard way of returning success is zero, so I'd specify my function to return that. Were the standard way to signal success something else, my specification would reflect that something else.
Actually there are several competing "standards" for return values in general.

Functions that can only ever return success or failure tend to return 1 for success and 0 for failure (as booleans; literally these values in the case of C APIs, but preferably as bool-typed values in C++).

Functions that can return one kind of success and several kinds of failure tend to standardise on 0 == success, anything else == failure. (Usually inspired by shell conventions.) Occasionally this convention is abused by having some of the kinds of failure actually be a qualified success, for things like "you already own that mutex" or "the file you were trying to delete didn't actually exist".

Functions that are expected to return several kinds of success as well as several kinds of failure tend towards having >0 as success, <0 as failure, with 0 as an in-between value that could be either depending on context. Although implementations vary on whether the return value is actually signed, with -1 the first error code, or whether it's sign-magnitude, with 0x80000001 the first error code. Sometimes other high bits encode other special conditions like warnings or an error category.

Outcome resolves the latter case by returning a success-or-error type that separately encodes each state, so that error_code itself doesn't have to worry about success values -- but even in that case it might still need to represent recoverable errors or partial-success warnings where there wasn't any success result to return instead.

APIs written in the first style generally seem more "natural" (though as always that depends on what you're used to), but become problematic to change to the second style if they need to return multiple error states in the future (since the success convention is flipped). Most commonly this is done by punting and retaining the same return format but then making error information available out-of-band via a thread-local get-last-error call or similar.
The whole 0==success argument goes away if operator bool calls the error_category and asks it whether the code is equivalent to {0, generic_category()}. (Related: why does this not have a std::errc enum?) It also raises the question of whether error_code() should be changed to be {0, generic_category()} rather than {0, system_category()}, as this would more unambiguously represent success. Of course, even that change would probably break user code, unless more things did error_condition compares.
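[Editor's note: letting the category, rather than the raw value, decide what counts as failure is essentially the mechanism Boost.System later shipped as a virtual error_category::failed(int) in Boost 1.69. The standalone mock below illustrates the idea only; it is not the real interface.]

```cpp
#include <cassert>

// Mock category: the category decides what "failure" means.
struct category {
    virtual ~category() = default;
    // Default convention: any non-zero value is a failure.
    virtual bool failed(int ev) const noexcept { return ev != 0; }
};

struct ntstatus_category : category {
    // NTSTATUS-style domain: negative values are errors; zero and positive
    // values are successes or informational/warning statuses.
    bool failed(int ev) const noexcept override { return ev < 0; }
};

struct code {
    int value;
    const category* cat;
    // operator bool asks the category instead of hard-coding value != 0.
    explicit operator bool() const noexcept { return cat->failed(value); }
};
```

With this, a positive NTSTATUS warning converts to false (not a failure) even though its value is non-zero, while the default category keeps today's behaviour.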
These two ways to phrase it are the same; generic success and generic failure to succeed are complementary. Today, the logic is expressed in terms of if(ec) and if(!ec); whether the function tests for failure
do_f1( ec ); if( ec ) return;
do_f2( ec );
The problem with this code is that if it is ignorant of the underlying implementation (and only has a zero-success convention) it can *only* test for complete success or not. There's no way to generically handle partial success (or a recoverable failure), unless you can rely on the error_condition mapping, which in turn relies on the underlying library happening to implement that in a way that's compatible with your expectations. (Woe betide you if you want to switch from library A to library B but one of them forgot to implement a mapping, or elected to map to different codes.)

And these things are important -- the primary advantage of error_codes over exceptions is exactly for handling the recoverable cases without invoking the exception handling machinery. Exceptions are invariably superior for handling unrecoverable errors.

If you do remap the errors to your own domain as soon as possible then that simplifies things as only the code directly adjacent to the external library needs to worry about which error codes to track, and anything higher up only needs to deal with your own error codes and your error condition mappings. But this inevitably leads to a loss of information about the original error cause if the error does end up bubbling up outside the library (or otherwise reaching a generic logging function).
`f` should not need to know what errors are being returned by `do_f1`, `do_f2`, `do_f3`, in order to check whether the calls succeeded; the whole point of using error_code is that errors from different domains can be handled in a generic manner.
No, this is just one approach to error handling, and defining the error handling approach is beyond error_code's scope. The responsibility of error_code is simply to encapsulate the error values that originate from different domains.
I think that *was* the case in the past, and by individuals such as yourself Chris in the code that you personally have written. But Peter's right: the typical use case patterns for error_code have evolved into something different, more complex, more generic. And that's a good thing, even if currently there is lots of subtly broken code out there right now.
There must exist a generic way to test for failure.
Why must this exist? (With the emphasis placed on "generic".)
So that templated code which composes unknown code can be useful. Right now it must assume "if(ec)" means "if error", due to there being no generic alternative.
You can achieve this within certain domains and error handling designs, sure. However, testing for failure using only the error code is not possible for all use cases where error codes are applicable or, indeed, already used.
Currently, yes. But a retrofitted error_code ought to work nicely. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
On 14/01/2018 18:02, Peter Dimov wrote:
`f` should not need to know what errors are being returned by `do_f1`, `do_f2`, `do_f3`, in order to check whether the calls succeeded; the whole point of using error_code is that errors from different domains can be handled in a generic manner.
This is similar to exceptions; `f` need not know what callees can throw. Today, `do_f1` is implemented with backend A, so it throws A-flavored exceptions; tomorrow, it switches to backend B and throws B-flavored exceptions. The logic in `f` doesn't change.
In the same way, whether `do_f1` returns error codes of category A or category B should not matter for the logic in `f`. There must exist a generic way to test for failure. If !ec is not it, well, then !ec shouldn't be used and we need something that is and should be.
I think Christopher's assertion works in a world where error_codes are never blindly propagated upwards; regardless of whether library X uses library A or library B under the hood, it must only return standard errors or library X errors to its external callers, never library A or B errors. (It's up to the library itself when to do the conversion, but it usually makes the most sense to do it as soon as possible.)

I think the general consensus from the Outcome discussion was that this is not the desired practice and that error_codes are indeed supposed to be propagated unmodified for the most part.

I haven't quite worked out yet where I stand on that; I think I'd prefer if the error codes were converted but it was still possible to obtain a "history" of an error, somewhat like nested exceptions. But I also recognise that that could have significant performance drawbacks.
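[Editor's note: the convert-at-the-boundary model described above can be sketched like this. The libx_errc domain and its category are invented; the sketch assumes the zero-success convention for the incoming backend code.]

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Library X's own error domain, presented to its external callers.
enum class libx_errc { backend_failure = 1 };

class libx_category_t : public std::error_category {
public:
    const char* name() const noexcept override { return "libx"; }
    std::string message(int) const override { return "backend operation failed"; }
};

const std::error_category& libx_category() {
    static libx_category_t cat;
    return cat;
}

// Translate whatever backend A or B reported into library X's domain at the
// boundary, so callers never see A- or B-flavored codes. Note the loss of
// information about the original cause, as discussed above.
std::error_code translate(std::error_code backend_ec) {
    if (!backend_ec) return {};
    return std::error_code(static_cast<int>(libx_errc::backend_failure),
                           libx_category());
}
```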
My 2c...
Almost all the asio client code I have written in the past 4 years uses handlers like this:

  async_xxx(xxx, [self = this->shared_from_this()](auto&& ec)
  {
      if (ec) {
          self->xxx_error(ec);
      }
      else {
          self->please_carry_on();
      }
  });
Because it seemed logically obvious that ec would be true if it contained
an error. I did once notice that when writing my own boost::system category
that a zero error code broke this, so made a mental note to avoid zero
error codes in custom error schemes (e.g. rpc_error, database_error,
message_bus_error etc.)
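[Editor's note: the mental note above -- reserve value 0 in custom error schemes -- can be sketched as follows; the rpc_errc domain and its messages are invented for illustration.]

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Custom scheme that deliberately never uses value 0, so the contextual bool
// conversion of error_code keeps meaning what people assume it means.
enum class rpc_errc {
    timeout = 1,     // values start at 1; 0 stays "no error"
    bad_payload = 2
};

class rpc_category_t : public std::error_category {
public:
    const char* name() const noexcept override { return "rpc_error"; }
    std::string message(int ev) const override {
        switch (ev) {
        case 1: return "rpc timeout";
        case 2: return "rpc bad payload";
        default: return "unknown rpc error";
        }
    }
};

const std::error_category& rpc_category() {
    static rpc_category_t cat;
    return cat;
}

std::error_code make_error_code(rpc_errc e) {
    return std::error_code(static_cast<int>(e), rpc_category());
}
```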
So for my part, although I loathe the idea of changing interfaces, this
interface seems broken and probably could do with being changed. Perhaps
there should be a category of errors which is std::success and another
std::success_with_warning? In this way, NT error codes could be propagated
more accurately.
Re outcome, I was a sceptic to begin with (why would we wrap a variant
Chris, am I correct that this is your first post to boost-dev in nearly a decade? If so, welcome back and it's great to see you here again.
I have only skimmed this thread, but you appear to be operating under a misconception.
The error_code class itself deliberately does *not* imbue zero values with the meaning 'success' and non-zero values with the meaning 'failure'. An error_code simply represents an integer and a category, where the category identifies the source of a particular integer value. The specification of the error_code class carefully avoids making any judgement as to whether a particular value represents success or failure. The construct:
if (ec) ...
does not, in and of itself, mean 'if an error ...'. Instead, operator bool is specified to behave as the ints do, and the above construct should simply be read as 'if non-zero ...'.
Correct. And I don't think anyone on boost-dev who witnessed the Outcome v1 review is in doubt on the literal meaning of operator bool in error_code. But that's not really what we're discussing. Rather, I think that there is widespread agreement here that most code out there working with error_code has been written by programmers who, when they write "if (ec) ...", do think that they are writing "if there is an error, then ...". That is quite distinct from "if ec is not empty, then ..." and very distinct from "if ec's value is not zero, then ...". What I am discussing here is whether to do something about this common misuse, i.e. whether, and how much, to change or even break the system_error API so that the incorrectly written code becomes correct.
Instead, the correspondence of particular error_code values to success or failure is context specific and is defined by an API. The filesystem and networking libraries do define zero to mean success, partly because they are specified in terms of POSIX operations that do likewise.
Sure. But you can surely see how sloppy usage has entered widespread practice here? If all current uses of error_code up until now just happen to work via "if(ec) ...", then nobody has reason to assume otherwise.

We are also mindful that Outcome/Expected is going to make this problem much worse, because if function A calls functions X, Y and Z, each of which returns error codes with a custom category, then function A may return an error code of up to four different categories, each of which may have its own criteria for what success or failure means. This thus turns into a problem of composition.

If you have many libraries and many APIs all using custom error code categories, the end user is currently likely to experience surprise at times through no fault of their own, because some programmer in some layer in some library wrote "if(ec) ..." thinking "if there is an error, then ...". This is what I am trying to avoid, by arguing for a retrofit of error_code to make it fit what programmers think it means, but currently doesn't.
When defining your own API you are free to define your own notion of success or failure. One way would be to define your own error_condition for this (an intended use case for error_condition), but you may also use some other mechanism entirely (indicate failure via exception, return value, etc.). You might like to consider this approach for your own API that wraps NT kernel calls.
Unnecessary, every API returns an outcome::result<T>.
I suspect you may be coming unstuck because (unless I am mistaken) the expected and outcome classes have baked in the assumption that zero means success and non-zero means failure. This isn't the case for error_code itself.
Actually one of the main motivations behind Outcome was to rid the community of ever having to work directly with error_code again. It's highly error prone, very brittle. With Outcome/Expected that ease of unenforced errors caused by error_code is mostly mitigated.

BTW, neither Outcome nor Expected makes any assumptions about error_code, for the simple reason that you can use any Error type you like, e.g. std::string. As you'll see in the Outcome v2 review next week, it is in fact common in Outcome v2 to not use error_code for E, but rather a custom type with custom payload lazily convertible to an error_code.

Both Expected and Outcome implement strict success/failure semantics, not the mishap-prone wooliness of error_code. It's been a big win over the current practice. Outcome also eliminates the dual-API problem in Filesystem. We'll see if the community likes it (it was the v1 review which suggested the approach).
My intended proposal for WG21 is that when you compile code with C++ 23 or later, the new string_view signature takes effect. If you compile with C++ 20 or earlier, the string signature remains.
The use of string_view is a non-starter. It has unclear ownership semantics
I don't get the mental block people are having on this. The ownership obviously is by the error category from which it was sourced, so the storage needs to live as long as there is any instance of the category anywhere in the process. For most categories, it's a static const string table anyway living in the const part of the compiled binary.
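[Editor's note: the ownership model described above -- views into a category-owned static table -- can be sketched as follows, assuming a hypothetical string_view-returning accessor. The message_sv name is invented; std::error_category has no such member.]

```cpp
#include <cassert>
#include <string_view>

class my_category {
public:
    std::string_view message_sv(int ev) const noexcept {
        // The table lives in the binary's read-only data, so the views it
        // returns outlive every caller, as Niall describes.
        static constexpr std::string_view table[] = {
            "success",
            "connection refused",
            "protocol violation"
        };
        if (ev < 0 || ev >= 3) return "unknown error";
        return table[ev];
    }
};
```

This works precisely because the messages are compile-time constants; Chris's objection below concerns sources where they are not.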
and does not cater to error sources that compose their messages on demand (e.g. messages that are obtained from another API function, read from a file-based catalog, or constructed based on an error code value that happens to be a bitmask).

Static hashmap cache.
Yes it would be nice to find a solution for error messages in freestanding environments that lack std::string, but string_view isn't it.
I'm not so bothered about the freestanding environment issue personally.
I am bothered about the approx 2 seconds per compiland that including
<string> adds if no other STL is being included. In a million file
codebase, that's an extra 23 days of compute time per build. It's
unacceptable.
Outcome was very specifically designed to be used in the public
interface header files of million file codebases, so for me eliminating
the include of <string> is highly important.
For the v2 Outcome review end of this week, I will be using
Niall Douglas wrote:
I am bothered about the approx 2 seconds per compiland that including <string> adds if no other STL is being included. In a million file codebase, that's an extra 23 days of compute time per build. It's unacceptable.
Incidentally, have you measured the inclusion time of <string_view>?
It's approximately three times faster than <string>.
On Mon, Jan 15, 2018 at 7:19 PM, Niall Douglas via Boost < boost@lists.boost.org> wrote:
Incidentally, have you measured the inclusion time of <string_view>? It's approximately three times faster than <string>.
'std::string_view' depends on 'std::char_traits', which is defined in <string>. So in order to use 'std::string_view' member functions, isn't it necessary to include <string>? roberto
On 15 Jan 2018, at 10:42 am, Niall Douglas via Boost
wrote: What I am discussing here is whether to do something about this common misuse, i.e. whether, and how much, to change or even break the system_error API so that the incorrectly written code becomes correct.
Ok, to provide you with some data points for your original proposals:
1. Stop treating code zero as always meaning success irrespective of category. This remedies a defect where some custom error code domains cannot be represented by error_code due to using value 0 for an error. In actual code, I've never seen anyone ever use comparison to anything but a default constructed error code, so I think this will be safe.
This change is not "safe" and will break currently supported source code. "Fixing" this source code to support your scheme will introduce an additional branch (to intercept and translate error codes in the zero/success case) where none is currently required, adding code bloat and a runtime performance penalty.
4. error_code::message() returns a std::string_view instead of std::string if running C++ 17 or better.
The use of string_view is a non-starter. It has unclear ownership semantics
I don't get the mental block people are having on this. The ownership obviously is by the error category from which it was sourced, so the storage needs to live as long as there is any instance of the category anywhere in the process. For most categories, it's a static const string table anyway living in the const part of the compiled binary.
and does not cater to error sources that compose their messages on demand (e.g. messages that are obtained from another API function, read from a file-based catalog, or constructed based on an error code value that happens to be a bitmask).
Static hashmap cache.
Such a change will penalise existing, supported source code by requiring these applications to hold a very large number of strings in memory, if it is indeed feasible to hold that many at all. In these applications, which contain a number of error categories representing different error sources:

* Some messages are obtained via other APIs' functions, which populate a buffer. Assuming the API itself holds these strings in memory (sometimes it is read from file), we are now duplicating the memory usage.

* Some error codes are not a simple enumeration but a bitmask containing a number of different fields. The message is composed from these fields, and the number of combinations of these fields is large indeed.

I note that there is established non-std::string practice which involves copying a string into a supplied buffer. This may or may not be an acceptable approach in the standard library, but unlike your string_view proposal it does not impose the high runtime complexity or memory requirements of a hashmap cache, still removes the dependency on <string>, and supports freestanding/embedded environments.

I appreciate that you were unaware of these data points and the purpose of your original email was to gather information. I hope that this information enables the development of proposals that will not break or severely penalise existing, supported use cases.
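[Editor's note: the established buffer-based practice mentioned above (strerror_r-style) might look like the sketch below. The free function shown is illustrative only, not part of <system_error>; Boost.System later added a message overload along these lines.]

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// The caller supplies storage, so no std::string, no heap allocation, and no
// persistent cache is needed -- suitable for freestanding environments.
const char* message(int ev, char* buffer, std::size_t len) noexcept {
    // Messages composed on demand -- e.g. from a bitmask error value -- can
    // be formatted straight into the caller's buffer.
    std::snprintf(buffer, len, "error %d (flags 0x%x)",
                  ev & 0xffff, static_cast<unsigned>(ev >> 16));
    return buffer;
}
```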
On Sat, Jan 13, 2018 at 8:05 PM, Christopher Kohlhoff via Boost
The error_code class itself deliberately does *not* imbue zero values with the meaning 'success' and non-zero values with the meaning 'failure'. An error_code simply represents an integer and a category, where the category identifies the source of a particular integer value. The specification of the error_code class carefully avoids making any judgement as to whether a particular value represents success or failure. The construct:
if (ec) ...
does not, in and of itself, mean 'if an error ...'. Instead, operator bool is specified to behave as the ints do, and the above construct should simply be read as 'if non-zero ...'.
That might have been the original design intention, but that ship sailed a long time ago. It is well established that users are habituated to think of the bool conversion as meaning success or failure.
When defining your own API you are free to define your own notion of success or failure. One way would be to define your own error_condition for this (an intended use case for error_condition), but you may also use some other mechanism entirely (indicate failure via exception, return value, etc.). You might like to consider this approach for your own API that wraps NT kernel calls.
Nobody wants to write
if(ec == my_condition::success)
When they can instead write
if(! ec)
Even in your own code you write `if(! ec)`:
https://github.com/boostorg/asio/blob/6814d260d02300a97521c1a93d02e30877fb8f...
Whether it was intended or not, established practice is to treat the bool
conversion of error_code as false==success and true==failure.
I have long felt that the documentation for <system_error> is terribly inadequate, which probably accounts for the pervasive misconceptions.
On 1/19/18 9:16 AM, Vinnie Falco via Boost wrote:
Nobody wants to write
if(ec == my_condition::success)
LOL - I want to write this.
When they can instead write
if(! ec)
Even in your own code you write `if(! ec)`:
And when I do I want to consider it a mistake.
https://github.com/boostorg/asio/blob/6814d260d02300a97521c1a93d02e30877fb8f...
Whether it was intended or not, established practice is to treat the bool conversion of error_code as false==success and true==failure.
It may be established, but it's not good practice. It's an example where implicit conversion SEEMS to make things simpler and more expressive, but it ends up making things more error prone.
I have long felt that the documentation for <system_error> is terribly inadequate, which probably accounts for the pervasive misconceptions.
Agreed. Personally I can never remember the details of any of these things and I'm constantly going back to lookup how they are to be used. In the case of system_error - I like it, but I have to go back to Chris's web page every time to remember the details on how to use it. It's too clever for me to remember.
The standard is completely unhelpful in offering guidance on its use, and all of the usual websites (cppreference.com or the boost docs for example) are similarly unhelpful.
Agreed.
Chris' blog posts on error_code were instructive but incomplete. I found this the most useful.
Andrezj's blog posts did the best job of providing tutorial-like guidance but of course I read them too late.
People are still trying to figure out how to use error_code and having trouble because it isn't well explained. Ideas like "if(! ec)" should not be used to check for success do nothing to improve this situation.
This situation is unfortunate and applies to many libraries including Boost and the standard library. I've been trying to promote the addressing of this, but it's not easy for a number of reasons. I do feel a tiny bit of progress is being made. Just a fact that questions/complaints of this nature are raised more frequently is a good sign. Still, the situation is very frustrating. Robert Ramey
On 20/01/2018 07:02, Robert Ramey wrote:
Whether it was intended or not, established practice is to treat the bool conversion of error_code as false==success and true==failure.
It may be established, but it's not good practice. It's an example where implicit conversion SEEMS to make things simpler and more expressive, but it ends up making things more error prone.
How about if it were spelled if (error) instead? While granted this is technically not what the implementation does, it is what most people intend anyway and does work with all standard error_category implementations. The intent does seem far less ambiguous this way, does it not?
On January 19, 2018 1:02:10 PM EST, Robert Ramey via Boost
On 1/19/18 9:16 AM, Vinnie Falco via Boost wrote:
Nobody wants to write
if(ec == my_condition::success)
LOL - I want to write this.
When they can instead write
if(! ec)
Even in your own code you write `if(! ec)`:
And when I do I want to consider it a mistake.
If there are to be multiple success values, then a comparison is unhelpful. You'd have to test each success value, in turn, to determine whether the code means success.

Given the current design, including the behavior of the default constructor, the sizeable number of zero-is-success error categories, and existing practice, using helpful names and conversion to bool is quite workable:

if (error)
if (!error)

To support multiple success values, a redesign is needed. One must be able to ask whether an error_code contains an error or not, and that should be done with a single function call: e.g., is_error(). To avoid the virtual call into the category Niall suggested, many error categories can calculate the answer on construction without any complex mapping (such as zero is success), and that answer can be saved within the object. That means the (relatively small) overhead of a virtual call, and any other logic, is only needed for those multiple success value categories.

It remains to be determined what the default constructor, if there should be one, should initialize the state to. Zero is only useful for zero-is-success categories.
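The compute-at-construction idea above could be sketched as follows (all names are mine, and the zero-is-failure mapping stands in for whatever per-category query a real design would use):

```cpp
#include <system_error>

// Illustrative sketch: ask once, at construction, whether the value means
// failure, and store the answer so is_error() needs no virtual call later.
class cached_code {
    std::error_code ec_;
    bool failed_;
public:
    cached_code() : ec_(), failed_(false) {}
    cached_code(int v, const std::error_category& cat)
        : ec_(v, cat),
          failed_(v != 0)   // stand-in for a per-category virtual query
    {}
    bool is_error() const noexcept { return failed_; }  // plain member load
    const std::error_code& code() const noexcept { return ec_; }
};
```

Only categories with multiple success values would need anything more expensive than this constructor-time test.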
Whether it was intended or not, established practice is to treat the bool conversion of error_code as false==success and true==failure.
It may be established, but it's not good practice. It's an example where implicit conversion SEEMS to make things simpler and more expressive, but it ends up making things more error prone.
That's only true when there are multiple success values or zero is not success. There have been few such cases in practice, I think. -- Rob (Sent from my portable computation device.)
It has just occurred to me that the name std::error_code is a misnomer.
What this structure actually represents is a *result code* and a *domain*. It
has been correctly pointed out that sometimes result codes are not errors
(e.g. errno == EINTR during a select())
In hindsight perhaps it should have been called std::result_code(code,
std::system_call_domain()).
As mentioned, I like the outcome<> approach, but for use in things like
asio (networking TS), outcome would need a polymorphic wrapper, since
user-defined IO services need not exclusively acquire their error codes
from the system_category().
On 22.01.2018 12:22, Richard Hodges via Boost wrote:
It has just occurred to me that the name std::error_code is a misnomer.
What this structure actually represents is a *result code* and a *domain*. It has been correctly pointed out that sometimes result codes are not errors (e.g. errno == EINTR during a select())
In hindsight perhaps it should have been called std::result_code(code, std::system_call_domain()).
FWIW, when making my own async IO library in another language, I was inspired by error_code and made something similar. I quickly came to the same conclusion that the name did not accurately reflect the use, and renamed it op_result. I also found the implicit bool conversion was very error prone, as I managed to trip myself a few times when writing examples. So I got rid of it, and introduced an explicit op_result.success() method which does a virtual call to the category. Anyway, just wanted to share. - Asbjørn
On Mon, Jan 22, 2018 at 3:54 AM, Asbjørn via Boost
I also found the implicit bool conversion was very error prone, as I managed to trip myself a few times when writing examples. So I got rid of it, and introduced an explicit op_result.success() method which does a virtual call to the category.
I would prefer writing `if(ec.success())` and `if(ec.failed())` over `if(! ec)` and `if(ec)` respectively. Which brings up an interesting idea. Perhaps we can first propose to add `error_code::success()` and `error_code::failed()` and get that in, as an immediate improvement enabling more legible code. And deprecate (but do not forbid) the use of the `bool` conversion. Then after several years and more experience in the field, contemplate the change allowing `error_category` to define the meaning of success, secure in the knowledge that any code which is written or modified to use `success()` and `failed()` will automatically work. Thanks
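Since we cannot add members to std::error_code ourselves, the proposed spelling can be approximated today with free functions (a sketch; under the current semantics both simply mirror the existing bool conversion):

```cpp
#include <system_error>

// Illustrative free-function versions of the proposed success()/failed().
// Today they are equivalent to the bool conversion; a future revision
// could instead consult the category without changing call sites.
inline bool failed(const std::error_code& ec) noexcept {
    return static_cast<bool>(ec);   // current meaning: non-zero value
}
inline bool success(const std::error_code& ec) noexcept {
    return !failed(ec);
}
```

Code migrated to `if (failed(ec))` would keep working unchanged if the definition of failure later moves into the category.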
I would prefer writing `if(ec.success())` and `if(ec.failed())` over `if(! ec)` and `if(ec)` respectively.
Which brings up an interesting idea. Perhaps we can first propose to add `error_code::success()` and `error_code::failed()` and get that in, as an immediate improvement enabling more legible code. And deprecate (but do not forbid) the use of the `bool` conversion. Then after several years and more experience in the field, contemplate the change allowing `error_category` to define the meaning of success, secure in the knowledge that any code which is written or modified to use `success()` and `failed()` will automatically work.
If `ec.success()` and `ec.failure()` call virtual functions on their category to return whether a code is success or failure, that works for me. It's a reasonable compromise, and doesn't change existing code. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
People are still trying to figure out how to use error_code and having trouble because it isn't well explained. Ideas like "if(! ec)" should not be used to check for success do nothing to improve this situation.
What do you think of my proposed fix Vinnie, so the boolean conversion would ask the category whether the code means success or failure? That means a virtual function call being added in where there is none currently. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
On Fri, Jan 19, 2018 at 10:05 AM, Niall Douglas via Boost
What do you think of my proposed fix Vinnie, so the boolean conversion would ask the category whether the code means success or failure? That means a virtual function call being added in where there is none currently.
I don't have strong feelings about it either way but my default position is "don't mess with it." Especially with the issues caused by shared libraries that provide error categories. Thanks
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Vinnie Falco via Boost Sent: 19 January 2018 17:16 To: boost@lists.boost.org Cc: Vinnie Falco Subject: Re: [boost] [system] Would it be possible to trial a breaking change to Boost.System and see what happens?
On Sat, Jan 13, 2018 at 8:05 PM, Christopher Kohlhoff via Boost
wrote: The error_code class itself deliberately does *not* imbue zero values with the meaning 'success' and non-zero values with the meaning 'failure'. An error_code simply represents an integer and a category, where the category identifies the source of a particular integer value. The specification of the error_code class carefully avoids making any judgement as to whether a particular value represents success or failure. The construct:
if (ec) ...
does not, in and of itself, mean 'if an error ...'. Instead, operator bool is specified to behave as the ints do, and the above construct should simply be read as 'if non-zero ...'.
That might have been the original design intention, but that ship sailed a long time ago. It is well established that users are habituated to think of the bool conversion as meaning success or failure.
Nobody wants to write
if(ec == my_condition::success)
When they can instead write
if(! ec)
But who wants to *read* if(! ec) when it is 'ambiguous'? (and the ! can easily be missed at a glance.) Boost prefers clarity to curtness. if(! ec) is a bad habit - just stop it! Now! Paul PS but don't mess with existing code that works OK.
On 1/20/18 3:10 AM, Paul A. Bristow via Boost wrote:
Nobody wants to write
if(ec == my_condition::success)
When they can instead write
if(! ec)
But who wants to *read*
if(! ec)
when it is 'ambiguous'? (and the ! can easily be missed at a glance.)
Boost prefers clarity to curtness.
if(! ec) is a bad habit - just stop it! Now!
Paul
PS but don't mess with existing code that works OK.
+1 - good compromise
On 01/14/18 03:27, Niall Douglas via Boost wrote:
, but having a virtual function call overhead whenever one wants to test whether the error_code has a success value seems too high a price for that. I don't want to be paying that price in my Linux programs, sorry.
You have a completely misguided estimation of relative costs here. This argument has no basis in empirical reality.
Also, it's prejudiced as well. Your use case never sees more than one success value, so you fret about the "extra cost" of an indirect jump for testing that. But in my use case in AFIO I have to insert extra if statements to work around the fact that there are multiple success values possible from calls to the NT kernel. So I'm getting loaded with overhead so you don't have to be, and given the whole original design purpose of Boost.System which was to wrap C error code domains into better C++, I don't find that fair.
I assume that the extra `if` statements you're talking about are something like this:

if (ec.category() == my_category() &&
    (ec.value() < min_success || ec.value() > max_success))
{
    // handle failure
}

How would these tests be optimized if you moved the comparison into the category? The category test would be replaced with a virtual function call (which is arguably more expensive) but the error code value tests would still be executed.
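What "moving the comparison into the category" might look like, as a sketch: std::error_category has no such virtual today, so `failed` below is a hypothetical extension point, and the success range is an invented NT-style example.

```cpp
#include <string>
#include <system_error>

// Illustrative category holding the success/failure mapping in one place,
// instead of duplicating the range test at every call site.
class nt_like_category : public std::error_category {
    static constexpr int min_success = 0;
    static constexpr int max_success = 0x3FFFFFFF;  // invented range
public:
    const char* name() const noexcept override { return "nt-like"; }
    std::string message(int) const override { return "nt-like status"; }
    // Hypothetical virtual: a code is a failure unless it falls inside
    // this domain's success range.
    virtual bool failed(int ev) const noexcept {
        return ev < min_success || ev > max_success;
    }
};
```

The per-value tests still execute inside `failed()`, but they live in exactly one (predictably branching) place rather than at every use site.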
(The reason I put "extra cost" in quotes is that there is no extra cost on any recent out-of-order CPU, including ARM and Intel. It's literally zero cost in the hot-cache case because the pointer has been speculatively dereferenced; there isn't even a pipeline stall. Haswell and later can even make a virtual function call which immediately calls another virtual function zero-overhead in the hot-cache case, as speculative execution reaches two levels down most of the time in most circumstances.)
I don't think zero cost is a correct estimate. A virtual function call involves loading an entry from the virtual function table, which may stall and pollutes cache. Then there's the function body that consists of a few instructions that also need to be loaded and executed. Compare that to a couple (or four, in case of a range of codes to test for) inlined instructions which are likely already in the cache, prefetched with the rest of the function body. I'm sure the difference will be there, although it may not look that significant in absolute numbers. It will add up if you have multiple error code checks or have them in a loop.
It would still return a reference, just as now. You would check for internal nullptr, if so return a static null_category.
Why check in the first place if you could initialize the pointer to the global instance of the category?
It cannot currently be done portably on all current compilers.
It doesn't need to. `std::system_category` is implemented by compiler writers, they can use whatever magic they need. And they have already done that for iostreams.
But it cannot elide the potential call to an extern function, the function which retrieves the error_category&. And therefore must emit code, just in case the system_category might be fetched.
I don't think the current standard requires an external function call. The default constructor can obtain the pointer to the global category instance directly, if it is exported. `system_category()` can be an inline function as well.
You are incorrect. The C++ standard imposes a hard requirement that all error categories cannot - ever - have more than one instance in a process. That rules out inline sources of error categories.
It does not, if the category instance is exported from a shared library.
If you don't implement that, including in custom error categories, bad things happen, specifically error condition testing stops working right.
Custom error categories do not need any special treatment. Only the error category that is used in the default constructor of `error_code` (which is currently the system category) has to be available as an external object and initialized before the rest of the dynamic initialization. All other categories can continue to use function-local statics or whatever.
I don't have an estimate on how widespread custom error categories are, but I surely have a few. I know that several Boost libraries define their categories (namely, Boost.Filesystem, Boost.ASIO, Boost.Beast, Boost.Thread come to mind). Searching on GitHub[1] finds a few examples of custom categories (with a lot of noise from the comments though). Judging by that, I wouldn't say that custom categories are rare.
But they are however rare enough, and moreover, extremely easy to upgrade. My intended proposal for WG21 is that when you compile code with C++ 23 or later, the new string_view signature takes effect. If you compile with C++ 20 or earlier, the string signature remains.
Sounds like fun for implementers of the error categories that are supposed to be compatible with more than one language version. We already have the painful experience with virtual overrides in codecvt facets that changed at some point.
Some parts of your proposal (namely, switching to `string_view`) do not have a clear migration path, especially for C++03 users.
You didn't fully read my earlier email. There is no point using std::string_view with earlier than C++ 17. And boost::string_view doesn't have sufficient interop with std::string to substitute.
The std::string_view message() is 100% C++ 17 or later only.
My objection was not just about `std::string_view` being C++17-only, although it was part of it. Not all error categories can be ported to `std::string_view` in principle.
How would these tests be optimized if you moved the comparison into the category? The category test would be replaced with a virtual function call (which is arguably more expensive) but the error code value tests would still be executed.
Right now I must duplicate those tests at every use case site. If they were inside the category instead, they'd be hot in the branch predictor and microcode cache.
Some parts of your proposal (namely, switching to `string_view`) do not have a clear migration path, especially for C++03 users.
You didn't fully read my earlier email. There is no point using std::string_view with earlier than C++ 17. And boost::string_view doesn't have sufficient interop with std::string to substitute.
The std::string_view message() is 100% C++ 17 or later only.
My objection was not just about `std::string_view` being C++17-only, although it was part of it. Not all error categories can be ported to `std::string_view` in principle.
Is anybody actually generating a totally unique message per invocation for the same error code which couldn't be cached and reused? I've never seen it yet. Almost all use of message() is for printing to human readable output. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
On 01/15/18 03:01, Niall Douglas via Boost wrote:
Is anybody actually generating a totally unique message per invocation for the same error code which couldn't be cached and reused? I've never seen it yet. Almost all use of message() is for printing to human readable output.
libc++ uses strerror_r or snprintf to generate error messages (see system_error.cpp[1]). libstdc++ uses strerror[2] (which seems like a bug as it is not thread-safe). The strerror_r implementation in glibc may dynamically generate the error message, although most of the time it returns one of many constant strings. Either way, users of strerror_r have to assume the result is dynamically generated. You didn't mention how you would solve the problem of synchronizing access to the static storage or cache across threads. In case of a hash map, how would you be removing entries from it? Because if you don't, you're wasting memory. [1]: https://llvm.org/svn/llvm-project/libcxx/trunk/src/system_error.cpp [2]: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/src/c%2B%2B11...
On 15/01/2018 13:01, Niall Douglas wrote:
Is anybody actually generating a totally unique message per invocation for the same error code which couldn't be cached and reused? I've never seen it yet. Almost all use of message() is for printing to human readable output.
The most common dynamic message generation case is where a standard message can't be found, so it returns a generic "error #nnn occurred" message. While these *could* be cached, it seems a bit wasteful, especially if nothing eventually tidies them away (but when is it safe to do so?). But there are other cases as well; Windows' FormatMessage allows specifying additional parameters to fill in replacement values for parts of the text, which might be contextual information about, e.g., which file couldn't be loaded. And the standard error messages usually are defined with these parameters in mind. Granted, in the context of an error_code alone being passed around, it's unlikely for there to be sufficient context to supply the parameters except back at the original error site (which then isn't an error_code any more) -- but given something like outcome which can pass the error code *and* the necessary parameters to fill in the message when it does eventually get converted to text, that becomes more useful.
Andrey Semashev wrote:
3, Make the default constructor constexpr, now possible. No code should notice a difference, except less runtime code will be generated.
Not sure there will be any difference in the generated code, since the constructor still has to initialize the int and the pointer.
constexpr guarantees static initialization, and therefore that no additional code needs to run except for the initialization of the int and the pointer. Specifically, that system_category() does not do anything else besides returning a pointer. At present, its straightforward implementation initializes a function-local variable, which is required to be thread-safe and therefore requires synchronization.
On 01/13/18 03:15, Peter Dimov via Boost wrote:
Andrey Semashev wrote:
3, Make the default constructor constexpr, now possible. No code should notice a difference, except less runtime code will be generated.
Not sure there will be any difference in the generated code, since the constructor still has to initialize the int and the pointer.
constexpr guarantees static initialization, and therefore that no additional code needs to run except for the initialization of the int and the pointer. Specifically, that system_category() does not do anything else besides returning a pointer. At present, its straightforward implementation initializes a function-local variable, which is required to be thread-safe and therefore requires synchronization.
I don't think the standard requires this implementation. `system_category()` may return a pointer to a global category instance, which is initialized before `main`. Regarding static initialization and constexpr, again, I'm not sure I see how useful this is. I've never needed global `error_code` instances, let alone statically initialized ones. I think most of my uses of `error_code` are either local to function scope or a member of an exception class or something like that. In any case, I believe the default constructor of `error_code` can still be constexpr if it takes a pointer to the global category instance directly.
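The point above can be illustrated with mock types (these are not the std:: definitions; `fake_category`, `fake_error_code`, and the exported-global assumption are all mine): if the category is an exported constexpr global rather than a function-local static, the default constructor can point at it directly with no magic-static fencing.

```cpp
// Illustrative mock of a category type; the real std::error_category is
// polymorphic, which doesn't change the initialization argument here.
struct fake_category { int id; };

// Imagine this object exported from the standard library and guaranteed
// initialized before any dynamic initialization (it is constexpr, so it
// gets constant initialization).
inline constexpr fake_category the_system_category{1};

class fake_error_code {
    int val_;
    const fake_category* cat_;
public:
    constexpr fake_error_code() noexcept
        : val_(0), cat_(&the_system_category) {}  // no function call, no fence
    constexpr int value() const noexcept { return val_; }
    constexpr const fake_category& category() const noexcept { return *cat_; }
};

// Constant-initialized: no runtime constructor code runs for this object.
constexpr fake_error_code no_error;
```

The benefit isn't global `error_code` objects per se; it's that every default construction, including local ones, compiles to storing a literal int and a literal pointer.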
On 01/13/18 18:05, Peter Dimov via Boost wrote:
Andrey Semashev wrote:
I don't think the standard requires this implementation. `system_category()` may return a pointer to a global category instance, which is initialized before `main`.
You're not prohibited from calling system_category() before main, so it has to work.
Sure. Standard library implementers have solved the problem of early initialization for the standard iostreams, and they can do the same for the system category instance.
Andrey Semashev wrote:
I don't think the standard requires this implementation. `system_category()` may return a pointer to a global category instance, which is initialized before `main`.
You're not prohibited from calling system_category() before main, so it has to work.
You can see for yourself what happens in practice today:
https://godbolt.org/g/1TfPQw
The g++/clang++ generated code looks cleaner than that, but it's because it
doesn't include system_category(). Here it is in libc++:
https://github.com/llvm-mirror/libcxx/blob/master/src/system_error.cpp#L211
MS's code is even more convoluted because their <system_error> works during process shutdown, after static destruction. But the magic static cost is the same in either case.
On 01/13/18 18:20, Peter Dimov via Boost wrote:
Andrey Semashev wrote:
I don't think the standard requires this implementation. `system_category()` may return a pointer to a global category instance, which is initialized before `main`.
You're not prohibited from calling system_category() before main, so it has to work.
You can see for yourself what happens in practice today: https://godbolt.org/g/1TfPQw
The g++/clang++ generated code looks cleaner than that, but it's because it doesn't include system_category(). Here it is in libc++: https://github.com/llvm-mirror/libcxx/blob/master/src/system_error.cpp#L211
MS's code is even more convoluted because their <system_error>
works during process shutdown, after static destruction. But the magic static cost is the same in either case.
Here's the gist of what I propose: https://godbolt.org/g/nSiEZZ It's as optimal as it can get, IMHO.
Andrey Semashev wrote:
Here's the gist of what I propose:
It's as optimal as it can get, IMHO.
It's also what I propose, except I make it mandatory.
On 1/12/2018 12:43 PM, Niall Douglas via Boost wrote:
SG14, the low latency study group, has been looking into how to improve some design decisions in <system_error>.
The relevant WG21 paper is https://wg21.link/P0824 and recent discussion can be found at https://groups.google.com/a/isocpp.org/forum/#!topic/sg14/j7tQybEjP5s. What we'd like to do is to test whether some very mildly breaking changes to <system_error>
are so mild as to affect no users, and on that basis propose WG21 to make those mildly breaking changes to <system_error> in the next C++ standard. And to test this, we'd like to modify Boost.System with those mildly breaking changes, ship a release and see how many users complain. If it is zero, we have evidence for WG21 that this is not a consequential change.
The usual way is to announce the upcoming change in release nnn and then make the change in release nnn + 1. What is wrong with using this method in your case, if all parties are in agreement to do it ?
What we'd like to change is this:
1. Stop treating code zero as always meaning success irrespective of category. This remedies a defect where some custom error code domains cannot be represented by error_code due to using value 0 for an error. In actual code, I've never seen anyone ever use comparison to anything but a default constructed error code, so I think this will be safe.
2. Success becomes solely the default constructed error code, which is code zero and a new category of "null_category". This is internally represented by { 0, nullptr }, and thus makes the default constructor trivial which is highly desirable as it eliminates any compiler magic static fencing for the default constructor. Right now the default constructor uses system_category with value 0, and I suspect no code will notice this change either.
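Change #2 might be sketched with mock types as follows (names and details are illustrative, not the proposal text): the default state is { 0, nullptr }, `category()` substitutes a null category on demand, and only a default-constructed code is treated as success.

```cpp
// Illustrative mock category; a real null_category would derive from
// std::error_category.
struct minimal_category { const char* name; };

inline constexpr minimal_category null_category_instance{"null"};

class proposed_error_code {
    int val_ = 0;
    const minimal_category* cat_ = nullptr;   // { 0, nullptr } default state
public:
    constexpr proposed_error_code() noexcept = default;  // no magic static
    constexpr proposed_error_code(int v, const minimal_category& c) noexcept
        : val_(v), cat_(&c) {}
    constexpr const minimal_category& category() const noexcept {
        // Substitute the null category lazily, only when asked.
        return cat_ ? *cat_ : null_category_instance;
    }
    constexpr explicit operator bool() const noexcept {
        // Under change #1, only the default-constructed code means success;
        // value 0 in a real category may be a genuine error.
        return cat_ != nullptr;
    }
};
```

Note how the bool conversion no longer inspects the value at all, which is exactly the behavioral change the rest of this thread debates.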
3, Make the default constructor constexpr, now possible. No code should notice a difference, except less runtime code will be generated.
4. error_code::message() returns a std::string_view instead of std::string if running C++ 17 or better. This lets us remove the <string> dependency, and thus stop dragging in half the STL with <system_error>,
which in turn makes <system_error> actually useful to the embedded systems crowd. The guess here is that std::string_view has such excellent interoperation with std::string that I think 99% of code will compile and work perfectly without anybody noticing a thing. I appreciate that this would undo Peter Dimov's excellent work at having Boost.System alias <system_error>
under C++ 11 and later. I also appreciate that we are deliberately going to potentially break end user code. But my personal guess is that in practice, the breakage will not be noticed by 99% of code out there right now. It'll just compile against the improved Boost.System and everything will work as before. Boost was originally intended as an incubator for C++ standard changes. This ticks that box perfectly.
Thoughts on feasibility?
Niall
What we'd like to do is to test whether some very mildly breaking changes to <system_error>
are so mild as to affect no users, and on that basis propose WG21 to make those mildly breaking changes to <system_error> in the next C++ standard. And to test this, we'd like to modify Boost.System with those mildly breaking changes, ship a release and see how many users complain. If it is zero, we have evidence for WG21 that this is not a consequential change. The usual way is to announce the upcoming change in release nnn and then make the change in release nnn + 1. What is wrong with using this method in your case, if all parties are in agreement to do it ?
It goes without saying that this forthcoming change would be announced one release ahead as per Boost custom. So, next release would have it #ifdef'd off by default, following release #ifdef's it on by default. A macro would allow users to select which implementation. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
participants (15)
- Andrey Semashev
- Asbjørn
- charleyb123 .
- Christopher Kohlhoff
- Edward Diener
- Gavin Lambert
- Niall Douglas
- Olaf van der Spek
- Paul A. Bristow
- Peter Dimov
- Richard Hodges
- Rob Stewart
- Robert Ramey
- Roberto Hinz
- Vinnie Falco