[boost] Proposed SG14 <system_error2> ready for feedback

28 Feb 2018

Back after the Outcome discussion died down, I started a thread
discussing making breaking changes to Boost.System to test the
feasibility of fixing some of the unfortunate design choices now
apparent through hindsight in `<system_error>`. I won one argument -
constexprification - the others I lost. I said at the time that I'd go
off and make a better mousetrap i.e. a proposed `<system_error2>`.

That proposed `<system_error2>` is now ready for feedback:

Single file include:
https://github.com/ned14/status-code/raw/develop/single-header/system_error2...

Github: https://github.com/ned14/status-code

Docs: https://ned14.github.io/status-code/

Lots of detail about the differences between proposed `<system_error2>`
and `<system_error>` is in the front page of the docs and pasted after
this email, but essentially it fixes all the problems listed in
https://wg21.link/P0824, and a few other problems I considered important
as well. It works well in C++ 11, is believed to be freestanding C++
(https://wg21.link/P0829) friendly, and generates really lovely and
tight codegen. If SG14 smile upon it in the April meeting, it'll be
heading to Rapperswil for standardisation.

Boost members should be aware that there will shortly be a push by the
WG21 leadership to improve the current state of exception and error
handling in the C++ standard as it is becoming increasingly obvious that
the current design is no longer sufficient. You may see a paper on that
from the leadership at Jacksonville, if not then fairly definitely you
will at Rapperswil. This proposed `<system_error2>` *may* have a part to
play in the reform proposals, if it is felt that this approach is a wise
one by SG14, you guys here, and then WG21.

So, as a quick introduction given that there is no documentation,
instead of `error_code` we have a `status_code<D>` where `D` is the type
of the code's *domain*. The domain provides all of the heavy lifting so
the status code itself can be trivial. Status codes can be type erased of
their domain: you can always erase to `status_code<void>`, but at the
cost that it cannot be copied, moved nor deleted. If your domain's value
type is trivially copyable, it can also be type erased into any
`status_code<erased<U>>` where `U` is an integer type and the size of
the erased status code is greater than or equal to that of its unerased form.
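
To make the erasure rule concrete, here is a toy sketch. It is not the library's implementation: `toy_domain`, `toy_code`, `toy_erased_code`, `erase()` and `unerase()` are names I made up for illustration. It only shows the two conditions described above (trivially copyable value type, integer storage at least as large) and that the domain pointer travels with the erased bits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a domain: the thing that knows how to interpret a value.
struct toy_domain
{
  const char *name;
};

// Typed code: a value plus a pointer to its domain.
template <class T> struct toy_code
{
  const toy_domain *domain;
  T value;
};

// Erased code: same domain pointer, but the value is kept as raw bits in U.
template <class U> struct toy_erased_code
{
  static_assert(std::is_integral<U>::value, "U must be an integer type");
  const toy_domain *domain;
  U bits;
};

// Erase a typed code. Refuses to compile unless T is trivially copyable and fits in U.
template <class U, class T> toy_erased_code<U> erase(const toy_code<T> &c)
{
  static_assert(std::is_trivially_copyable<T>::value, "value type must be trivially copyable");
  static_assert(sizeof(U) >= sizeof(T), "erased storage must be at least as large as the value");
  toy_erased_code<U> out{c.domain, 0};
  std::memcpy(&out.bits, &c.value, sizeof(T));
  return out;
}

// Recover the value. The caller must know the original type T.
template <class T, class U> T unerase(const toy_erased_code<U> &e)
{
  T value;
  std::memcpy(&value, &e.bits, sizeof(T));
  return value;
}
```

In this toy the caller has to remember `T`; in the real design it is the retained domain that knows how to interpret the bits.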

There is a special status_code called `generic_code`, which is a typedef
for `status_code<_generic_code_domain_impl>`. Its value type is `errc`,
which is a slight superset of `std::errc`. It is special because it is
the baseline error code: all other codes must be
semantically comparable to it, if that is possible. In current STL
speak, `generic_code` is what `std::error_condition` with a
`std::generic_category` is.

On all POSIX platforms, we provide a `posix_code`. Its value type is
`int`, as from `errno`. This maps the local POSIX implementation's error
coding. It is *not* necessarily equal to `generic_code`, but usually is
a superset, so all codes in `generic_code` can exist in `posix_code`,
but not the other way round.
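
The distinction between a strongly typed generic code and a raw `errno`-style `int` can be shown with today's standard library alone. The sketch below uses `std::errc` as a stand-in for the proposed `errc`, and `to_generic()` is a hypothetical helper of mine, not part of the proposal. It maps only three values, so anything else is reported as having no generic equivalent.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>
#include <type_traits>

// A raw errno-style int does not implicitly convert to the strong enum...
static_assert(!std::is_convertible<int, std::errc>::value, "int must not implicitly become errc");
// ...nor the other way round, so the two codings cannot be mixed up by accident.
static_assert(!std::is_convertible<std::errc, int>::value, "errc must not implicitly become int");

// Hypothetical helper: map a raw POSIX errno value onto std::errc.
// Returns false for values outside the small set handled here.
inline bool to_generic(int posix_value, std::errc &out)
{
  switch(posix_value)
  {
  case ENOENT:
    out = std::errc::no_such_file_or_directory;
    return true;
  case EACCES:
    out = std::errc::permission_denied;
    return true;
  case ENOMEM:
    out = std::errc::not_enough_memory;
    return true;
  default:
    return false;
  }
}
```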

On Windows, we have three main system error coding systems, so we have
`win32_code`, `nt_code` and `com_code`. Their value types are `DWORD`
from GetLastError(), `LONG` from `NTSTATUS`, and `HRESULT` from
Microsoft COM.

On all systems, there is a typedef `system_code` for an erased
status code large enough that all possible system error coding
schemes are guaranteed to erase safely into it. For POSIX,
this is `status_code<erased<int>>`, as `int` can hold all possible error
codings. For Windows, this is `status_code<erased<intptr_t>>`, as
`intptr_t` can hold any of a `DWORD`, a `LONG` and a `HRESULT`.

All comparisons between status codes are *semantic*. That is,
`operator==()` returns true if the two codes are semantically equal e.g.
`win32_code(ERROR_SUCCESS) == generic_code(errc::success) ==
com_code(S_OK)`. Semantic comparison is the only form of comparison. If
you want literal comparison, you can do it by hand by comparing value
and domain.
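
A minimal sketch of semantic versus literal comparison follows. Everything in it is invented for illustration (`toy_errc`, `toy_domain`, `toy_code`, the two codings and their numeric values); the real library's mechanism may differ. The idea shown is only that each domain can translate its own values into a shared generic meaning, and `operator==` compares those meanings.

```cpp
#include <cassert>

// Hypothetical generic meanings, standing in for the baseline generic code.
enum class toy_errc { success, not_found, unknown };

// Hypothetical domain: translates a raw value into a generic meaning.
struct toy_domain
{
  toy_errc (*to_generic)(int);
};

struct toy_code
{
  const toy_domain *domain;
  int value;
};

// Literal comparison, done by hand: same domain and same value.
inline bool literally_equal(const toy_code &a, const toy_code &b)
{
  return a.domain == b.domain && a.value == b.value;
}

// Semantic comparison: equal if both codes mean the same generic thing.
inline bool operator==(const toy_code &a, const toy_code &b)
{
  if(literally_equal(a, b))
    return true;
  const toy_errc ga = a.domain->to_generic(a.value);
  const toy_errc gb = b.domain->to_generic(b.value);
  return ga != toy_errc::unknown && ga == gb;
}

// Two made-up codings which spell "not found" differently (2 versus 44).
inline toy_errc coding_a(int v)
{
  return v == 0 ? toy_errc::success : v == 2 ? toy_errc::not_found : toy_errc::unknown;
}
inline toy_errc coding_b(int v)
{
  return v == 0 ? toy_errc::success : v == 44 ? toy_errc::not_found : toy_errc::unknown;
}

const toy_domain dom_a{coding_a};
const toy_domain dom_b{coding_b};
```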

There is no boolean testing at all, because it is ambiguous, as we learned
with `std::error_code`. One always writes: `if(sc.failure()) ...` etc.
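
The ambiguity is easy to reproduce with the existing `<system_error>`. The category below is made up for illustration (`http_like_category`, `http_like()` and `is_failure()` are my names): its values follow HTTP status numbering, so the success value 200 is non-zero and `if(ec)` is taken even though nothing failed.

```cpp
#include <cassert>
#include <string>
#include <system_error>

// A made-up category whose success value is non-zero (HTTP-style: 200 is OK).
class http_like_category : public std::error_category
{
public:
  const char *name() const noexcept override { return "http_like"; }
  std::string message(int v) const override { return v == 200 ? "OK" : "other"; }
};

inline const std::error_category &http_like()
{
  static http_like_category c;
  return c;
}

// What a caller actually has to do to be correct: consult category-specific knowledge.
inline bool is_failure(const std::error_code &ec)
{
  if(ec.category() == http_like())
    return ec.value() >= 400;  // 4xx and 5xx are the failures in this coding
  return ec.value() != 0;      // the usual "non-zero is an error" assumption
}
```

With this category, `std::error_code(200, http_like())` converts to `true` in a boolean context, yet `is_failure()` reports that it is not a failure.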

`status_code` can represent success codes, failure codes, and empty.
Empty is there to say "no code was set". Empty is neither a success, nor
a failure. It is there for code where a virtual function call to
determine success/failure is too expensive and where the domain is known
for a fact to only ever represent failure. You should only ever use
empty as a local optimisation, and never ever expose empty status codes
to code that you don't 100% control.
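
The three states can be sketched as below. This is a toy under my own assumptions, not the proposed class: `toy_status`, `toy_domain` and `signed_domain` are invented names, and representing empty as a null domain pointer is an assumption made for the sketch. The point is that success, failure and empty are three separate questions, and that no conversion to `bool` exists.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical domain: decides whether a raw value is a failure.
struct toy_domain
{
  bool (*is_failure)(int);
};

class toy_status
{
  const toy_domain *_domain = nullptr;  // null means "no code was set"
  int _value = 0;

public:
  toy_status() = default;
  toy_status(const toy_domain &d, int v)
      : _domain(&d)
      , _value(v)
  {
  }
  bool empty() const { return _domain == nullptr; }
  bool failure() const { return !empty() && _domain->is_failure(_value); }
  bool success() const { return !empty() && !_domain->is_failure(_value); }
};

// Neither implicit nor explicit conversion to bool is available.
static_assert(!std::is_convertible<toy_status, bool>::value, "no implicit bool conversion");
static_assert(!std::is_constructible<bool, toy_status>::value, "no explicit bool conversion");

// A made-up coding where negative numbers mean failure.
inline bool negative_is_failure(int v)
{
  return v < 0;
}
const toy_domain signed_domain{negative_is_failure};
```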

And that's basically it. Idiomatic usage is for extern functions to
return type erased status codes or to take one by lvalue reference. Code
local to the translation unit uses typed status codes, and lets any
implicit conversion into the erased form occur as needed.

I've copied and pasted relevant excerpts from the Readme.md below for
those not willing to click on a link. If you have any questions, please
do shout. My next step is to start work on preparing Outcome for entry
into Boost now a month of settling time has passed since the review.
Once the Jacksonville meeting has finished and I've done any
post-meeting correspondence, I'll be starting on my (currently five!)
papers for Rapperswil where I'll hopefully be attending my very first
WG21 meeting!

Niall

--- Readme.md (excerpt) --

Solves the problems for low latency/large code base users with
`<system_error>` as listed by [WG21 P0824](https://wg21.link/P0824).
This proposed `<system_error2>` library is EXPERIMENTAL and is subject
to change as the committee evolves the design. To fetch a drop-in
standalone single file implementation:

```
wget https://github.com/ned14/status-code/raw/develop/single-header/system_error2...
```

## Features:

- Portable to any C++ 11 compiler. These are known to work:
    - >= GCC 5 (due to requiring libstdc++ 5 for sufficient C++ 11
type traits)
    - >= clang 3.3 with a new enough libstdc++ (previous clangs don't
implement inheriting constructors)
    - >= Visual Studio 2015 (previous MSVC's don't implement
inheriting constructors)
- Aims to cause zero code generated by the compiler most of the time.
- Never calls `malloc()`.
- Header-only library friendly.
- Type safe yet with type erasure in public interfaces so it can scale
across huge codebases.
- Minimum compile time load, making it suitable for use in the global
headers of multi-million line codebases.

## Problems with `<system_error>` solved:

1. Does not cause `#include <string>`, which drags in the entire STL
allocator and algorithm machinery. That prevents use in freestanding
C++ and substantially impacts compile times, which can be a
showstopper for very large C++ projects. Only the following headers
are included:

    - `<atomic>` to reference count localised strings retrieved from the OS.
    - `<cassert>` to trap when misuse occurs.
    - `<cerrno>` for the generic POSIX error codes (`errno`) which is
required to define `errc`.
    - `<cstddef>` for the definition of `size_t` and other types.
    - `<cstring>` for the system call to fetch a localised string and C
string functions.
    - `<exception>` for the basic `std::exception` type so we can
optionally throw STL exceptions.
    - `<initializer_list>` so we can permit in-place construction.
    - `<new>` so we can perform placement new.
    - `<type_traits>` as we need to do some very limited metaprogramming.
    - `<utility>` if on C++ 17 or later for `std::in_place`.

    These may look like a lot, but in fact just including `<atomic>` on
libstdc++ actually brings in most of the others in any case, and a total
of 200Kb (8,000 lines) of text is included by `system_error2.hpp` on
libstdc++ 7. Compiling a file including `status_code.hpp` takes less
than 150 ms with clang 3.3 according to the `-ftime-report`
diagnostic (a completely empty file takes 5 ms).

2. Unlike `std::error_code` which was designed before `constexpr`, this
proposed implementation has all-`constexpr` construction and destruction
with as many operations as possible being trivial or literal, with only
those exact minimum operations which require runtime code generation
being non-trivial (note: requires C++ 14 for a complete implementation
of this).

3. This in turn means that we solve a long standing problem with
`std::error_category` in that it is not possible to define a safe custom
C++ 11 error category in a header only library where semantic
comparisons would randomly break depending on the direction of wind
blowing when the linker ran. This proposed design is 100% safe to use in
header only libraries.

4. `std::error_code`'s boolean conversion operator i.e. `if(ec) ...` has
become unfortunately ambiguous in real world C++ out there. Its correct
meaning is "if `ec` has a non-zero value". Unfortunately, much code out
in the wild uses it to mean "if `ec` is errored". This is incorrect,
though safe most of the time where `ec`'s category is well known i.e.
non-zero values are always an error. For unknown categories supplied by
third party code however, it is dangerous and leads to unpleasant,
hard-to-debug surprises.

    The `status_code` proposed here suffers from no such ambiguity. It
can be one of exactly three meanings: (i) success (ii) failure (iii)
empty (uninitialised). There is no boolean conversion operator, so users
must write out exactly what they mean e.g. `if(sc.success()) ...`,
`if(sc.failure()) ...`, `if(sc.empty()) ...`.

5. Relatedly, `status_code` can now represent successful (informational)
codes as well as failure codes. Unlike `std::error_code` where zero is
given special meaning, we impose no requirements at all on the choice of
coding. This permits safe usage of more complex C status coding such as
the NT kernel's `NTSTATUS`, which is a `LONG` whereby bits 31 and 30
determine which of four categories the status is (success,
informational, warning, error), or the very common case where negative
numbers mean failure and positive numbers mean success-with-information.

6. The relationship between `std::error_code` and `std::error_condition`
is confusing to many users reading code based on `<system_error>`,
specifically when is a comparison between codes *semantic* or *literal*?
`status_code` makes all comparisons *semantic*, **always**. If you want
a literal comparison, you can do one by hand by comparing domains and
values directly.

7. `std::error_code` enforced its value to always be an `int`. This is
problematic for coding systems which might use a `long` and implement
coding namespaces within the extended number of bits, or for end users
wishing to combine a code with a `void *` in order to transmit payload
or additional context. As a result, `status_code` is templated to its
domain, and the domain sets its type. A type erased edition of
`status_code<D>` is available as `status_code<void>`; for
obvious reasons this is non-copyable, non-movable and non-destructible.

    A more useful type erased edition is `status_code<erased<T>>`
which is available if `D::value_type` is trivially copyable, `T` is an
integral type, and `sizeof(T) >= sizeof(D::value_type)`. This lets you
use `status_code<erased<T>>` in all your public interfaces without
restrictions. As a pointer to the original category is retained, and
trivially copyable types may be legally copied by `memcpy()`, a type
erased status code works exactly as normal, except that publicly it does
not advertise its type.

8. `std::system_category` assumes that there is only one "system" error
coding, something mostly true on POSIX, but not elsewhere. This library
defines `system_code` as a type erased status code large
enough to carry any of the system error codings on the current platform.
This allows code to construct the precise error code for the system
failure in question, and return it type erased from the function.
Depending on the system call which failed, a function may therefore
return any one of many system code domains.

9. Too much `<system_error>` code written for POSIX uses
`std::generic_category` when it really means `std::system_category`,
because the two are interchangeable on POSIX. Further confusion stems
from `std::error_condition` also sharing the same coding and type. This
causes portability problems. This library's `generic_code` has a value
type of `errc` which is a strong enum. This prevents implicit confusion
with `posix_code`, whose value type is an `int` same as `errno` returns.
There is no distinction between codes and conditions in this library,
rather we treat `generic_code` as something special, because it
represents `errc`. The cleanup of these ambiguities in `<system_error>`
should result in users writing clearer code with fewer unintended
portability problems.
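
As an illustration of point 5 above, here is a small sketch of decoding the severity of an `NTSTATUS`-style value from bits 31 and 30. The names `nt_severity` and `severity_of()` are mine, not the library's, and the function takes the raw bits as `std::uint32_t` rather than `LONG` purely to keep the shift well defined.

```cpp
#include <cassert>
#include <cstdint>

// The four categories selected by the top two bits of the status value.
enum class nt_severity
{
  success = 0,        // bits 31..30 == 00
  informational = 1,  // bits 31..30 == 01
  warning = 2,        // bits 31..30 == 10
  error = 3           // bits 31..30 == 11
};

// Extract bits 31 and 30 and interpret them as a severity.
constexpr nt_severity severity_of(std::uint32_t status)
{
  return static_cast<nt_severity>((status >> 30) & 0x3u);
}

static_assert(severity_of(0x00000000u) == nt_severity::success, "zero is a success code");
```

A coding like this cannot be expressed with "zero is success, non-zero is failure", which is why the domain rather than the status code decides what counts as a failure.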

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

Niall Douglas