On 01/16/18 00:18, Niall Douglas via Boost wrote:
> That leaves the request to fix "if(ec) ..." which right now returns true if the value is 0, despite that much code writes "if(ec) ..." to mean "if error then ...". There is also the issue of error coding schemes not being able to have more than one success value, which usually must be 0. That's the remaining discussion point.
>
> Question: is this still considered too much overhead?
I think the test you presented is rather optimistic in that it consists of a single translation unit. In real applications, I think the following setups are more common:

- The error category is often implemented in a separate translation unit from the code that sets or tests for error codes of that category. This follows from the existing practice of declaring the category instance as a function-local static, where the function is defined in a separate TU.

- The code that sets the error code is often in a separate TU from the code that tests for errors. This follows from the typical separation between a library and its users.

Given the above, unless LTO is used, I think the compiler will most often not be able to optimize the virtual function call.

I've converted your code to a synthetic benchmark consisting of one header and two translation units: one with the test itself and the other defining the error category (the layout is sketched at the end of this message). The test still does not isolate the code that produces the error code from the code that analyzes it, so in that regard it is still a bit optimistic. I'm using gcc 7.2 and compiling the code with -O3. Here are the results on my Sandy Bridge CPU:

  Experimental test: 275565 usec, 362890788.017346 tests per second
  std test: 45767 usec, 2184980444.425023 tests per second

This is a 6x difference.

In the generated code I noticed that the compiler emitted a check of whether the virtual function `failure` is actually `experimental::error_category::failure`. If it is, the code uses an inlined version of this function; otherwise, the actual indirect call is performed. So if you comment out `code_category_impl::failure`, that check succeeds and the indirect call is avoided. Here are the results for this case:

  Experimental test: 71711 usec, 1394486201.559036 tests per second
  std test: 48177 usec, 2075679266.039812 tests per second

This is still a 1.5x difference.

Now, I admit that this synthetic benchmark focuses solely on the single check of the error code value. Real applications will likely have much more code intervening between the tests for error codes, so the real-world effect will be less pronounced. Still, I thought some estimate of the performance penalty would be useful, if only to show that this is not a zero-overhead change.

Do I think this overhead is significant enough to matter? Difficult to tell. Certainly I'm not happy about it, but I could probably live with the 1.5x overhead. However, it still results in code bloat, and there is no guarantee this optimization will be performed by the compiler (or that it will be effective if, e.g., my code always overrides `error_category::failure`).

The thing is, every bit of overhead makes me more and more likely to consider dropping `error_code` in favor of direct use of error codes. `error_code` already carries the additional baggage of the pointer to the error category.
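
P.S. For reference, here is roughly how the two-TU layout I described above is organized. This is a sketch rather than the actual benchmark sources: the file names and the `code_category()` helper are placeholders of my own, the iteration count is arbitrary, and only the standard `std::error_category` interface is spelled out; the experimental `failure()` override is only indicated in comments, since I don't want to reproduce its exact signature here.

  // error_code_benchmark.hpp -- shared header
  #include <system_error>

  // Returns the category instance that is defined in category.cpp.
  const std::error_category& code_category() noexcept;

  // category.cpp -- separate TU that defines the category
  #include "error_code_benchmark.hpp"
  #include <string>

  namespace {
  class code_category_impl final : public std::error_category {
  public:
      const char* name() const noexcept override { return "test"; }
      std::string message(int) const override { return "test"; }
      // In the experimental variant this class would additionally
      // override the proposed virtual failure() hook; that override
      // is what the optimizer cannot see from the other TU.
  };
  } // namespace

  const std::error_category& code_category() noexcept {
      static const code_category_impl instance;
      return instance;
  }

  // main.cpp -- separate TU with the timing loop
  #include "error_code_benchmark.hpp"
  #include <cstdio>

  int main() {
      unsigned long long failures = 0;
      for (unsigned int i = 0; i < 100000000u; ++i) {
          std::error_code ec(static_cast<int>(i & 1u), code_category());
          // For std::error_code this test is an inline value() != 0 check;
          // with the proposed semantics it would dispatch through the
          // category's virtual failure(), which, without LTO, cannot be
          // devirtualized here because the override lives in category.cpp.
          if (ec)
              ++failures;
      }
      std::printf("%llu failures\n", failures);
      return 0;
  }

The point of the separation is that the override of the virtual function is only visible while compiling category.cpp, so when main.cpp is compiled the indirect call in the loop cannot be inlined unless LTO is enabled.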