On 01/16/18 00:18, Niall Douglas via Boost wrote:
> That leaves the request to fix "if(ec) ..." which right now returns true if the value is 0, despite that much code writes "if(ec) ..." to mean "if error then ...". There is also the issue of error coding schemes not being able to have more than one success value, which usually must be 0. That's the remaining discussion point.
>
> Question: is this still considered too much overhead?
I think the test you presented is rather optimistic in that it consists of a single translation unit. In real applications, I think the following setups are more common:

- The error category is often implemented in a separate translation unit from the code that sets or tests for error codes of that category. This follows from the existing practice of declaring the category instance as a function-local static, where the function is defined in a separate TU.

- The code that sets the error code is often in a separate TU from the code that tests for errors. This follows from the typical separation between a library and its users.

Given the above, unless LTO is used, I think the compiler will most often not be able to optimize the virtual function call.

I've converted your code to a synthetic benchmark consisting of one header and two translation units: one with the test itself and the other defining the error category (the layout is sketched at the end of this message). The test still does not isolate the code that produces the error code from the code that analyzes it, so in that regard it is still a bit optimistic. I'm using gcc 7.2 and compiling the code with -O3. Here are the results on my Sandy Bridge CPU:

  Experimental test: 275565 usec, 362890788.017346 tests per second
  std test: 45767 usec, 2184980444.425023 tests per second

This is a 6x difference.

In the generated code I noticed that the compiler emitted a check of whether the virtual function `failure` is actually `experimental::error_category::failure`. If it is, the code uses an inlined version of this function; otherwise, the actual indirect call is performed. So if you comment out `code_category_impl::failure`, that check succeeds and the indirect call is avoided. Here are the results for this case:

  Experimental test: 71711 usec, 1394486201.559036 tests per second
  std test: 48177 usec, 2075679266.039812 tests per second

This is still a 1.5x difference.

Now, I admit that this synthetic benchmark focuses solely on the single check of the error code value. Real applications will likely have much more code intervening between the tests for error codes, so the real-world effect will be less pronounced. Still, I thought some estimate of the performance penalty would be useful, if only to show that this is not a zero-overhead change.

Do I think this overhead is significant enough to matter? Difficult to tell. Certainly I'm not happy about it, but I could probably live with the 1.5x overhead. However, it still results in code bloat, and there is no guarantee this optimization will be performed by the compiler (or that it will be effective if, e.g., my code always overrides `error_category::failure`).

The thing is, every bit of overhead makes me more and more likely to consider dropping `error_code` in favor of direct use of error codes. `error_code` already carries the additional baggage of the pointer to the error category.
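
P.S. For reference, here is roughly how the two-TU layout I described above is organized. This is a sketch rather than the actual benchmark sources: the file names and the `code_category()` helper are placeholders of my own, the iteration count is arbitrary, and only the standard `std::error_category` interface is spelled out; the experimental `failure()` override is only indicated in comments, since I don't want to reproduce its exact signature here.

  // error_code_benchmark.hpp -- shared header
  #include <system_error>

  // Returns the category instance that is defined in category.cpp.
  const std::error_category& code_category() noexcept;

  // category.cpp -- separate TU that defines the category
  #include "error_code_benchmark.hpp"
  #include <string>

  namespace {
  class code_category_impl final : public std::error_category {
  public:
      const char* name() const noexcept override { return "test"; }
      std::string message(int) const override { return "test"; }
      // In the experimental variant this class would additionally
      // override the proposed virtual failure() hook; that override
      // is what the optimizer cannot see from the other TU.
  };
  } // namespace

  const std::error_category& code_category() noexcept {
      static const code_category_impl instance;
      return instance;
  }

  // main.cpp -- separate TU with the timing loop
  #include "error_code_benchmark.hpp"
  #include <cstdio>

  int main() {
      unsigned long long failures = 0;
      for (unsigned int i = 0; i < 100000000u; ++i) {
          std::error_code ec(static_cast<int>(i & 1u), code_category());
          // For std::error_code this test is an inline value() != 0 check;
          // with the proposed semantics it would dispatch through the
          // category's virtual failure(), which, without LTO, cannot be
          // devirtualized here because the override lives in category.cpp.
          if (ec)
              ++failures;
      }
      std::printf("%llu failures\n", failures);
      return 0;
  }

The point of the separation is that the override of the virtual function is only visible while compiling category.cpp, so when main.cpp is compiled the indirect call in the loop cannot be inlined unless LTO is enabled.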