On 01/16/18 16:54, Peter Dimov via Boost wrote:
Andrey Semashev wrote:
In the generated code I noticed that the compiler emits a check of whether the virtual function `failure` is actually `experimental::error_category::failure`. If it is, the code uses an inlined version of this function (otherwise, the actual indirect call is performed). So if you comment out `code_category_impl::failure`, the check succeeds and the indirect call is avoided. Here are the results for this case:
Experimental test: 71711 usec, 1394486201.559036 tests per second
std test: 48177 usec, 2075679266.039812 tests per second
This is still a 1.5x difference.
That's pretty good. In practice, if the function does something nontrivial, this amount of overhead will be entirely lost in the noise.
Probably, although I think this number is a bit optimistic, because the branch predictor is trained by the loop and the indirect call is disfavored (it has been moved out of the loop). My worry is that we may not even achieve this number if the compiler is not capable of this optimization, or for some reason does not perform it. The worst-case overhead of 6x looks much more grim.