Re: [boost] Boost.Uuid and header-only support

5 Nov 2017

      On Sat, Nov 4, 2017 at 9:37 PM, Peter Dimov via Boost <boost@lists.boost.org
...
wrote:
...
James E. King, III wrote:
...
On Sat, Nov 4, 2017 at 8:53 PM, Peter Dimov via Boost <
boost@lists.boost.org wrote:
...
James E. King, III wrote:
...
I was wondering, does it even make sense to have the default RNG of >>
uuids::random_generator set to a PseudoRandomNumberGenerator for >>
boost::uuid?
No, in my opinion it doesn't. basic_random_generator has to be retained
for compatibility, but random_generator should just obtain random bytes >
directly. You're right that this is a breaking change though - a >
justified one, in my opinion.
Looks like November 1 was the deadline for making major changes for
1.66.0, which I assume would include breaking changes...
That's true, in principle.
The intent behind the rule is however that this refers to changes that
break other Boost libraries, not client code outside of Boost. Client code
typically sees the breaking change after the release, or at best after the
beta, not before. So breaking client code does not in general impede the
release process, whereas breaking other Boost libraries (or otherwise
introducing regressions into the testing and release procedures) most
certainly can and does.
But a permission from a release manager cannot hurt if you decide to
pursue this.
I did a short test where I use the old mt19937 based generator in a loop to
make 1 million uuids
and then I compared that to a new random_device implementation (both under
basic_random_generator using the default constructor), and the results
surprised me.  On
a Windows 10 laptop with a Xeon processor, using Visual Studio 2017 and
building release:

Benchmark Times reusing a generator (1M loops):
  old implementation:
 0.021822s wall, 0.031250s user + 0.000000s system = 0.031250s CPU (143.2%)
  new implementation:
 0.373160s wall, 0.375000s user + 0.000000s system = 0.375000s CPU (100.5%)

Benchmark Times using a new generator for each uuid (10K loops):
  old implementation:
 1.168479s wall, 1.171875s user + 0.000000s system = 1.171875s CPU (100.3%)
  new implementation:
 0.010272s wall, 0.015625s user + 0.000000s system = 0.015625s CPU (152.1%)

Asking the OS for 16 bytes of entropy at a time (Wincrypt, 1M loops):
 0.244231s wall, 0.234375s user + 0.000000s system = 0.234375s CPU (96.0%)

So the initialization of the mersenne twister is expensive, so if you do it
once and then
reuse it, it is much more efficient than going to Wincrypt.   If you make a
new generator for
each uuid, the opposite is true.  I expect folks will do the former, but
I've seen the latter
in some projects.

It looks like it is very expensive to get entropy through Wincrypt (and
BCrypt), whereas using
it only to seed a PRNG and then reusing the PRNG it is much more efficient
through the existing
code paths.  Therefore keeping the default type at mt19937 and using this
newer header-only
random_device will maintain performance and compatibility, but also fix the
problems in seed_rng
like how it would ignore errors, and will add UWP support.

Even asking Wincrypt for 16 bytes of entropy at a time is not efficient, so
rewriting the
implementation to be less generic won't improve performance.

So in short, I'll plan to keep the same implementation of random_device but
put in a new seed
mechanism for the PseudoRandomNumberGenerator.  I also fixed up the
template specialization
that was in seed_rng so that any UniformRandomNumberGenerator can be used.
Just
not sure this will go into 1.66.0 at this point... might be too big of a
structural change internally.

- Jim