Re: [boost] [xint] Boost.XInt formal review

7 Mar 2011

      Hello All,

There is my formal review of Boost.Xint.

                         DISCLAIMER
--------------------------------------------------------------------
  Char Nelson is the review manager of my Boost.Locale library
  that is going to be reviewed in April.

  So my judgment may be affected by this fact as I see Char Nelson
  exactly at same position that I would be very soon.
--------------------------------------------------------------------
...
Please explicitly state in your review whether the library should be accepted
YES.

However, I expect that algorithms performance would be improved ASAP.
...
- What is your  evaluation of the design?
Pros:
-----

Even I do not really like heavily "templatized" libraries
I like the overall interface design. Simple and quite straight forward.

After actually looking to the library deeply I think that having
COW and (default) Threadsafe options is brilliant idea.

It gives **very good** control over what you actually need allowing
you to use cow integers where performance is critical and normal
outside.

I found it very convenient, configurable and powerful.

All important operations are implemented, quite intuitive interface
(after reading examples... documentation notes would be added)

Cons/Problems:
--------------
Points I think that are wrong:

- Move Semantics:

  It is not clear, and not documented!

        boost::xint::integer a,b;
        a = 100;
        b = boost::move(a);  
        // expect like swap(a,b) acts like a=b
        std::cout << a<<" " <<b << std::endl;
        // expect 0 100, get 100, 100

  integer is container and should behave like container
  in move context and like like a primitive type - mostly
  for optimization.

- Copy between different types (bug of feautre?):

  integer<opt::secure> a;

  // works
  integer<opt::threadsafe> b;
  a=b; 

  // works
  integer<opt::threadsafe> b(a);

  // Does not!
  integer<opt::threadsafe> b = a;

  It seems that copy constructor explicit
  without any reason.
...
- What is your  evaluation of the implementation?
I can't judge the code as I do not so familiar with such complicated
meta-programming but I have following notes:

Style:
------

1. The coding is too dense. Complex code requires much more comments
   especially when it is not trivial.

2. There are lots of template classes and subclasses and other relations.
   I think internals should be documented otherwise in a year or
   two nobody would understand what is going on here.

   So good white paper about internals should be given.

Random number generation under Linux
-------------------------------------

It is just a bug... I had this one in my code once so, not a negative
point but just something small to fix:

Random Generator Under Linux/Unix

  rng("/dev/urandom", std::ios::binary)

  - By default lots of bytes are going to be
    read due to buffering - very inneficient

    Reading 32 bytes would actually read about 4k
    due to buffering it has huge impact on performance.

    - Set empty buffer to stream so it would not do caching
    - Use open(2), read(2) calls

  - No need ios::binary (but does not hurt) - under Unix all
    streams are same.

Performance:
------------

This is the weakest point of this library, the performance of too
many algorithms is very low, I compared it with GMP/gmpxx interface
and found following

Run Time    
-----------------------------------------------
op     xint        gmpxx         x?
----------------------------------------------
*      13.6 us     1.4 us        *9.7
/      52.8 us     1.6 us        *33
+      0.9 us      0.51 us       *1.76
powmod 165 ms      4 ms          *41
sqrt   520 us      2.3 us        *226

I can accept a difference of x2-x5 because GMP
is havily optimized (and written in C and I had
seen some code that C compiler optimizes much better
then C++).

Some algorithms should be rewritten with the best known
solutions.

Nobody expects them to be as fast as hadly optimized
tools written in assembly but they should be comparable
for big O notations, otherwise users would
prefer to wrap GMP with something else and use it.

Finally with all COW/Threasafty/Atomics the actual
algoriths were "forgotten".
...
- What is your  evaluation of the documentation?
Generally it was fine but not good enough.

Documentation mostly missed following parts:

1. It was not clear how to use options. Clear examples are
   required for each significant option.

2. It is not always clear what are the default options.

3. When dealing with cryptography, use of so-called secure
   memory is required, thus there must be a very clear
   example on how to use allocators with xint class.

   It is very unclear what should be done.

   I've got an answer by direct asking, but I think documentation
   requires explicit example of definition of new allocator
   with xint. For example based on: VirtualAlloc, and VirtualLock
   API.
...
- What is your  evaluation of the potential usefulness of the library?
Very, very useful.
...
- Did you try to use the library?  With what compiler?  Did you  have 
any
        problems?
I had written several small benchmarks against GMP, tryed some small examples
with gcc-4.3 and 4.5. Hadn't seen any specific issues.
...
- How much effort did you put into your evaluation? A glance? A quick
         reading? In-depth study?
I've spend several hours reading docs, interfaces and trying some code.
I've read parts I can actually understand as this level of template 
meta-programming
is quite too high for me.
...
-  Are you knowledgeable about the problem  domain?
I have some basic knowledge about cryptography familiar with cryptographic
algorithms and tools I don't have deep knowledge in big number arithmetics.

Additional notes and information:
---------------------------------

I compared it with GNU GMP gmpxx interface:

- xint is much easier to use
- xint is much easier to configure in security terms.
- xint has much more convenient interaces with gmpxx when talking about
  more comlex tasks.
- xint is has significantly lower performance in too many critical
  algorithms like sqrt and powmod - this should be fixed and difference
  goes to difference of order of magnitude.

Bottom Line and Vote:
---------------------

YES, it should be included in boost.

However I have a request (almost condition):
--------------------------------------------

First problem that should be solved are actually algorithms complexities,
I don't think it is something too hard. Reading few books
and papers would guide a strightforward direction to the solution.

Nobody cares about how beautiful template metaprogramming is if simple
wrapper of GMP outperforms it by order of magnitude!

It feels for me that most of the "Buzz" around Xint had deal
with various request from 1001 users that each one thinks that
it can be done better this way.

Finally it was good and created very good interface but, algorithms were 
forgotten
and when they slow "To COW or not to COW" does not really matter

Improve them! This is the most important part!

Why do I vote "Yes" even I think it is slow:

a) Interfaces are very good and useful, no more changes
   are really needed from user point of view.

b) The problem is solvable and is solely xint internal, so it
   should not prevent merging it into boost tree

c) I had seen much more critical issues in live boost libraries,
   so I think this issue is not showstopper for inclusion.

   (Anyway nobody reviews acutal algorithms implementations!)

Additional Recommendation:
--------------------------

- Change move semantics to same as containers have.
- Documentation should be improved with much more examples.

Artyom Beilis.