
On 3/3/2011 6:57 AM, Chad Nelson wrote:
On Wed, 02 Mar 2011 13:46:33 -0800 "Jeffrey Lee Hellrung, Jr."<jhellrung@ucla.edu> wrote:
1) I was hoping the bignum algorithms would be factored out and allowed to operate on, e.g., ranges of digits, rather than being coupled to the xint::integer_t data structure. I know this had been mentioned before.
And I thought I had answered it, by moving almost all of the functions out of the integer_t class. Apparently I misunderstood your intent, though it's something I'd have wanted to do eventually anyway.
What difference, from the point-of-view of the interface or from the user, does it make if the algorithms are implemented in the integer_t class scope or a detail namespace scope? It seems you misunderstood the desire for decoupling the algorithms from the data structure. See also Mathias' email regarding "genericity"; I agree with his evaluation in this regard and won't repeat it. Is this decoupling the "something [you]'d have wanted to do eventually anyway", or are you referring to something else?
Chad, what are your thoughts on this? I know the purpose of this library is to provide out-of-the-box bignum functionality with minimal hassle, but, to me, the algorithms should be part of the interface as well and decoupled from the xint::integer_t data structure.
Maybe I'm missing something... the only thing I see that that would accomplish, that the current design doesn't, is to allow people to operate on arbitrary slices of data. Is there really enough need for that to warrant it?
You can't assume that everyone will be satisfied with your integer_t data structure. Someone may want a data structure that uses the small-object optimization, because they have a lot of tiny integers running around and occasionally have large ones. Or they may want to use a std::deque(-like) underlying storage. I don't know. But I do know that your library will be strictly more usable if you decouple the algorithms from the integer_t data structure. I'm hesitant to vote for acceptance unless this decoupling were effected; your mileage with other boosters will obviously vary, but I gather Mathias is of the same opinion.
2) In a similar vein, I would like to see the cryptographic functionality decoupled from the xint::integer_t data structure as well, and released separately.
That's one of the most common reasons that people want to use a large-integer library. Separating those functions from the library would make it that much harder for potential users to get everything working.
Can you be more specific on *how* it would be "much harder", from the user's perspective?
Also, good things: I like the policy-based design, and I think it's the right direction. The Boost.Parameter arity thing may get onerous (see comment below), and one thing you could do to work around it is to just accept an mpl::map of tags, since I'm guessing you're not using Boost.Parameter's deduction functionality (awesome and powerful, but unnecessary in this case, I think).
I believe it *is* using deduced parameters. I don't recall the terminology, but that's what Dave Abrahams called it when I described the interface.
I'm referring to http://www.boost.org/doc/libs/1_46_0/libs/parameter/doc/html/index.html#dedu... and http://www.boost.org/doc/libs/1_46_0/libs/parameter/doc/html/reference.html#... In other words, are you specifying Boost.MPL predicates in your parameter specifications to deduce the binding of "bald" template parameters to tags/keywords?
On to more specific comments...
Documentation Comments
----------------------
* http://www.oakcircle.com/xint_docs/
"Why would I use it? ... Because it's fast."
Do you have any benchmarks to back this up? Fast relative to...?
Fast relative to pretty much all home-grown code. At worst, it'll be essentially the same speed as anything that doesn't use hand-written assembly language functions.
Pretty bold claim, okay.
"Why would I use it? ... Because it has the features you need."
Included in here is "[c]ryptographically-secure prime number generation". This seems a little too domain-specific to be included in this library. Is there any reason the prime number generation couldn't be factored out and, further, be bignum-implementation agnostic? Boost.Prime, for example?
Certainly it could, but again, cryptography is the major reason for a large-integer library. And prime-number generation is a very large part of it.
Yes, I get that; I just don't understand why they can't be decoupled and still work harmoniously together. Ideally, other bignum data structures should be usable with the cryptographic algorithms.
* http://www.oakcircle.com/xint_docs/namespaceboost_1_1xint.html
integer_t<...> square (const integer_t<...> n)
The complexity is just O(n^2). If you want to be more precise, don't use O(...); just state the number of additions/subtractions and multiplications.
Are you sure about that complexity? That function doesn't just multiply a number by itself, it uses an algorithm explicitly designed for squaring a number, which is more efficient. I think the complexity on that one is correct.
Ummm...yes, the stated complexity is correct, but O(n(n+1)/2) = O(n^2). So asymptotically speaking, it's not any faster than using the naive grade-school multiplication algorithm. Like I said, unless you want to give a precise operation count, it's just O(n^2).
integer_t<...> pow (const integer_t<...> n, const integer_t<...> e)
I suspect this complexity is, really, O( # of bits in the result ), as the last one or two multiplies are what dominate.
I'm pretty sure the complexity on that one is correct as well.
Yeah, I don't think I'm right, either, but let's analyze this... The asymptotic work in your loop in detail::pow is the same as the following loop (we'll assume that e >>= 1 is a constant-time operation):

    for(; e != 0; e >>= 1) p = square(p);

Let p have np bits. Each successive loop iteration *doubles* the number of bits in p, so the work for each successive iteration is np^2, (2np)^2, (4np)^2, ..., ((2^ne)*np)^2 (there are ne loop iterations, where ne is the number of bits in e). The total work is then

    (1 + 4 + 4^2 + ... + 4^ne) * np^2 = O(4^ne * np^2) = O(e^2 * np^2)

By my count, you're drastically underestimating the complexity ;) Let me know if you agree with the above analysis, I could be wrong.
I was also thinking about adding an overload where the power is specified as an unsigned long, but I don't know if it would actually be worth it...
It might not be too difficult to make that function take any numeric type. It's already a template function. But I suspect any speed improvement would be unnoticeable.
Likely.
integer_t<...> square_root (const integer_t<...> n)
You should mention that the algorithm is just Newton iteration with initial guess 2^floor(log2(n)/2) (where n is the input value).
What benefit would that add to the documentation? The average developer using the library won't care what method is used. If they're interested, can't they check the implementation source code to see?
There are many different algorithms for computing a square root; even end users will want to know which algorithm you're using, so they can use your library with confidence.
Also, your complexity claim implies a bounded number of Newton iterations, independent of the inputs, as each pass through the while loop is O(n^2) (where n is the # of bits) due to the division. That may be the case, and if so, you might want to back it up with a reference or something; but I suspect you should have a log(n) factor in there (where n is the # of bits), as Newton has "only" quadratic convergence.
That's one of the complexity values that I'm not sure of. I haven't been able to find any reference that pins down a value for it, so that was an educated guess.
Well, I stand by my observation here. Given that Newton has quadratic convergence, I would quote the complexity as O(log(nn)*nn^2), nn = log2(input).
I believe there is an O(n) square_root algorithm that would be asymptotically faster, but...I'd have to look it up...
That's the best one that I was able to come up with in the time I had. If you can point me to a good description of a more efficient one, I'll be happy to implement it.
I believe this is what I was thinking of, though the description is a bit terse: http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Digit-by-digi...
int is_prime (const integer_t<...> n, callback_t callback=no_callback)
Given the documentation, I think a more descriptive name, such as is_probably_prime_rabin_miller, would be better.
Maybe, but a lot more typing for an arguable benefit. The documentation specifies its limitations, I believe that's sufficient.
The "arguable" benefits are:
- the name is no longer misleading; is_prime doesn't actually check if its input is prime; and
- it is clear in what sense the input is determined to be probably prime.
Again, think about someone *reading* code that uses this is_prime function.
Also, I'm not sure how small "infinitesimally small" is... it seems the Wikipedia page only guarantees at most a 1/4^5 chance of a false positive (given the current implementation), though in practice it states the chance is much, much less.
Applied Cryptography, Second Edition, page 260 states that "for a 256-bit n, the odds of error in six tests are less than 1 in 2^51", and the odds get slimmer the larger the number is. (On the previous page, it says of 2^50 that "[...] if you consider that the odds of the number being composite are 300 million times less than the odds of winning top prize in a state lottery, you might not worry about it so much.")
It also recommends only five tests, which is what the library uses.
Thanks for clearing that up. I think such a reference should be added to the documentation.
Perhaps you should allow the iteration count (the "k" from the wikipedia article) to be passed as a parameter as well (possibly defaulted), for those that want more control on their primality test.
The documentation does say that if you want even greater certainty, you can run it through that function multiple times. A handful of runs through that should be more than sufficient for any purpose the library will conceivably be put to.
Agreed.
http://www.oakcircle.com/xint_docs/structboost_1_1xint_1_1options_1_1threads...
"Note that even with this option, it is still your responsibility to ensure that only one thread accesses a particular integer object at a time."
You seem to imply that it is not thread safe under any circumstances for multiple threads to even have *read* access to an xint::integer_t object. Is that the case, and if so, why?
Any number of threads can *read* an integer simultaneously. But if even one is *writing* to the integer, all bets are off -- no other thread can safely access it for any reason until the write is done. This is extensively detailed on the page titled "The threadsafe Template Parameter", off the index page.
Okay, then all I ask is that on the originally referenced page, it should be stated that multiple reader access is okay. That's what I had guessed, but not what one might infer from the way it is written.
http://www.oakcircle.com/xint_docs/classboost_1_1xint_1_1integer__t.html
template<...> class boost::xint::integer_t<>
"You can specify several template parameters for the type. Most of these parameters must be specified via helper classes defined in the boost::xint::options namespace. See the list and descriptions there."
So that covers "most" of the template parameters...what about the rest?
There's only one other, the custom allocator. It's mentioned in the description on that page as well, at the bottom.
Mentioning it here, too, would be helpful.
Also, it would be helpful to document here what the default settings are.
True, and thanks for pointing it out. Added to the to-do list.
Okay.
integer_t<...> & operator+= (const integer_t<...> b)
integer_t<...> & operator-= (const integer_t<...> b)
Why are these parameters passed by value?
This was mentioned in an earlier message. I didn't recall the reason then, but I've looked into it: it's partly due to Dave Abrahams's "Want Speed? Pass by Value." article (possibly a misunderstanding of it; I'll review that in the next few days to find out), and partly because the internal COW system makes copies very cheap -- I suspect, as cheap as pass-by-reference. Most of those pass-by-value functions should probably be changed, we'll see what I come up with.
I'm pretty sure a COW copy is never cheaper than a reference copy, and probably always less cheap (given the reference count that needs incrementing). Besides, COW isn't on by default...
int sign (bool signed_zero=false) const
I think you should separate the 2 different meanings of signedness into 2 separate functions. There are performance considerations, of course, but it's also difficult for the casual reader to infer what x.sign(true) is supposed to mean without looking it up. A second function called sign_with_signed_zero might be more appropriate.
It would be easy enough to add it, and it could occasionally help write self-documenting code, so I've put it on the list.
sign_with_signed_zero might be a little long, but I don't have any alternate names at the moment, and you get the idea.
* http://www.oakcircle.com/xint_docs/threadsafe.html
"When using copy_on_write, identical integer objects are allowed to share storage, which is more efficient for both CPU power and memory..."
I'm (still) not sold that copy-on-write gives any noticeable performance benefits in this library over move semantics except in contrived examples, so I think some kind of justification of the above claim is warranted.
We went through that, at exhaustive length, last year. The final results are here: <http://lists.boost.org/Archives/boost/2010/05/166036.php>.
Will review.
I'm also not a fan of the integer_t::integer_t(const integer_t& b, bool force_thread_safety) constructor; isn't there already an "ensure_unique" member function (in the implementation) that ensures an integer_t object has sole ownership of its storage? Indeed, looking more at some source files, I would guess this is _make_unique. I would consider such a member function preferable to this constructor overload, especially given how the mysterious boolean parameter looks in client code (see integer_t::sign comment above).
Please see the aforementioned "The threadsafe Template Parameter" page. That Boolean parameter will be used so rarely that I don't think it's unreasonable to ask the reader of the client code to look it up if he runs across it.
That page, or your explanation here, doesn't justify why you think this constructor overload is preferable to a make_unique member function. Sorry, "because that's the way it is now" doesn't count ;)
http://www.oakcircle.com/xint_docs/basic__types__and__includes_8hpp.html
"#define BOOST_PARAMETER_MAX_ARITY 6"
This is not guaranteed to actually have any effect. Think about if Boost.Parameter gets included before any xint header does. Better (and, yes, more onerous) to document that xint::integer_t uses Boost.Parameter and hence the user must set this macro to at least 6.
Good point. Added to the to-do list.
If BOOST_PARAMETER_MAX_ARITY is already defined, you can check if it is at least 6, and if not, #error with a helpful error message.
Miscellaneous Comments
----------------------
Why is digit_t in the xint::detail namespace, when it is expected to be part of the interface for purposes of supplying an allocator?
At the time I was writing the library, its definition hadn't been nailed down, so it was subject to change without notice. Now it should be stable, so I'll promote it to the xint namespace.
Okay.
How did you create your documentation? Your documentation generator may appreciate an acknowledgment in your documentation.
Ah, good catch. I didn't notice it was missing. I've re-added it for the next update.
Good.
Also, regarding the complexity notation, it would probably be best to state the complexity in terms of some other letter than n, so as not to confuse n with the input value. I know I did *not* do this in my comments above, as I was trying to follow your notation, but it easily gets confusing. Or just use log(n) instead of n to mean the # of bits or digits.
Probably so. Added to the list.
Okay.

- Jeff