
Boosters,

The formal review of the Boost.XInt library, by Chad Nelson, starts now and will run through March 11.

The documentation for the current version is available at: http://www.oakcircle.com/xint_docs/
The source code is available via Subversion at: http://svn.boost.org/svn/boost/sandbox/xint
or can be downloaded at: http://www.boostpro.com/vault/index.php?action=downloadfile&filename=xint.zip

Here are the important points:

1. All comments are appreciated. Even if you don't have the time for an in-depth study of everything, comments on parts of the library are welcome. In particular, if you already work with large integers in your applications and have specialized requirements, you might want to check directly how the proposed library handles them.

2. Reviews and all comments should be submitted to the developers list, boost@lists.boost.org, and the email should have the "[xint]" prefix in the subject to make sure it isn't missed. The deadline for reviews is 23:59 March 11 PST (2:59 March 12 EST, 7:59 March 12 UK time, 10:59 March 12 MSK).

3. Please explicitly state in your review whether the library should be accepted.

4. The general review checklist is provided below:
- What is your evaluation of the design?
- What is your evaluation of the implementation?
- What is your evaluation of the documentation?
- What is your evaluation of the potential usefulness of the library?
- Did you try to use the library? With what compiler? Did you have any problems?
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
- Are you knowledgeable about the problem domain?

Thanks,
Volodya

--
Vladimir Prus
Mentor Graphics
+7 (812) 677-68-40

One brief early comment, to help reviewers (and the author). The library has a small problem which prevents it compiling with clang (and possibly other compilers). I in no way view this as a serious issue, as getting code to work with many compilers is always a problem. On lines 85 and 89 of boost/xint/integer.hpp, remove the 'typename'.

After this fix, I briefly used the library and read the documentation. It is indeed a large-number library; I wrote some short programs and they worked well. I congratulate the author on making the easy uses of XInt easy. I am sure others will comment on the advanced features of the library, but I am very happy to finally have a library where I can just declare a boost::xint::integer and have all the normal operations work as I would expect, with no boilerplate code required. I also tried writing some incorrect code, and the error messages are clear and short, which is nice to see. I agree with the overall design decision chosen for thread-safety, and believe the system correctly balances clarity and efficiency. From my brief review as a user, I believe this library should be accepted.

One serious (but small) issue will have to be addressed before the library is accepted: the 'secure' flag, as it stands, cannot be trusted to work. Compilers can, and do, optimise out memset if they can prove the memory will not be read again. There are various solutions to this; I would probably suggest putting a "secure memset" into boost/detail so it can be used by other libraries, and specialised for different systems. See Microsoft's discussion of the issue here: http://msdn.microsoft.com/en-us/library/ms972826.aspx The short answer is: on Windows, wrap the memset in #pragma optimize("", off) / #pragma optimize("", on); on Linux, write a "stupid" memset which uses volatile pointers to force the writes to actually be performed. I am happy to help clean this up, and produce a general solution, if required.
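As a rough sketch of the volatile-pointer approach described above -- the helper name is hypothetical, not something XInt or Boost currently provides:

    #include <cstddef>

    // Hypothetical helper: zero a buffer through a volatile pointer so the
    // compiler cannot prove the stores are dead and optimise them away.
    inline void secure_zero_memory(void* data, std::size_t length)
    {
        volatile unsigned char* p = static_cast<volatile unsigned char*>(data);
        while (length--)
            *p++ = 0;
    }

On Windows, an alternative is the SecureZeroMemory call, which exists for exactly this purpose, or bracketing an ordinary memset with the #pragma optimize directives mentioned above.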

On Wed, 2 Mar 2011 14:16:38 +0000 Christopher Jefferson <chris@bubblescope.net> wrote:
Thanks. I've tested this locally, and it caused no problems with MSVC or GCC, so it'll be in the next update.
Ouch. :-( Thanks for bringing that to my attention, I (obviously) wasn't aware of it.
I am happy to help clean this up, and produce a general solution, if required.
I'd happily accept a patch that corrects the problem. If you'd prefer not to, I'll run the fixed code by you before releasing it. -- Chad Nelson Oak Circle Software, Inc. * * *

On Wed, Mar 2, 2011 at 06:16, Christopher Jefferson <chris@bubblescope.net> wrote:
I'm not convinced that either of those answers is correct, since neither prevents the OS from swapping the memory to disk while it contains secret data. To me, it seems that Boost isn't the place for anything that claims to be "secure", since the community is insufficiently skilled in interpretive dance: see <http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html>, or specifically <http://2.bp.blogspot.com/_Zfbv3mHcYrc/Sre5JqBKZyI/AAAAAAAABn8/Op-n-e0JVaA/s1600-h/aes_act_3_scene_02_agreement_1100.png> :) ~ Scott

On 03/02/2011 07:31 PM, Scott McMurray wrote:
(Or your cloud provider from migrating your whole OS image across a network.)
+1 There are some not-entirely-unheard-of operating systems that emit detectable patterns from /dev/random. Libraries like OpenSSL dedicate large amounts of code to secure random generation for this sort of reason. But they're still vulnerable to a Debian maintainer changing something he doesn't understand. The RSA example is a great way to demonstrate bigint libraries - and a terrible thing to actually use it for. I suggest any wording suggesting "cryptographically secure" be avoided. Even dedicated-purpose cryptographic libraries written and maintained by experts are still weeding out the tiny bugs and timing and cache side-channel attacks years later. - Marsh

On Wed, 02 Mar 2011 23:50:16 -0600 Marsh Ray <marsh@extendedsubset.com> wrote:
[...] The RSA example is a great way to demonstrate bigint libraries - and a terrible thing to actually use it for.
Maybe, maybe not. A lot of viable uses for public-key encryption don't require government-level security.
That's why the XInt-provided convenience class is called strong_random_generator, not secure_random_generator. :-) It's simply an interface to the OS-provided generator, which is supposed to be cryptographically secure. I've added additional notes in a couple prominent places in the documentation for the next release, explicitly pointing out that its cryptographic security depends on that of the underlying generator. -- Chad Nelson Oak Circle Software, Inc. * * *

On Wed, 2 Mar 2011 17:31:07 -0800 Scott McMurray <me22.ca+boost@gmail.com> wrote:
A known and documented problem: <http://www.oakcircle.com/xint_docs/structboost_1_1xint_1_1options_1_1secure.html>
Perhaps an alternate name for that option, then. One that wouldn't be too much longer or too many words, but also wouldn't be misinterpreted as providing true security... perhaps more_secure? It requires a little less typing, and is less frightening, than less_insecure. ;-) -- Chad Nelson Oak Circle Software, Inc. * * *

From: boost-bounces@lists.boost.org wrote:
How about naming it for what it does: overwrite_on_destruction or similar. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

On Thu, 3 Mar 2011 10:23:55 -0500 "Stewart, Robert" <Robert.Stewart@sig.com> wrote:
How about naming it for what it does: overwrite_on_destruction or similar.
Hm... longer, but *much* more descriptive. I like it. :-) -- Chad Nelson Oak Circle Software, Inc. * * *

On 3 Mar 2011, at 15:14, Chad Nelson wrote:
Yes, of course there are other problems, such as timing attacks ( http://en.wikipedia.org/wiki/Timing_attack ) , which xint does not, and should not, try to stop. Chris

On 02/03/2011 11:58, Vladimir Prus wrote:
This is not a review, just a few comments from going through the docs quickly.

- XInt uses COW and passes everything by value. COW is useless in the presence of move semantics (which the library seems to partially implement), and passing by value everywhere unless you mean to copy is a bad idea. Taking by value can be useful to implement operator+ in terms of operator+=, but the library doesn't seem to do that. Also, it seems that the object always contains the copy count and a readonly boolean, which total 2 words, even if COW is disabled.

- From what I can see, the statement xint::integer b = a0 + a1 + ... + aN; requires N copies of the underlying memory buffer, even though it is possible to do zero copies without even using expression templates.

- Unary operator+, operator++ and operator-- return a reference to the object. I don't think that's a good idea, since regular integers don't behave like this.

- I would quite like to get an interface to iterate the integer digit per digit or byte per byte.

- XInt cannot do SBO.

- Fixed-width integers still store the size etc. Ideally, xint::integer<fixed_size<32> > should have no more overhead than int32_t.

On Wed, 02 Mar 2011 18:18:33 +0100 Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
Thank you for your comments. Boost.Move wasn't an accepted Boost library at the time I wrote that code, so I didn't feel that I could rely on it. As such, it's only there as an option at present. As it turns out, it's *still* not an official Boost library (through no fault of the library or its author), so I think my caution is justified. I don't recall the reason for passing by value offhand, but I believe Boost.Move was part of it. I'll research that before the next update.
[...] Also, it seems that the object always contains the copy count and a readonly boolean, which total 2 words, even if COW is disabled.
Eliminating sixteen bits of overhead, on an integer that is almost always going to be larger than 64 bits (and often *much* larger), seems to be a premature optimization at best.
It could be optimized for such statements, yes. Can you provide a real-world case where more than two or three integers would be summed like that, and that is common enough to justify the extra work and code required for it?
<http://stackoverflow.com/questions/465517/overloaded-increments-return-value>
- I would quite like to get an interface to iterate the integer digit per digit or byte per byte.
Why? If you need it for exchanging information with another system or another program, the to_binary function should do the trick. If you're looking to make changes to the internals of an integer_t without going through its interface, it's not going to happen.
- XInt cannot do SBO
A large-integer math library would be able to take advantage of SBO only rarely, and every number object would have to carry around the overhead, at least in all the SBO designs I've ever seen. If you were complaining about sixteen bits of sometimes-unneeded overhead, I'd hate to see how you reacted to, for instance, 128.
If the design were solely or even primarily fixed-width, I'd agree. As it's first and foremost a variable-width system, and the fixed-width behavior is secondary, I believe writing and maintaining code specifically to eliminate a few extra bits for fixed-width integers would not be justified. -- Chad Nelson Oak Circle Software, Inc. * * *

On 03/02/2011 05:28 PM, Chad Nelson wrote:
Boost.Move is an accepted library and is expected to be in the 1.47 release. <http://lists.boost.org/boost-announce/2011/02/0283.php> -- Michael Caisse Object Modeling Designs www.objectmodelingdesigns.com

On Wed, 02 Mar 2011 17:37:02 -0800 Michael Caisse <boost@objectmodelingdesigns.com> wrote:
Yes, I've been following its progress. Once there's a final version in an official Boost release, I'll see about using it more prominently in XInt too. -- Chad Nelson Oak Circle Software, Inc. * * *

On Wed, Mar 2, 2011 at 8:28 PM, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
Major compiler vendors have already been shipping rvalue reference enabled compilers for a year or more. So any library that doesn't take advantage of move semantics is already starting to look dated. And any library that wants to make the jump from Boost to TR2 or the C++ standard itself is going to have to use all C++0x features that seem appropriate. So please don't let your design be dictated by Boost.Move's status. --Beman

On 03/03/2011 02:28, Chad Nelson wrote:
I don't recall the reason for passing by value offhand, but I believe Boost.Move was part of it. I'll research that before the next update.
You quote Dave's "Want Speed? Pass by value" article as motivation. Yet it feels like the conclusion of that article, which can be summed up as follows, wasn't understood: if you intend to make a copy of an argument on automatic storage, prefer letting the compiler do that by passing by value, rather than passing by const-reference and doing the copy yourself.
Consider a language like Python. All integers can be arbitrarily large, yet in practice, they rarely are. But then, that's one application. Can your library allow me to choose a memory layout optimal for the application of my choosing? Probably not. How about you let me choose my own layout and let me interface with your algorithms?
It's just a degenerate example that's easy to play with, since it involves a single function, N+1 variables, and does N-1 unneeded copies. And yes, people don't simply do binary operations and store them into variables each time; they *combine* operations and create temporaries.
and is common enough to justify the extra work and code required for it?
There is no extra code required; on the contrary, it requires *less code* than your current implementation. As I said in a piece you didn't quote, if you implement operator+ in terms of operator+= using NRVO or move semantics, you can get no copying going on. Now if you want to make it work with expression templates, that would be even cooler. This would solve certain issues with the design of algorithms that work with arbitrary types too.
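A minimal sketch of the pattern being described, with bigint standing in for any movable, heap-backed integer type; this is not XInt's actual interface:

    // Sketch only: operator+ implemented in terms of operator+=, taking the
    // left operand by value. An rvalue argument (such as the temporary from a
    // previous '+') is moved into 'lhs' rather than deep-copied, so a chain
    // like a0 + a1 + ... + aN needs roughly one buffer instead of N.
    bigint operator+(bigint lhs, const bigint& rhs)
    {
        lhs += rhs;    // do the work in place on the by-value parameter
        return lhs;    // treated as an rvalue on return in C++0x; with
                       // Boost.Move emulation an explicit move is needed
    }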
My bad, ++i with i an int does return an lvalue and not an rvalue.
- So that the library is easily extendable
- So that the algorithms only require a range of digits as input if that's all they need, allowing them to work with arbitrary types of my choosing without having to copy or convert to your types

The main drawback of the library I can see for now is a clear lack of genericity. For me to like it, it needs concepts, generic algorithms, and the ability to interwork with other types through retroactive modeling. There is too much coupling between the algorithms and the data, since I have to use your data container. That forces me to use some specific layout that may not be good for my use case, and to copy over the data multiple times when converting from/to your types even though that might be entirely unnecessary. As far as I can see (I don't know too much about the algorithms behind that kind of library), any range could be treated as a big integer, and all your algorithms are intrinsically capable of dealing with all of them.
If it's not extendable, nor generic, nor interoperable, what does it bring that other existing bigint libraries do not? The boost logo and license? There are big players in the field, and if Boost accepts a library that does something like that, I think it should have significant advantages over the competition. Imagine something similar to the mpn component of the GMP library, but with a generic and easy-to-use C++ interface. Through genericity, you could make the algorithms accept ranges of integers and have them work regardless of digit size, with the digit chosen at the discretion of the user. If the user provides a SIMD type as a digit, you'll get optimized SIMD code for free, for example, in a completely externalized way. Now *that* would be awesome. Something that doesn't seem much better than the high-level C++ interface provided for the mpz component of GMP? Not so much.
"sometimes-unneeded"? COW is disabled by default, and should never be used anyway, since it is a hack that only makes sense in a language without move semantics. And this is mostly a problem due to bad code design, there is no reason why you couldn't be rid of those two words in the non-COW case, and why you couldn't be rid of the buffer size in the fixed-width case. See, if data and algorithms were clearly separated, it would be much easier. Using SBO can make sense, but making objects unnecessarily large with data that is never used is just a bug.
Arbitrary-sized yet fixed integers and big integers have enough code in common that it would be a shame to design a modern library without thinking of how both can play into its algorithms. Something else: I went through the code a bit, and I saw virtual inheritance. Just why do you need dynamic dispatch in a design where all information is purely static?

On Thu, 03 Mar 2011 14:46:41 +0100 Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
Yes, I rediscovered that in the documentation last night, and I plan to reread the article to see whether it really applies or not, or if there are other reasons for it too.
Not the same case at all. XInt isn't a language, it's a library that's explicitly designed for very large numbers. If someone didn't need numbers larger than the built-in types can handle, they wouldn't use it.
If you want that sort of flexibility, you'd probably be better off with something like GMP. XInt isn't designed for that, and in my opinion, shouldn't be.
What would you want to do to the internals of it that it doesn't already provide, and that I wouldn't be willing or even eager to add?
If it exposed the internal details of a number, then I'd never be able to change the way it works without breaking existing client code. That would be a much worse design.
Yes, it would. But it would come at a cost.
Disabled by default on *external* numbers. It's still used internally, where it's completely safe and much faster to do so.
and should never be used anyway, since it is a hack that only makes sense in a language without move semantics.
Which C++ still is, until C++0x is official and built into almost every platform's compilers. Which won't happen for a number of years yet.
Your opinion is noted.
For the policy-based options. -- Chad Nelson Oak Circle Software, Inc. * * *

On 03/03/2011 17:10, Chad Nelson wrote:
So you're already limiting the application area of XInt. The problem is that the application area you're restricting it to doesn't seem to be the one that is the most likely to require arithmetic with infinite precision.
GMP doesn't allow what I want at all. It can't really, since it's written in C.
There are hundreds of functions that you can use with integers, each with their own algorithms documented in various research papers. Do you really claim that your library can provide all of them? Say I want to implement the gammaln function (gammaln(x) = ln(gamma(x))). Assuming you did implement good ln and gamma functions (which you do not; all you have is an O(n^3) factorial), it is still interesting to compute gammaln directly because it uses less memory. And before you wonder what the point of that function is, know that it is typically used to compute binomial coefficients (and combinatorics are certainly big users of big integers). Now, cool stuff: if you were using expression templates, you could recognize that when doing ln(gamma(x)) you should really call gammaln(x). Clearly, this is becoming too big a thing for a monolithic library to handle. That's why you need to be generic and modular, and separate concerns as much as possible.
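A small illustration of the point, using plain doubles and std::lgamma from <cmath> (C++11, or lgamma from C99's <math.h>) rather than any XInt type:

    #include <cmath>

    // Computing ln(gamma(x)) directly never materialises the astronomically
    // large value of gamma(x) itself. Binomial coefficients follow at once:
    //   C(n, k) = exp( lgamma(n+1) - lgamma(k+1) - lgamma(n-k+1) )
    double log_binomial(double n, double k)
    {
        return std::lgamma(n + 1.0) - std::lgamma(k + 1.0) - std::lgamma(n - k + 1.0);
    }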
You need to come up with a concept that can be used to interface the data and the algorithms. Now that you've written the algorithms, it's easy: just look at their requirements, and define your concept from them. See <http://www.generic-programming.org/about/intro/lifting.php>
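A rough sketch of what such a lifted primitive might look like -- an addition written against iterator ranges of digits rather than a concrete integer class. The names and conventions are invented for the example, and it assumes <cstdint> (or boost/cstdint.hpp) for the fixed-width types:

    #include <cstdint>

    // Add two little-endian digit ranges of equal length into an output range,
    // returning the final carry. Anything that yields 32-bit digits models the
    // "digit range" concept; nothing depends on a particular integer class.
    template <typename InIt1, typename InIt2, typename OutIt>
    std::uint32_t add_digits(InIt1 first1, InIt1 last1, InIt2 first2, OutIt out)
    {
        std::uint64_t carry = 0;
        for (; first1 != last1; ++first1, ++first2, ++out) {
            std::uint64_t sum = std::uint64_t(*first1) + *first2 + carry;
            *out = static_cast<std::uint32_t>(sum);   // keep the low 32 bits
            carry = sum >> 32;                        // propagate the rest
        }
        return static_cast<std::uint32_t>(carry);
    }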
The design I am proposing would allow you to "outsource" the low-level optimizations and concentrate on the algorithms. Do you not see the value in this?
Disabled by default on *external* numbers. It's still used internally, where it's completely safe and much faster to do so.
Why would you need to unnecessarily copy around internal temporary numbers?
Which C++ still is, until C++0x is official and built into almost every platform's compilers. Which won't happen for a number of years yet.
Emulation of move semantics in C++03 with Boost.Move is functional. Sure, it came a bit late in the project, so I can understand why you only did simplistic support without delving too much into it. Also, using expression templates, you could obtain speed improvements similar to those that move semantics can bring. Speaking of expression templates, I know that a bigint implementation made at STMicro (for use in embedded environments) uses expression templates to compute the worst-case size needed for an expression so that it can preallocate the result buffer accordingly, so this could also have other interesting uses.
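A bare-bones sketch of that preallocation trick -- an addition node that can report a worst-case digit count before evaluation. All names here are hypothetical:

    #include <algorithm>
    #include <cstddef>

    // Hypothetical expression node for 'a + b'. Each operand knows how many
    // digits it occupies, so the assignment operator of the real integer type
    // can allocate the destination buffer once, before evaluating the tree.
    template <typename L, typename R>
    struct add_expr {
        const L& lhs;
        const R& rhs;

        std::size_t max_digits() const
        {
            // a + b needs at most max(|a|, |b|) + 1 digits
            return std::max(lhs.max_digits(), rhs.max_digits()) + 1;
        }
    };

    // A real library would constrain this overload to its own types.
    template <typename L, typename R>
    add_expr<L, R> operator+(const L& lhs, const R& rhs)
    {
        add_expr<L, R> node = { lhs, rhs };
        return node;
    }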
I don't see how that requires virtual inheritance.

On Thu, 03 Mar 2011 21:28:35 +0100 Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
That makes no sense. It's a large-integer library. How does designing it specifically for large integers make it unsuitable for its purpose?
Do they require specialized bit-twiddling? Or are they arithmetic operations that can be described in terms of functions that the library already provides?
Certainly, but it doesn't outweigh the cost.
Yes, it's always much easier to attack something than to build it in the first place. -- Chad Nelson Oak Circle Software, Inc. * * *

On 04/03/2011 08:39, Chad Nelson wrote:
It's not tailored to particularly large integers (well, in truth, using it with very large integers would be catastrophic, because the algorithms don't have good complexities). So it's a library for general-purpose integers that can be large up to a point. If you're already alienating the cases where the integers are often small and rarely big, there is not much of an applicability range left for your library. Shall it be renamed to "the monolithic thread-safe COW library for integers between 100 and 1000 digits"? What's the applicability range of that?
They're often defined in terms of digits, yes.
Well, in a discrete world, you can implement everything on top of the basic arithmetic operations. But that wouldn't be efficient. I've already given you a very simple example that demonstrates how it can be best to implement a composition of two functions on its own rather than evaluate the functions one after the other. Even for multiplication, there are a lot of different algorithms. The best algorithm to use depends on the situation; a fast Fourier transform is probably the best when dealing with large values, for example. It seems, however, that your library is stuck with the naive O(n^2) implementation. A new multiplication algorithm was created as recently as 2007, so you could want to add new algorithms to your library at any time. Extension is important. Take a look at Boost.Graph; several contributors add new algorithms to it every so often. But if it's not extensible, that cannot happen. On a side note, every algorithm that depends on multiplication should take a functor that specifies which multiplication algorithm to use (with a suitable default). This way, I can provide my multiplication that uses an optimized FFT routine to your algorithms, without requiring your library to contain all possible optimized implementations of multiplication.
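A sketch of that customization point -- an algorithm parameterized on its multiplication routine, with a naive default. All names are invented for the example:

    // Default policy: whatever multiplication the integer type already provides.
    struct default_multiply {
        template <typename Int>
        Int operator()(const Int& a, const Int& b) const { return a * b; }
    };

    // Square-and-multiply exponentiation that delegates every product to the
    // supplied functor, so a caller can plug in an FFT-based routine.
    template <typename Int, typename Multiply>
    Int power(Int base, unsigned exponent, Multiply mul)
    {
        Int result(1);
        while (exponent) {
            if (exponent & 1u)
                result = mul(result, base);
            base = mul(base, base);
            exponent >>= 1;
        }
        return result;
    }

    template <typename Int>
    Int power(const Int& base, unsigned exponent)
    {
        return power(base, exponent, default_multiply());
    }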
Certainly, but it doesn't outweigh the cost.
What cost? The cost of reorganizing your library entirely? I thought we agreed you'd have to do that anyway.
This is a review, not a personal attack. I'm asking you why you need to do a specific thing, since I see no reason to, and you reply with that kind of comment? Am I to understand you don't wish to continue a discussion with someone who is obviously going to give you a "no" vote? I have built libraries that used so-called policy-based designs, and never required virtual inheritance. So I honestly see no reason why you are doing that, and would welcome being enlightened, or the chance to explain to you how you can circumvent the problems that required you to use it, if possible. Likewise, I would also warmly welcome answers to several questions I have asked you that you have ignored. What are the advantages of your library over the myriad of existing ones? The reason I'm asking is that, from my possibly narrow view, you do not compare favourably to them. I would gladly be corrected on this topic.

On Sat, 05 Mar 2011 15:53:30 +0100 Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
It can easily handle the range of values that are used in public-key encryption, which exceed the number of atoms in the universe by many orders of magnitude. If you need far larger values than that, you probably want GMP.
Only for the present. Which is something I've mentioned several times.
That complicates the interface significantly, and would be completely unnecessary once the library matures enough to have faster multiplication algorithms.
The cost of freezing the internals at a 1.0 level, or of breaking client code when the interface has to change. Which is something else I've stated several times, which leads me to believe that you're either ignoring any part of the conversation that doesn't focus explicitly on your opinions, or that you're ignoring everything else in an attempt to intimidate me into redesigning the library to suit you.
I don't know what culture you grew up in, but mocking someone's words, questioning his competence, and finding any excuse to belittle his work is incredibly offensive in mine. And that's the content of at least half of your messages to me, when you're not simply expressing the unbelievably arrogant and self-centered assumption that you know the best way to do something that you've never attempted, but that I've put a hell of a lot of work into developing, and that anything I've done differently is automatically wrong. On the small chance that you really don't know how insulting your messages are, here's some advice: if you want your opinions to be taken seriously by anyone with a spine, try a little respect for the intelligence of the person on the other side of the conversation, and assume that he's competent and does things for a reason until proven otherwise, even if that reason isn't perfect. It will make people a lot more receptive to your ideas.
At the moment, I don't think you'd give it a no-vote, because I don't think you'd bother actually reviewing the library unless it's out of spite.
The only competing library that matters, so far as I can see, is GMP. And GMP has licensing issues, which is the main reason why I started writing XInt in the first place -- I didn't originally write it for Boost, I wrote it for use in one of my own commercial projects. That's sufficient reason right there, but if you want more: there is no large-integer library with even a half-decent C++ interface, so far as I've seen. I *live* in C++, and I passionately detest working with poorly-designed programming interfaces. The old saw is "if you want something done right, do it yourself," so I did. It's a sentiment I've been sorely tempted to recommend to you as well, on a number of recent occasions. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
Chad has posted a number of outbursts like this in the last few days. It seems to me that they are unjustified: this particular response was, according to how Chad himself has quoted the thread, in response to the comment "I don't see how that requires virtual inheritance". I don't think anyone has yet agreed with Chad that virtual inheritance is the right solution here; it certainly looks mighty odd to me. Yet that criticism sends Chad off the rails. We can't go on like this. I propose that the review be stopped now. There should already be enough input for the Review Manager to make a decision. If we carry on, this list will turn into the sort of "flame fest" that we all know happens elsewhere, but not normally in Boost. Regards, Phil.

On Mar 5, 2011, at 12:47 PM, Phil Endecott wrote:
We can't go on like this. I propose that the review be stopped now. There should already be enough input for the Review Manager to make a decision. If we carry on, this list will turn into the sort of "flame fest" that we all know happens elsewhere, but not normally in Boost.
I disagree. There is a personal quarrel between Chad and Mathias that carried over from the Unicode string discussion, which they need to resolve. Otherwise, I think the review has been civil and should continue. In particular, I don't think the debate between a fully encapsulated class and a division between algorithms and data has been fully argued by both sides. IMO (not being much of a math guy but big on abstractions) this is the most important issue. Cheers, Gordon

On Sat, 05 Mar 2011 17:47:09 +0000 "Phil Endecott" <spam_from_boost_dev@chezphil.org> wrote:
Most of them probably are, which is inexcusable, and I offer a public apology to everyone on this list for those. I hadn't realized how emotionally attached I was to this library until I looked at my reactions over the past few days. A few of them, however, are entirely justified. Most especially the one you quoted. Maybe Mathias *doesn't* realize how offensive his messages seem. If so, telling him gives us a chance to clear the air, and tells him why other people might react badly to him too. Ignoring it any longer would be a disservice to both of us.
That *person* sends me off the rails, for exactly the reasons you quoted. I realize that virtual inheritance is not the right solution, and CRTP has been suggested by two people (Gordon Woodhull, in a message I haven't had a chance to answer yet, and someone off-list who might wish to remain nameless) as an alternative, which I plan to use.
Although ending it now would be favorable to the library (I record four reviews so far, three of which have been positive), I would prefer to let it continue. I'm getting a lot of feedback that will improve the library. Rest assured, if we can't work it out soon, I'll have my e-mail program delete any message from him unread... rude, but an effective way to prevent any further escalation. -- Chad Nelson Oak Circle Software, Inc. * * *

On 05/03/2011 18:47, Phil Endecott wrote:
Chad has posted a number of outbursts like this in the last few days.
I am partially responsible for this. I used too many superlatives because I wanted to confront Chad with the problems in his library -- or rather what I consider to be problems.
Surely that's not really needed. I will not be taking part in this discussion any more, to avoid it degenerating further.

2011/3/5 Phil Endecott <spam_from_boost_dev@chezphil.org>:
Chad is having a hard time in this review. It is not always easy to digest loads of criticism and not take things personally. As far as I can see, Chad is very passionate about contributing and willing to learn and to admit shortcomings in his project. I've seen other people derail under "review pressure" on this list. We're all human, so please have mercy ;-) Joachim -- Interval Container Library [Boost.Icl] http://www.joachim-faulhaber.de

On Sun, 6 Mar 2011 01:58:16 +0100 Joachim Faulhaber <afojgo@googlemail.com> wrote:
Chad is having a hard time in this review. [...] We're all human, so please have mercy ;-)
Chad, if you'll pardon the third-person reference, is learning just how human he is. A needed lesson, but like most such lessons, not all that pleasant for the learner. Nothing like making a fool of yourself on a world-wide mailing list. Again. :-( As a silver lining, at least I'm getting practice at apologizing gracefully. And at trying to minimize the future need to do so. Thanks for the plea for clemency. -- Chad Nelson Oak Circle Software, Inc. * * *

Joachim Faulhaber wrote:
While Chad has apologized for such outbursts, almost profusely so, I do think it important to raise a point that's been troubling me for some time regarding Chad's participation on the list, not just during this review. Chad's default reaction to criticism has too frequently been to lash out, rather contrary to the advice he gives above. Consider even the tone of his response quoted above. He assumed that Mathias was mocking him, questioning his competence, and finding excuses to belittle XInt. I didn't see that in Mathias' posts. My impression is that Chad thinks his months of hard work and rewriting mean that any criticism is an affront to his superior design and implementation skills. During the development of XInt, however, Chad learned a great many things about modern C++ from members of this list. He has learned about a number of new (to him) techniques and tried to incorporate them into XInt. He has very recently been introduced to CRTP, to expression templates, to techniques for minimizing allocations, and to his partial misapprehension of virtual inheritance. I expected a humbler reaction to the design criticisms, given the number of things those on this list obviously know about modern C++. (I don't mean that I expect him to just accept everything others throw at him all while eating humble pie, of course.) This same problem arose in response to my review comments regarding the documentation. Because he considered what he had done to be superior to many libraries, including several existing Boost libraries, my criticisms were mostly, as I took his response, mere matters of style. My intention was not that he would, of necessity, accept my suggestions unaltered, but rather that he recognize that if one person found concerns, points of confusion, or omissions, then the documentation could and should be improved to save others the same trouble.
I think we've observed at least a partial correction to the problem, so there's no need to stop, but I am still concerned. I hope Chad will think seriously about his tone and tendency to interpret others negatively, and make alterations to his established pattern on this list in order to become a productive member of the Boost community. Tone is very hard to interpret in the written word, particularly given that English is a second language to so many, and that many native speakers learn the language poorly. We all must take care to interpret others in the best light possible and express concerns when negativity is suspected. I don't expect perfection because I'm not perfect. I do expect a pattern of decency and civility, however (which I hope others see in my own participation).
I think it would be wise to discuss this on the web site as part of the library submission process. We should suggest that would-be submitters review the list archives to see the tenor and content of past reviews. We should, perhaps, refer them to specific reviews so they can see how critical reviews can be. That would go a long way toward preparing them for the (almost) inevitable in their own review. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

On Mon, Mar 7, 2011 at 8:26 AM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
While Chad has apologized for such outbursts, almost profusely so, I do think it important to raise a point that's been troubling me for some time regarding Chad's participation on the list, not just during this review. Chad's default reaction to criticism has too frequently been to lash out, rather contrary to the advice he gives above.
Robert, I have no sense at all of Chad's posting history here, so I am not agreeing or disagreeing with your assessment above. That said, I have always viewed the review process as—at least in part—a test of the submitter's constitutional ability to participate effectively in the Boost community throughout the life of his library. These kinds of factors should IMO be taken into serious consideration by reviewers and the review manager in determining whether the library is to be accepted. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On Mon, 7 Mar 2011 08:26:04 -0500 "Stewart, Robert" <Robert.Stewart@sig.com> wrote:
I'm confident to the point of arrogance. It's a personality flaw that I know and detest in myself, and I've worked for years to mute it. I'm also by far the best programmer that I'd ever met before I joined this list, which exacerbates the problem greatly because I've never before had any reason to be humble in that area. To make things worse, XInt is the product of not just passion, but obsession. No completely sane human works on something like that for twelve or more hours a day, seven days a week, for months on end. There were several factors that contributed to that, but they're irrelevant. So when I joined this list and offered XInt, I subconsciously expected to be welcomed with open arms and lauded for my contribution. Instead, the masterpiece I was so proud of was torn apart and left in a bloody heap. Six times. None of that excuses my behavior, only hopefully explains it. I didn't realize how I was reacting until a few days ago, and I've tried very hard to divorce my emotions from the library and deal with it objectively since then. I think I've been fairly successful, but that may be unwarranted confidence talking again.
rather contrary to the advice he gives above. [...]
The traits you hate in yourself are always the ones you react to most strongly, and negatively, in others. I long ago learned that there's no point in crying over spilt milk. I'm extremely embarrassed by the episode, both this part and last year's, but I'm not going to beat myself up over it. I can only hope that others who witnessed it will forgive me and put it behind them as well. If not, I'll live with the consequences.
During the development of XInt, however, Chad learned a great many things about modern C++ from members of this list. [...]
For which I'm grateful.
The people on this list vary in skill, and often even several people who seem equally skilled have given me exactly the opposite advice -- your advice to make the fixed-length integers an entirely separate class, for instance, directly contradicts the instructions that I was given to unify the three separate integer types that I had earlier (standard, nothrow, and fixed-length), which is exasperating. Even more so because I *had* it that way, and was told I had to change it. Humble, especially in matters of programming, doesn't come naturally to me, and things like that just reinforce my subconscious opinion that as its creator, my vision for the library should take precedence. As I said, that's a personality flaw. It has sometimes been useful, but I strongly dislike it, and I'm trying to overcome it. I suspect I'll still be trying up to the day I die.
Your criticisms were overwhelmingly correct. I've adopted almost all of them already -- if you'd like to see the current state of the documentation, I'll send it to you -- and noted the larger ones for when I have time to do them. But your criticisms of certain parts, especially the top half of the index page, *are* matters of writing style. That part has a specific and non-technical purpose, and I didn't realize that people didn't recognize that until last night. Please refer to the last quarter or so of last night's message, <http://permalink.gmane.org/gmane.comp.lib.boost.devel/215787>, for details. Your saying that my "complete and carefully maintained documentation" comment was excessive stung, regardless of whether it was correct or not. I reacted badly to that. I'm sorry that I did, and I publicly apologize if it offended you. I still feel that the statement is accurate, and that the additional sections suggested by you and others will only enhance what was already very complete documentation, but that is merely an opinion.
If I don't, I'd appreciate being told, hopefully privately. It's not always easy to see things in yourself.
I think that's a good idea. I don't know that it would have helped me initially, because in my confidence I would have assumed that XInt was special in that regard, but it would have made it easier to understand and deal with it afterwards. -- Chad Nelson Oak Circle Software, Inc. * * *

On Mon, Mar 7, 2011 at 11:05 AM, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
Again, I don't know anything about the history, but whatever may have come before, this testimonial instills confidence in me that Chad can handle what's coming down the road. I love it when the process works :-) -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Chad Nelson wrote:
I understand the big fish in a small pond problem. Boost can be amazingly humbling in that regard.
I do hope you can regain some perspective on XInt versus other parts of your life!
You learned the hard way just how small your pond had been.
So far I see a very promising difference.
Don't I know it!
That you took someone's (strongly expressed?) opinion as "instructions" was clearly a mistake. If they convinced you, then you should stand by the decision. If not, you shouldn't change for their sake. Obviously, researching the list to discover how well received a poster's opinions are and how much in line their suggestions are with existing Boost opinion and practice will help you maximize the chance of acceptance.
Humility is not innate to most. How it manifests varies, of course. Your vision for the library should be paramount. It is your library and you must maintain it. However, recognizing that you don't know everything, you should remain open to being convinced by superior arguments.
I understand the idea of marketing. I also understand that what you wrote was in some ways inconsistent (see my "staccato" comment), overstated (see our "complete" discussion), or misleading (see the "fast" discussions). For example, on the "fast" part, if you had said that portability was your first priority, but that speed was a close second, you would have gained your point without the negative backlash that ensued.
You didn't offend me and I'm sorry that I phrased it in a way that "stung."
We'll have to disagree. As I said, if there were points of confusion or omission, then the documentation wasn't complete, according to my definition. Since such deficiencies can arise due to maintenance in the future, it would be better to let Boost's reputation for documentation speak for your library (assuming acceptance, of course). _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

On Mon, 7 Mar 2011 11:37:24 -0500 "Stewart, Robert" <Robert.Stewart@sig.com> wrote:
Yeah, I've noticed. ;-)
Once I was able to call it ready, I recovered.
Not for the first time, but it was the first time in a very long time.
With the benefit of hindsight, yes. I was originally treating the Boost list members as a single client and trying to satisfy all of the concerns brought up. I have to say, this group makes a particularly difficult client. :-) I've dealt with a couple just as bad, but never with a project I was so emotionally caught up in.
If they convinced you, then you should stand by the decision. If not, you shouldn't change for their sake. [...]
The hard part is knowing when someone actually has a good idea that I'm not familiar with, and should be listened to, and when I've got more experience in the matter at hand and should stand firm.
When I can recognize them as superior, I am. :-)
Which is confusing to me, because to my eyes, it's merely a rephrasing of what's already there. One that fails to convey the enthusiasm I felt at the time, and still do to a lesser extent.
You couldn't know how much time, attention, and care went into those files. If you did, you'd likely have phrased it differently. Regardless, I overreacted. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
Of course I'm just rephrasing, but the result is different. Your version is to say that the library "is fast." Fast isn't an objective criterion, so the reader will ask, "Relative to what?" Since you don't specify the point of comparison and provide no benchmarks to explain, the reader is then free to suppose any point of comparison of their choosing. In my version, the statement is that speed is an important goal. How well you met that goal, how much it suffered due to portability being more important, etc., is left unspecified. That leaves the reader with a "warm fuzzy feeling" that you thought performance was important, but no room to infer their own point of comparison. My version may not reveal your enthusiasm as obviously, but it comes close, I think. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

On 03/07/2011 10:13 AM, Stewart, Robert wrote:
The documentation should primarily be accurate, rather than serve as marketing material. Frankly, talking about speed as a goal at all seems rather misleading, at least from what has been said in this discussion (I haven't examined the code nor run any benchmarks myself). Phil Endecott's benchmarks showed that for small fixed sizes, xint is hundreds of times slower than a hand-coded version, which I think we can all agree means that in this case xint can only be considered "very slow". For larger sizes, benchmarks comparing it to GMP and other alternative libraries are critical, I'd say. It appears, though, that anyone caring about speed would want to use GMP or another library, rather than xint.

Speed is one of the primary reasons that people use C++ at all, and that is why it is critical for C++ libraries to be as efficient as possible. This is particularly the case for a library like xint with the purpose of numeric computations, which may constitute a significant fraction of the program runtime. (In contrast, libraries like function and shared_ptr should obviously be as fast as possible, but we can expect that they will contribute only a tiny fraction of the program run time.)

The argument has been made that xint may serve primarily as a way to test an interface, and that after it is standardized, compiler vendors will produce optimized implementations. However, Boost to a large extent has the reputation of containing, for the purposes its libraries support, the very best C++ libraries, rather than being merely a testing ground. Furthermore, I expect that even if its interface is somewhat better than that of GMP or other libraries, it will simply be unacceptable to most potential users that it is significantly slower, and therefore its interface won't actually get the testing that is desired.

On Mon, Mar 7, 2011 at 08:37, Stewart, Robert <Robert.Stewart@sig.com> wrote:
That you took someone's (strongly expressed?) opinion as "instructions" was clearly a mistake. If they convinced you, then you should stand by the decision. If not, you shouldn't change for their sake. Obviously, researching the list to discover how well received a poster's opinions are and how much in line their suggestions are with existing Boost opinion and practice will help you maximize the chance of acceptance.
+1 Any change a library author makes but doesn't personally agree with -- whether it's the majority opinion on the list or not -- is going to cause problems in the review, since the author can't justify it under the inevitable questioning. I stopped participating heavily around version 3 or 4, but a quick look at my history suggests that at various points I suggested signed zeros, not having signed zeros, a policy-based design, and multiple types. If even one person can't be consistent (though I'm sure I had what I thought were good reasons at the time), then there's no way the whole list can be :) ~ Scott

On 05/03/2011 17:02, Chad Nelson wrote:
On Sat, 05 Mar 2011 15:53:30 +0100 Mathias Gaunard<mathias.gaunard@ens-lyon.org> wrote:
I didn't perform usability tests; I just inferred this from the complexities of your algorithms, which are not quite state of the art. It's fine if it doesn't scale very well yet; I just wouldn't want the library to restrict itself to certain applications too early.
Sorry, I haven't read all messages.
You're forgetting several things here: first, some algorithms might be faster than others depending on the situation, and second, you'd have to include a dependency on FFTW in the core of the library to provide the fastest solution.
I thought you had agreed in another thread that separating data and algorithms was a good idea, since many people were pointing it out. Is that not correct? I didn't have time to read all messages carefully (sorry again).
I am sorry you feel this way. But please realize that you failed to provide reasons when I asked for them.
For various reasons, some personal I suppose, I would not want Boost to contain a subpar bigint library. Therefore, I am trying to investigate whether this library is competitive with the rest of the open source solutions, or if at least it has some interesting novelties others do not. I asked for your assistance in this.
A quick google search shows quite a few C and C++ libraries that implement big integer facilities under a liberal license (though LGPL isn't that restrictive in the first place). Most of them only provide fixed size though.
That's sufficient reason right there
Wouldn't that be up to the reviewers to decide?
Argument noted. Glad we cleared that up.

On Sat, 05 Mar 2011 20:07:05 +0100 Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
"Fastest Fourier Transform in the West"? If it lives up to its name (which reviews indicate that it does), that would indeed be preferable for speed for really large numbers. And with the incompatible license, it couldn't be included in XInt. That's probably the best argument for exposing the internals that I've seen yet.
I did agree that it's a good idea for the future, but I don't want to do that at this point in XInt's existence, for the reason I stated above.
I did so under the mistaken impression that you were being deliberately malicious. I now believe I was wrong, and I apologize. If you're willing, I'd like to start fresh.
And I believe GMP is the one that it will be compared to in most people's minds.
That's sufficient reason right there
Wouldn't that be up to the reviewers to decide?
Sorry, I should have said "that's sufficient reason for me". The licensing issue isn't as bad as I thought, I was under the impression that it was GPL-licensed, not LGPL. But even had I known that earlier, it wouldn't have stopped me from writing XInt, for the reason of interface. -- Chad Nelson Oak Circle Software, Inc. * * *

On 03/03/2011 05:46 AM, Mathias Gaunard wrote:
This whole business of explicitly passing simd<T> in place of T, which seems to be the same thing done in nt2, in order to make use of certain optimizations seems like a poor interface choice. In almost all cases, the user logically cares about a container of T, and wants to work with individual elements, and the library should ensure that SIMD operations are used whenever possible. In particular, the design of the Eigen library, at least in this respect, seems more appropriate. Perhaps I am just misinterpreting the intended semantics of using simd<T>, but if in fact it results in the logical value_type still being T, then the question is why you would even need to specify simd at all, rather than having it always be used. I also don't see how the optimization can be externalized in the (quite common) case of doing more than element-wise operations.

On 03/03/11 19:34, Jeremy Maitin-Shepard wrote:
The table interface doesn't require anything at this level. nt2::table<T> does vectorize without anything special. pack is, however, available as a generic-sized packet with potential SIMDization for people wanting to write low-level SIMD code.

On 03/03/11 19:34, Jeremy Maitin-Shepard wrote:
std::vector<T> v(100); transform(simd::range(v), _1 + 3) does the transform using pack internally. Again, no need for the user to care about SIMDization. Bear with me regarding the current lack of docs on this subject. And with regard to Eigen, they have the very same low-level SIMD pack component.

On 03/03/2011 19:34, Jeremy Maitin-Shepard wrote:
I'm talking of passing a range of 128-bit or 256-bit integers, with an integer type that would use the SIMD unit for its implementation, instead of a range of 32-bit or 64-bit integers that would use the standard built-in types that probably use the ALU. Of course, this is just an idea. I don't know if that is a sensible way to exploit SIMD in a bigint scenario, maybe you need a more global SIMD awareness.
I also don't see how the optimization can be externalized in the (quite common) case of doing more than element-wise operations.
A C primitive only receives a pointer to raw memory, and has to decide itself what code to invoke. Templates allow those operations to be externalized to functions that depend on the types of their arguments. --------------- Also, in nt2, simd_<T> is a notation for doing pattern matching on SIMD vector types whose elements are T. It's not something you "pass" to functions. You could also write simd_< integer_<T> > to only match SIMD vector types whose elements are integers. NT2 uses a hierarchical concept-based overloading system. I understand this can make code quite confusing in the absence of any docs.

This is definitely a useful library. Thanks. I have a few comments/suggestions to contribute.

- I would prefer that highestbit just returned bitsize_t(-1) or some predefined constant, similar to string::npos.
- Conversion to/from dynamic_bitset would be nice.
- Consider adding a function returning the number of ones in the integer. This could be used to determine parity.
- Consider adding a function to reverse the order of the bits. "reflect" might be a good name.
- Consider adding conversions to/from floating point types.
- Should integer_traits and numeric_limits be specialized for xint?
- I assume that xint is not thread-safe and just gives the user a way to force a "real" copy for multi-threaded applications (I like it!).
- Consider adding an endian argument to to_binary. Note that this could affect both the ordering of bits within each byte as well as the ordering of the bytes themselves. (I actually prefer the library the way it is, assuming little-endian.)
- I like that xint includes move support.

terry

On second thought, I don't think copy-on-write is desirable. Move semantics and references/pointers/smart-pointers give me all the capability and control I want while simplifying the implementation (I suppose). terry ----- Original Message ----- From: "Terry Golubiewski" <tjgolubi@netins.net> Newsgroups: gmane.comp.lib.boost.devel,gmane.comp.lib.boost.user To: <boost@lists.boost.org> Cc: <boost-users@lists.boost.org> Sent: Wednesday, March 02, 2011 11:46 AM Subject: Re: [xint] Boost.XInt formal review

On Wed, 2 Mar 2011 15:59:32 -0600 "Terry Golubiewski" <tjgolubi@netins.net> wrote:
<http://lists.boost.org/Archives/boost/2010/05/166036.php> -- Chad Nelson Oak Circle Software, Inc. * * *

On 03/03/2011 15:58, Chad Nelson wrote:
This never made much sense. Of course it is faster to alias a memory block and update a counter than to copy said memory block. I don't see how you can draw any kind of conclusion from those tests. If you want it to make sense, test the efficiency of moves, not of copies. But then your library copies everywhere, so you'll have to change that first (the easiest way to tell you're not copying is to make your type movable only).
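As an illustration of that "movable only" trick, a sketch in C++11 syntax rather than Boost.Move emulation (big_int here is a stand-in, not XInt's class): delete the copy operations and any remaining hidden copy becomes a compile error.

    #include <vector>

    class big_int {
    public:
        big_int() = default;
        big_int(big_int&&) = default;               // moves stay allowed
        big_int& operator=(big_int&&) = default;
        big_int(const big_int&) = delete;           // any remaining copy is now a compile error
        big_int& operator=(const big_int&) = delete;
    private:
        std::vector<unsigned> digits_;
    };

    big_int make_sum();      // returning by value is fine: that's a move
    // void f(big_int x);    // but every pass-by-value call site that copies stops compiling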

This is not a review. Hi all. Congratulations, Chad, on getting this far; I hope all your effort will pay you back a good value. I wanted to say a word on the copy-on-write thing. As someone wisely noted on this list, COW is in no way worse than plain copying in general. When you make a copy, you allocate new storage for the newly created copy, and the allocation procedure, in general, locks a mutex in order not to corrupt the heap; this is much more expensive even than atomic reference counting. That statement alone convinces me personally to use COW in newly created pieces of software. Moreover, COW enables the size of the object on the stack (or in an array) to be sizeof(void*); then allocation of an array of such objects spares the heap, and e.g. virtually any sorting procedure on an array, with all its possible copies and moves, becomes relatively cheap. Also, it's not forbidden to implement move semantics along with COW (which appears to be trivial -- just swap the pointers), and I think that the struct storage { int_t ref_count; //other instance data value_type array[1]; }; idiom *should* be used (by default) for the COW internal implementation (by default the compiler takes care of the alignment of 'storage' members for you, so you don't even need to get into it). In the end, I have a question for Chad: for what reason does integer_t inherit its implementation *virtually*? You know that virtual inheritance involves (possibly unneeded) overhead. -- Pavel P.S. If you notice a grammar mistake or weird phrasing in my message, please point it out. P.P.S. I'm the person who suggested adding algorithmic complexity estimates to the docs, so blame me.
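A minimal sketch of the layout and the "move is just a pointer swap" point from the message above (illustrative only; not XInt's internals, and a real implementation would use an atomic count):

    #include <cstddef>

    struct storage {
        int ref_count;
        std::size_t used;        // other instance data
        unsigned digits[1];      // allocated with extra room past the end of the struct
    };

    class cow_int {
    public:
        cow_int() = default;
        cow_int(const cow_int& other) : ptr_(other.ptr_) {
            if (ptr_) ++ptr_->ref_count;     // copy: bump the count, share the block
        }
        cow_int(cow_int&& other) : ptr_(other.ptr_) {
            other.ptr_ = nullptr;            // move: steal the pointer, no count traffic at all
        }
        // destructor, assignment and detach-on-write omitted
    private:
        storage* ptr_ = nullptr;
    };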

On Fri, 4 Mar 2011 20:23:33 +0300 DE <satan66613@yandex.ru> wrote:
Thanks! I hope so too. :-)
I hadn't even considered the probable mutex on the heap. The OS can likely make use of faster mutexes than ring-three code, but they'd still take time. However, as has been correctly pointed out, XInt does a lot of copy-by-value where it provides no benefit. After I change that, we'll see whether it's still as fast or faster with the copy-on-write code than without it. If there's no provable benefit to it on any common platform, I see no reason to keep it.
Because many (all?) of the classes it inherits from have to have access to the same data, so they all inherit from the same base class. The "virtual" there tells the compiler that all of those inheritances come from a single instance of the base class, rather than a separate one in every case. I believe this use of virtual results only in compile-time overhead. Logically there shouldn't be any need for the run-time code to even notice that virtual was used there. But there may be more going on behind the scenes than I'm aware of. -- Chad Nelson Oak Circle Software, Inc. * * *

On Mar 5, 2011, at 1:53 PM, Chad Nelson wrote:
Virtual inheritance is designed for the general case where classes don't know what hierarchy they are part of, and so they need to look up an offset in a dynamic table (or, in older implementations, dereference a pointer). In your case, all your classes can know what the most derived class is, so a cheap static_cast to the derived class will do. With your design, it is possible (though unlikely) that someone else would derive from integer_t with their own virtual inheritance from integer_t_data, thus moving around the location of the virtual base. That's why the compiler can't optimize this away, AFAIK. There's a fun section on this in Lippman's "Inside the C++ Object Model", pp. 95-101. (Yes, I use this as an interview question.) Cheers, Gordon

Chad Nelson wrote:
One uses virtual inheritance when there is state that would be duplicated in the face of multiple inheritance via the dreaded diamond inheritance pattern. If your policy classes have state and there is any chance of a diamond inheritance pattern, then virtual inheritance is not unreasonable. By now, of course, you know that CRTP is the better solution in your case.
Virtual function or data member access, in the face of virtual inheritance, incurs runtime overhead to determine the base subobject offset, if the particular base subobject is not first. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com
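Not XInt's actual hierarchy, but a minimal sketch of the CRTP alternative mentioned above: each mix-in reaches the shared state through a static_cast to the known most-derived type, so no virtual-base offset lookup is involved.

    template <typename Derived>
    struct bit_ops {
        bool is_odd() const {
            return (static_cast<const Derived&>(*this).data() & 1u) != 0;
        }
    };

    template <typename Derived>
    struct sign_ops {
        int sign() const {
            return static_cast<const Derived&>(*this).data() == 0 ? 0 : 1;
        }
    };

    // The state lives exactly once, in the most derived class.
    class integer : public bit_ops<integer>, public sign_ops<integer> {
    public:
        explicit integer(unsigned d) : data_(d) {}
        unsigned data() const { return data_; }
    private:
        unsigned data_;
    };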

On Wed, 2 Mar 2011 11:46:29 -0600 "Terry Golubiewski" <tjgolubi@netins.net> wrote:
This is definitely a useful library. Thanks. I have a few comments/suggestions to contribute.
I'm always happy to hear them. :-)
I would prefer that highestbit just returned bitsize_t(-1) or some predefined constant, similar to string::npos.
With the current design, you can tell it what you want it to return if called on a zero. There's no truly correct answer in that case; a default of zero with a user-provided override was the best I could come up with.
Conversion to/from dynamic_bitset would be nice.
Added to the to-do list.
Consider adding a function returning the number of ones in the integer. This could be used to determine parity.
Is there a common use-case that requires this ability? I ask because I've never found a need of it myself, or seen any code that made use of it.
Consider adding a function to reverse the order of the bits. "reflect" might be a good name.
Given that XInt is intended primarily as a math library, would this really fit?
Consider adding conversions to/from floating point types.
Please see the rationale section of the documentation, under the heading "Why didn't you provide conversions to and from float/double types?"
Should integer_traits and numeric_limits be specialized for xint?
numeric_limits already is -- please look for the string "numeric_limits<boost::xint::integer_t" (sans quotes) in boost/xint/integer.hpp. integer_traits probably should not be, because it only adds const_min and const_max, and there aren't any fixed minimum or maximum values for most integer_t objects.
I assume that xint is not thread-safe and just gives the user a way to force a "real" copy for multi-threaded applications (I like it!).
It can be made entirely thread-safe, using the Threadsafe option, but this is less efficient. It's mostly thread-safe even without it; you just can't safely use integer_t objects from threads other than the one that created them. The forced-copy parameter on the integer_t constructor is there if you don't want to use the Threadsafe option, but still need to move integer_t objects between threads.
Sorry, but I don't entirely understand. to_binary writes the information out one byte at a time, so there aren't any byte-endian issues with it. Or do you mean to provide an option to write it out highest-byte-first too? The bit-endian problem is pretty specialized. According to Wikipedia (<https://secure.wikimedia.org/wikipedia/en/wiki/Endianness#.22Bit_endianness.22>), it's generally handled transparently. Do you see a use-case that would require such an option?
I like that xint includes move support.
Thanks! I intend to improve that once Boost.Move is official. Thanks for your comments. -- Chad Nelson Oak Circle Software, Inc. * * *

On 3/2/2011 2:58 AM, Vladimir Prus wrote:
This (like Mathias') is not a review, but rather a random list of comments upon a first pass through the documentation (and a little through the implementation, but only because the documentation proved insufficient to answer some questions). I have two general and major comments. 1) I was hoping the bignum algorithms would be factored out and allowed to operate on, e.g., ranges of digits, rather than being coupled to the xint::integer_t data structure. I know this had been mentioned before. Chad, what are your thoughts on this? I know the purpose of this library is to provide out-of-the-box bignum functionality with minimal hassle, but, to me, the algorithms should be part of the interface as well and decoupled from the xint::integer_t data structure. 2) In a similar vein, I would like to see the cryptographic functionality decoupled from the xint::integer_t data structure as well, and released separately. Also, good things: I like the policy-based design, and I think it's the right direction. The Boost.Parameter arity thing may get onerous (see comment below), and one thing you could do to work around it is to just accept an mpl::map of tags, since I'm guessing you're not using Boost.Parameter's deduction functionality (awesome and powerful, but unnecessary in this case, I think). On to more specific comments... Documentation Comments ---------------------- * http://www.oakcircle.com/xint_docs/ "Why would I use it? ... Because it's fast." Do you have any benchmarks to back this up? Fast relative to...? "Why would I use it? ... Because it has the features you need." Included in here is "[c]ryptographically-secure prime number generation". This seems a little too domain-specific to be included in this library. Is there any reason the prime number generation couldn't be factored out and, further, be bignum-implementation agnostic? Boost.Prime, for example? * http://www.oakcircle.com/xint_docs/namespaceboost_1_1xint.html integer_t<...> square (const integer_t<...> n) The complexity is just O(n^2). If you want to be more precise, don't use O(...), just state the number of additions/subtractions and multiplications. integer_t<...> pow (const integer_t<...> n, const integer_t<...> e) I suspect this complexity is, really, O( # of bits in the result ), as the last one or two multiplies are what dominate. I was also thinking about adding an overload where the power is specified as an unsigned long, but I don't know if it would actually be worth it... integer_t<...> square_root (const integer_t<...> n) You should mention that the algorithm is just Newton iteration with initial guess 2^floor(log2(n)/2) (where n is the input value). Also, your complexity claim implies a bounded number of Newton iterations, independent of the inputs, as each loop through the while loop is O(n^2) (where n is the # of bits) due to the division...that may be the case, and if so, you might want to back that up with a reference or something, but I suspect you should have maybe a log(n) in there (where n is the # of bits), as Newton has "only" quadratic convergence. I believe there is an O(n) square_root algorithm that would be asymptotically faster, but...I'd have to look it up... int is_prime (const integer_t<...> n, callback_t callback=no_callback) Given the documentation, I think a more descriptive name, such as is_probably_prime_rabin_miller, would be better.
Also, I'm not sure how small "infinitesimally" small is...it seems the wikipedia page only guarantees at most a 1/4^5 chance of a false positive (given the current implementation), but in practice it states the chance is much, much less. Perhaps you should allow the iteration count (the "k" from the wikipedia article) to be passed as a parameter as well (possibly defaulted), for those that want more control over their primality test. But, similar to a comment above, I'm not sure any cryptographic or number-theoretic features outside of basic bignum arithmetic and operations should be included in this library; I would like to see them factored out and independent of xint. * http://www.oakcircle.com/xint_docs/structboost_1_1xint_1_1options_1_1threads... "Note that even with this option, it is still your responsibility to ensure that only one thread accesses a particular integer object at a time." You seem to imply that it is not thread safe under any circumstances for multiple threads to even have *read* access to an xint::integer_t object. Is that the case, and if so, why? * http://www.oakcircle.com/xint_docs/classboost_1_1xint_1_1integer__t.html template<...> class boost::xint::integer_t<> "You can specify several template parameters for the type. Most of these parameters must be specified via helper classes defined in the boost::xint::options namespace. See the list and descriptions there." So that covers "most" of the template parameters...what about the rest? Also, it would be helpful to document here what the default settings are. integer_t<...> & operator+= (const integer_t<...> b) integer_t<...> & operator-= (const integer_t<...> b) Why are these parameters passed by value? int sign (bool signed_zero=false) const I think you should separate the 2 different meanings of signedness into 2 separate functions. There are performance considerations, of course, but it's also difficult for the casual reader to infer what x.sign(true) is supposed to mean without looking it up. A second function called sign_with_signed_zero might be more appropriate. * http://www.oakcircle.com/xint_docs/threadsafe.html "When using copy_on_write, identical integer objects are allowed to share storage, which is more efficient for both CPU power and memory..." I'm (still) not sold that copy-on-write gives any noticeable performance benefits in this library over move semantics except in contrived examples, so I think some kind of justification of the above claim would be warranted. I'm also not a fan of the integer_t::integer_t(const integer_t& b, bool force_thread_safety) constructor; isn't there already an "ensure_unique" member function (in the implementation) that ensures an integer_t object has sole ownership of its storage? Indeed, looking more at some source files, I would guess this is _make_unique. I would consider such a member function preferable to this constructor overload, especially given how the mysterious boolean parameter looks in client code (see the integer_t::sign comment above). * http://www.oakcircle.com/xint_docs/basic__types__and__includes_8hpp.html "#define BOOST_PARAMETER_MAX_ARITY 6" This is not guaranteed to actually have any effect. Think about what happens if Boost.Parameter gets included before any xint header does. Better (and, yes, more onerous) to document that xint::integer_t uses Boost.Parameter and hence the user must set this macro to at least 6.
Miscellaneous Comments ---------------------- Why is digit_t in the xint::detail namespace, when it is expected to be part of the interface for purposes of supplying an allocator? How did you create your documentation? Your documentation generator may appreciate an acknowledgment in your documentation. Also, regarding the complexity notation, it would probably be best to state the complexity in terms of some other letter than n, so as not to confuse n with the input value. I know I did *not* do this in my comments above, as I was trying to follow your notation, but it easily gets confusing. Or just use log(n) instead of n to mean the # of bits or digits. That's it for now, - Jeff

On Wed, 02 Mar 2011 13:46:33 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
And I thought I had answered it, by moving almost all of the functions out of the integer_t class. Apparently I misunderstood your intent, though it's something I'd have wanted to do eventually anyway.
Maybe I'm missing something... the only thing I can see that it would accomplish, which the current design doesn't, is to allow people to operate on arbitrary slices of data. Is there really enough need for that to warrant it?
That's one of the most common reasons that people want to use a large-integer library. Separating those functions from the library would make it that much harder for potential users to get everything working.
I believe it *is* using deduced parameters. I don't recall the terminology, but that's what Dave Abrahams called it when I described the interface.
Fast relative to pretty much all home-grown code. At worst, it'll be essentially the same speed as anything that doesn't use hand-written assembly language functions.
Certainly it could, but again, cryptography is the major reason for a large-integer library. And prime-number generation is a very large part of it.
Are you sure about that complexity? That function doesn't just multiply a number by itself, it uses an algorithm explicitly designed for squaring a number, which is more efficient. I think the complexity on that one is correct.
I'm pretty sure the complexity on that one is correct as well.
It might not be too difficult to make that function take any numeric type. It's already a template function. But I suspect any speed improvement would be unnoticeable.
What benefit would that add to the documentation? The average developer using the library won't care what method is used. If they're interested, can't they check the implementation source code to see?
That's one of the complexity values that I'm not sure of. I haven't been able to find any reference that pins down a value for it, so that was an educated guess.
I believe there is an O(n) square_root algorithm that would be asymptotically faster, but...I'd have to look it up...
That's the best one that I was able to come up with in the time I had. If you can point me to a good description of a more efficient one, I'll be happy to implement it.
Maybe, but it's a lot more typing for an arguable benefit. The documentation specifies its limitations; I believe that's sufficient.
Applied Cryptography, Second Edition, page 260 states that "for a 256-bit n, the odds of error in six tests are less than 1 in 2^51", and the odds get slimmer the larger the number is. (On the previous page, it says of 2^50 that "[...] if you consider that the odds of the number being composite are 300 million times less than the odds of winning top prize in a state lottery, you might not worry about it so much.") It also recommends only five tests, which is what the library uses.
The documentation does say that if you want even greater certainty, you can run it through that function multiple times. A handful of runs through that should be more than sufficient for any purpose the library will conceivably be put to.
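For readers unfamiliar with the test being discussed, a sketch of Miller-Rabin with a caller-chosen round count, written on machine words rather than big integers -- not XInt's implementation, and it relies on the GCC/Clang unsigned __int128 extension for the modular multiply:

    #include <cstdint>
    #include <random>

    static std::uint64_t mulmod(std::uint64_t a, std::uint64_t b, std::uint64_t m) {
        return static_cast<std::uint64_t>((unsigned __int128)a * b % m);
    }

    static std::uint64_t powmod(std::uint64_t b, std::uint64_t e, std::uint64_t m) {
        std::uint64_t r = 1; b %= m;
        for (; e != 0; e >>= 1) {
            if (e & 1) r = mulmod(r, b, m);
            b = mulmod(b, b, m);
        }
        return r;
    }

    // Each round misses a composite with probability at most 1/4, so k rounds
    // give an error bound of 4^-k (five rounds matches the library's default).
    bool is_probably_prime(std::uint64_t n, int k = 5) {
        if (n < 4) return n == 2 || n == 3;
        if ((n & 1) == 0) return false;
        std::uint64_t d = n - 1; int s = 0;
        while ((d & 1) == 0) { d >>= 1; ++s; }
        std::mt19937_64 gen(42);
        std::uniform_int_distribution<std::uint64_t> pick(2, n - 2);
        for (int i = 0; i < k; ++i) {
            std::uint64_t x = powmod(pick(gen), d, n);
            if (x == 1 || x == n - 1) continue;
            bool witness = true;
            for (int r = 1; r < s && witness; ++r) {
                x = mulmod(x, x, n);
                if (x == n - 1) witness = false;
            }
            if (witness) return false;   // definitely composite
        }
        return true;                     // probably prime
    }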
Any number of threads can *read* an integer simultaneously. But if even one is *writing* to the integer, all bets are off -- no other thread can safely access it for any reason until the write is done. This is extensively detailed on the page titled "The threadsafe Template Parameter", off the index page.
There's only one other, the custom allocator. It's mentioned in the description on that page as well, at the bottom.
Also, it would be helpful to document here what the default settings are.
True, and thanks for pointing it out. Added to the to-do list.
This was mentioned in an earlier message. I didn't recall the reason then, but I've looked into it: it's partly due to Dave Abrahams' "Want Speed? Pass by Value." article (possibly through a misunderstanding of it; I'll review that in the next few days to find out), and partly because the internal COW system makes copies very cheap -- I suspect as cheap as pass-by-reference. Most of those pass-by-value functions should probably be changed; we'll see what I come up with.
It would be easy enough to add it, and it could occasionally help write self-documenting code, so I've put it on the list.
We went through that, at exhaustive length, last year. The final results are here: <http://lists.boost.org/Archives/boost/2010/05/166036.php>.
Please see the aforementioned "The threadsafe Template Parameter" page. That Boolean parameter will be used so rarely that I don't think it's unreasonable to ask the reader of the client code to look it up if he runs across it.
Good point. Added to the to-do list.
At the time I was writing the library, its definition hadn't been nailed down, so it was subject to change without notice. Now it should be stable, so I'll promote it to the xint namespace.
How did you create your documentation? Your documentation generator may appreciate an acknowledgment in your documentation.
Ah, good catch. I didn't notice it was missing. I've re-added it for the next update.
Probably so. Added to the list.
That's it for now,
Thank you for the comments. -- Chad Nelson Oak Circle Software, Inc. * * *

On 3/3/2011 6:57 AM, Chad Nelson wrote:
What difference, from the point-of-view of the interface or from the user, does it make if the algorithms are implemented in the integer_t class scope or a detail namespace scope? It seems you misunderstood the desire for decoupling the algorithms from the data structure. See also Mathias' email regarding "genericity"; I agree with his evaluation in this regard and won't repeat it. Is this decoupling the "something [you]'d have wanted to do eventually anyway", or are you referring to something else?
You can't assume that everyone will be satisfied with your integer_t data structure. Someone may want a data structure that uses the small-object optimization, because they have a lot of tiny integers running around and occasionally have large ones. Or they may want to use a std::deque(-like) underlying storage. I don't know. But I do know that your library will be strictly more usable if you decouple the algorithms from the integer_t data structure. I'm hesitant to vote for acceptance unless this decoupling were effected; your mileage with other boosters will obviously vary, but I gather Mathias is of the same opinion.
Can you be more specific on *how* it would be "much harder", from the user's perspective?
I'm referring to http://www.boost.org/doc/libs/1_46_0/libs/parameter/doc/html/index.html#dedu... and http://www.boost.org/doc/libs/1_46_0/libs/parameter/doc/html/reference.html#... In other words, are you specifying Boost.MPL predicates in your parameter specifications to deduce the binding of "bald" template parameters to tags/keywords?
Pretty bold claim, okay.
Yes, I get that; I just don't understand why they can't be decoupled and still work harmoniously together. Ideally, other bignum data structures should be usable with the cryptographic algorithms.
Ummm...yes, the stated complexity is correct, but O(n(n+1)/2) = O(n^2). So asymptotically speaking, it's not any faster than using the naive grade-school multiplication algorithm. Like I said, unless you want to give a precise operation count, it's just O(n^2).
Yeah, I don't think I'm right, either, but let's analyze this... The asymptotic work in your loop in detail::pow is the same as the following loop (we'll assume that e >>= 1 is a constant-time operation): for(; e != 0; e >>= 1) p = square(p); Let p have np bits. Each successive loop iteration *doubles* the number of bits in p, so the work for each successive iteration is np^2, (2np)^2, (4np)^2, ..., ((2^ne)*np)^2 (there are ne loop iterations, where ne is the number of bits in e). The total work is then (1 + 4 + 4^2 + ... + 4^ne) * np^2 = O(4^ne * np^2) = O(e^2 * np^2) By my count, you're drastically underestimating the complexity ;) Let me know if you agree with the above analysis, I could be wrong.
Likely.
There are many different algorithms to compute a square root; even users want to know what algorithms you're using so they can use your library with confidence.
Well, I stand by my observation here. Given that Newton has quadratic convergence, I would quote the complexity as O(log(nn)*nn^2), nn = log2(input).
I believe this is what I was thinking of, though the description is a bit terse: http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Digit-by-digi...
The "arguable" benefits are: - the name is no longer misleading; is_prime doesn't actually check if its input is prime; and - it is clear in what sense the input is determined to be probably prime. Again, think about someone *reading* code that uses this is_prime function.
Thanks for clearing that up. I think such a reference should be added to the documentation.
Agreed.
Okay, then all I ask is that, on the originally referenced page, it should be stated that multiple-reader access is okay. That's what I had guessed, but not what one might infer from the way it is written.
Mentioning it here, too, would be helpful.
Okay.
I'm pretty sure a COW copy is never cheaper than a reference copy, and probably always less cheap (given the reference count that needs incrementing). Besides, COW isn't on by default...
sign_with_signed_zero might be a little long, but I don't have any alternate names at the moment, and you get the idea.
Will review.
That page, or your explanation here, doesn't justify why you think this constructor overload is preferable to a make_unique member function. Sorry, "because that's the way it is now" doesn't count ;)
If BOOST_PARAMETER_MAX_ARITY is already defined, you can check if it is at least 6, and if not, #error with a helpful error message.
Okay.
Good.
Okay. - Jeff

On 3 Mar 2011, at 17:07, Jeffrey Lee Hellrung, Jr. wrote:
You can't assume that everyone will be satisfied with your integer_t data structure. Someone may want a data structure that uses the small-object optimization, because they have a lot of tiny integers running around and occasionally have large ones. Or they may want to use a std::deque(-like) underlying storage. I don't know. But I do know that your library will be strictly more usable if you decouple the algorithms from the integer_t data structure.
I'm hesitant to vote for acceptance unless this decoupling were effected; your mileage with other boosters will obviously vary, but I gather Mathias is of the same opinion.
My biggest concern with decoupling is that it introduces a much larger public API. At the moment it would be possible to make very large changes to the internals of xint without changing the public interface. Also, without multiple different data structures to store the internal data, it is not clear what a clean decoupling would look like. Decoupling now would make evolution of the library much more difficult, whereas decoupling later would be easier. Chris

On 3/3/2011 10:30 AM, Christopher Jefferson wrote:
You're right, those are valid concerns against decoupling, and in retrospect I should've clarified that "hesitant" did not necessarily mean "opposed". At the very least, I would like to see the extraction of the algorithms independent of the data structures seriously considered. - Jeff

On Thu, 03 Mar 2011 10:42:28 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
As my previous message mentioned, I hadn't considered the possibility of doing it later, after the internals have proven to be stable for a while. I'm certainly open to that. -- Chad Nelson Oak Circle Software, Inc. * * *

On Thu, 3 Mar 2011 18:30:05 +0000 Christopher Jefferson <chris@bubblescope.net> wrote:
Thank you, that's one of the main reasons why I don't want to expose the internals. I hadn't considered exposing them later, but you're right that it would be much easier once they've stabilized. -- Chad Nelson Oak Circle Software, Inc. * * *

On Thu, 03 Mar 2011 09:07:12 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
Yes, apparently I did. I wish you'd said something about it when I asked for the last preliminary review. It's rather late to make huge changes to the library at this point.
See also Mathias' email regarding "genericity"; I agree with his evaluation in this regard and won't repeat it.
'Fraid I have to disagree, at least for now. I'll expand on this in replies to the later messages that I see waiting for me.
Is this decoupling the "something [you]'d have wanted to do eventually anyway", or are you referring to something else?
I was referring to moving almost everything out of the integer_t class.
Not "much harder," but "that much harder." And no, I can't at present, as it would depend on exactly how the separate functions are made available.
If I understand that correctly, then yes, I'm using that.
Um... if you plug a concrete number into both equations, the answers are quite different.
Okay so far.
As far as I can tell, that's correct. ((2^ne)*np)^2 seems to describe the total amount of work done.
The total work is then
(1 + 4 + 4^2 + ... + 4^ne) * np^2 = O(4^ne * np^2) = O(e^2 * np^2)
I followed that all the way to the last step, then I got lost. If... O(4^ne * np^2) == O(e^2 * np^2) ...then logically... 4^ne == e^2 ...but if you plug in concrete numbers, that doesn't work out. If e is 0x800 (2048), then... ne == 12 4^12 == 16777216 2048^2 == 4194304 ...and 16777216 is not equal to 4194304.
By my count, you're drastically underestimating the complexity ;) Let me know if you agree with the above analysis, I could be wrong.
I see that my complexity estimate is much lower than it should be. I didn't take into account that the number of bits it deals with doubles in each iteration. It looks like 4^ne should replace ne in my estimate, would you agree?
I don't see it, but I'll take your word for it. Documentation updated to include that.
As I'm not familiar with quadratic convergence (too long since high school), I can neither confirm nor deny that, so I'll take your word for it. I've modified the documentation.
I don't have time to dig into that in detail tonight, but it looks very much like a binary search algorithm, very similar to what I'm already using.
It *does* check whether the input is prime, with a vanishingly small chance that it is mistaken.
Again, think about someone *reading* code that uses this is_prime function.
He would expect that it reports whether the input is prime or not, which is exactly the case. With a name like is_probably_prime_rabin_miller, he would probably feel obliged to look up the Rabin-Miller primality test and figure out what "probably" means in this context (at least, I would). I think the shorter name is preferable for the reader, as it accurately conveys the author's intent.
Done.
Done.
I've written a new and more detailed description on the integer_t page, noting the defaults and what alternatives each option has.
Yes, I've reviewed the article, and it was a misunderstanding. I'll change the parameters to const references for the next release.
Besides, COW isn't on by default...
It's used internally regardless, as I control everything within the library and can ensure that it's used safely. The thread-safe option only prevents it for integers returned to the user.
I had a publicly-available make_unique function in an earlier iteration of the library, and was criticized for it, so I removed it. I don't remember if it was you personally doing the criticizing, but you-the-Boost-community can't have it both ways. :-)
That's probably the best way to go about it. I believe Boost.Parameter defines it (as 5) if it's not already defined when the library is first encountered, so that should be easy enough. -- Chad Nelson Oak Circle Software, Inc. * * *
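A possible guard along those lines -- the macro name is Boost.Parameter's, while the chosen default and the wording of the message are only illustrative:

    #if defined(BOOST_PARAMETER_MAX_ARITY) && (BOOST_PARAMETER_MAX_ARITY < 6)
    #  error "XInt needs BOOST_PARAMETER_MAX_ARITY >= 6; define it before including any Boost.Parameter header."
    #elif !defined(BOOST_PARAMETER_MAX_ARITY)
    #  define BOOST_PARAMETER_MAX_ARITY 6
    #endif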

On 3/3/2011 10:02 PM, Chad Nelson wrote:
Well all I can say is...I'm not the only one that didn't say something... In any case, I had thought the intent was clear, based on http://lists.boost.org/Archives/boost/2010/05/166061.php and http://lists.boost.org/Archives/boost/2010/05/166068.php among other posts.
It's rather late to make huge changes to the library at this point.
I shouldn't think so. Do you feel this way because the library is currently under review?
Okay, looking forward to it.
Okay, well I guess you've done that, although I'm not sure what it has afforded us.
I contend it wouldn't be any harder than using, say, Boost.MPL together with Boost.TypeTraits. I.e., if you design each library with interoperability with outside modules in mind, it should be little problem. [...]
Um... I'm not really sure how to respond to this. It's not a huge deal as far as the documentation is concerned, but if you *really* think that O(n^2) and O(n*(n+1)/2) are different guys, then I don't have a lot of confidence in the rest of your (as-yet-unchecked) complexity claims... (And, for what it's worth, Big-Theta statements would be more precise, where they apply, but that's often how Big-Oh is practically interpreted anyway.)
Well, in the context of above, it's only the amount of work done for the last iteration, but it's that last iteration that dominates.
Here I was just adding up the work done over all ne iterations. In any case, like I said, the last iteration dominates.
Oh, I see where you're confused. Fortunately, log2(2048) = 11, not 12 :) So that'll bump your 16777216 down by a factor of 4, which should be just right. By the way, ne here *must* be log2(e), so literally the number of bits in e, not the number of chunks, because the number of iterations is precisely log2(e). You can fudge it with additive constants (e.g., + or - 1 doesn't matter), but not multiplicative constants (e.g., by using a different base in the logarithm).
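Spelled out, the step in question is just the identity

\[
4^{n_e} = \bigl(2^{n_e}\bigr)^2 = e^2 \qquad\text{when } n_e = \log_2 e,
\]

so e = 2048 gives n_e = 11 and 4^11 = 2048^2 = 4194304.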
Uh, e^2 I think would be clearer, but otherwise, yes. [...]
In binary, no search needed! See the snippet of code. I, too, don't have time at the moment to study it in detail, but I was at one time very familiar with it. And since it only requires bit-shifts and additions (no multiplications), I wouldn't be surprised if it were significantly faster than Newton iteration.
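For reference, the binary digit-by-digit method from that article looks roughly like this on machine words (shifts, additions and comparisons only; a big-integer version would proceed the same way, digit by digit):

    #include <cstdint>

    // Returns floor(sqrt(n)).
    std::uint64_t isqrt(std::uint64_t n) {
        std::uint64_t result = 0;
        std::uint64_t bit = std::uint64_t(1) << 62;   // highest power of four in the type
        while (bit > n) bit >>= 2;
        while (bit != 0) {
            if (n >= result + bit) {
                n -= result + bit;
                result = (result >> 1) + bit;
            } else {
                result >>= 1;
            }
            bit >>= 2;
        }
        return result;
    }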
Lol...okay, I'm glad the Boost.TypeTraits functions don't have a similar caveat.
Well, no, it reports "whether the input is prime, with a vanishingly small chance that it is mistaken", which is different. To me. Sorry, as a mathematician by training, prime and probably prime are distinct properties and have very precise meanings. But I'm not a cryptographer, so maybe that's just the lingo.
Like I said, my crypto is pretty...well...non-existent... So this might be a dumb question, but am I to assume that whenever one wants to know that a number is prime, *probably* prime is perfectly okay? [...]
Whoa, wait a second here. I can't help but see a red flag here, so color me incredulous and possibly very naive. How can copy-on-write help you internally where just moving the internal data isn't possible? Can you give an example of the internals where you find copy-on-write necessary?
Well this tagged copy constructor pretty much amounts to the same thing, doesn't it? Only less efficient, since it will actually perform a deep copy in situations where it's not needed, e.g., when the object is already in the thread you want it to be owned in. And more cryptic in client code. By the way, is there ever any reason to set the boolean force_thread_safety to false? [...] - Jeff

On Fri, 04 Mar 2011 01:18:53 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
I feel this way because I'm at the end of both my patience and my freedom to devote months on end to something that isn't paying work. I spent five months of twelve- to fifteen-hour days on it last year, first writing it and then *rewriting* it six more times, trying to satisfy the people on this list. When I asked for comments on the seventh iteration, I got dead silence, which I took to mean that I'd sufficiently satisfied everyone's desires. Obviously not. If the library is accepted or conditionally accepted, I'll be happy to continue improving it in my spare time, and probably even redesign it based on the suggestions I've gotten. If it is outright rejected, it dies. I'm not completely rewriting it yet again in a vain attempt to satisfy people who will never be happy with it.
I have no formal training on big-O notation, only what knowledge I've managed to pick up over the years of trying to interpret discussions of programming algorithms. If half of roughly n^2 is somehow considered equal to n^2 in big-O notation, then obviously that isn't sufficient. I added the complexity estimates because someone on this list said that the library had to have them. Without volunteering to do any of the considerable work, I might point out. I tracked down the estimates from real mathematicians when they were available; when they weren't, I had to make educated guesses based on what I knew. Some of them will inevitably be wrong, that's the nature of the beast.
Hm, yes, I see that now.
I did, that's what reminded me of a binary search.
I've put it on the to-do list. If it's that much faster, I'll be happy to implement it.
I'm only an amateur cryptographer, but it's generally understood in the field that it is impossible to prove that something is truly prime without actually factoring it. In saying that a number "is prime," there's always an unspoken "with a high degree of certainty" attached.
Yes. It has to be, since there's no known way to factor numbers of the size used in cryptography in anything approaching a reasonable amount of time. Every SSL or SSH session you use employs such "probably prime" numbers, as does every other application of public-key encryption.
Only the benchmarks that I came up with before, and referred you to earlier in this thread. They may well change when I correct the pass-by-value stuff.
Only if the person is foolish enough to explicitly tell the library to do so. I'd like to think that any programmer using the library wouldn't be that foolish.
And more cryptic in client code. By the way, is there ever any reason to set the boolean force_thread_safety to false?
It should essentially *always* be false, that's why false is the default value. Either thread-safety is already on for the type that you're using (the default, by the way), in which case setting it to true is redundant, or it's off for that type, in which case you must have explicitly told the library that you're only using objects of its type in a single thread and will assume responsibility for any changes to that plan. That parameter exists solely so that, *if* you've told the library that you're only using it in a single thread, you can override that for specific copies of integer objects that must cross thread boundaries. -- Chad Nelson Oak Circle Software, Inc. * * *

On 04/03/11 14:29, Chad Nelson wrote: <snip>
That's not true. There is a polynomial-time primality testing algorithm: http://en.wikipedia.org/wiki/AKS_primality_test whereas there is no known polynomial-time factorization algorithm. Of course, it's true that no one uses AKS in practice because it's still far slower than Miller–Rabin, and non-prime-with-sufficiently-small-probability is good enough for all crypto applications. Even so, I would name the function is_probably_prime. If you call it is_prime then I at least, and perhaps others, would guess it implements a slower algorithm like AKS, and I would worry if I saw code using it because it might be slow. John

----- Original Message ----
That is very interesting and important from a theoretical point of view, but is not practical from any other.
I'm sorry, but: does is_prime fail with a probability lower than the probability: a) that a bit becomes accidentally flipped in the CPU or RAM during this test, giving a wrong result? b) that an earthquake smashes the entire building and destroys the computer in the next second? c) that a hacker controls the computer and changes this random bit? It is all a matter of measure and interest. Anybody who works with prime numbers for cryptography is aware of probabilistic testing. I think that renaming "is_prime" to "is_probably_prime" is the same as renaming the code "this_code_is_bug_free" to "this_code_is_probably_bug_free", and we both know that the second case has a much higher probability of failing than the first one. Just to give some perspective. Artyom

On 4 Mar 2011, at 21:45, Artyom wrote:
The problem is that there might well be small numbers which always return the incorrect value (it is hard to know). As a point of information, the GAP system (a mathematical computation system I use) calls its primality function "isProbablyPrime". Mathematica just says "PrimeQ", but they make the claim that they have done extensive mathematical and practical testing, and never found a number which their probabilistic algorithm fails on. Unless someone is willing to put that amount of work in, I personally would much prefer is_probably_prime.
It is all matter of measure and interest. Anybody who works with prime numbers for cryptography is aware of probabilistic testing.
But people who are just writing some basic mathematics code might not. If their programs fail one run out of a million, I would consider that a bug.
But if we found a bug, we would fix it ;) Chris

On Fri, 4 Mar 2011 21:57:24 +0000 Christopher Jefferson <chris@bubblescope.net> wrote:
Given the extreme rarity of such numbers, that's easily believable. Even with the speed of modern computers, it could take a *very* long time to locate one, with "very long" measured in at least years, and likely decades or more on a single machine. And if they tweaked the parameters a little, they could bump the probability much, much lower, at the expense of slightly more time.
Unless someone is willing to put that amount of work in, I personally would much prefer is_probably_prime. [...]
Your wish is granted. :-) See my previous message for the reason. -- Chad Nelson Oak Circle Software, Inc. * * *

On Fri, 04 Mar 2011 20:01:38 +0000 John Bytheway <jbytheway+boost@gmail.com> wrote:
I stand corrected. I hadn't heard about that test, and I was under the impression from my study of the field, a few years earlier, that most mathematicians didn't expect to ever find one. Thanks for pointing it out, I'll be very interested to see how it works, and exactly how much slower it is.
When probably-prime algorithms were the only known ones, calling it is_prime was justifiable. Given the existence of a true primality test, it's not. I'll change it for the next iteration. -- Chad Nelson Oak Circle Software, Inc. * * *

On 3/4/2011 6:29 AM, Chad Nelson wrote:
I can probably speak for most everyone in saying that we certainly appreciate the time you've put into this to get an actual, concrete bigint proposal that we can have a serious dialogue about. At the very least, it's brought out what others hope to see in a bigint library. Regarding the feedback from previous versions, you should understand that two major (to me, at least) suggestions were: - Eliminate COW entirely, or, at the very least, don't make it mandatory. Add it on top of an existing, public infrastructure, if you really want. And if it is used, I think it would be most appropriate for an immutable data structure. As far as I can gather from your past comments, COW is still being used internally. Yes, this is largely an implementation concern, but I think a valid concern nonetheless. - Separate the arithmetic algorithms from the data structures, defining a BigInt concept or group of concepts in the process. Regarding the former, it appears we still don't have a definitive benchmark comparing COW with move emulation, and one would be nice, but we can still predict how many deep copies each strategy will require for a given expression, and except in constrained situations where copies are never modified, I don't think COW gives any benefit. Sometimes COW is worse, since you can't explicitly transfer ownership. Feel free to prove me wrong. If you can give an example we can analyze where COW performs fewer copies, it would help your cause. For example: integer x = ...; integer a = ...; integer b = ...; x = a + b; In both cases, there should be a single allocation to hold the temporary sum a + b, which subsequently gets transferred to x. Regarding the lack of feedback on your *seventh* iteration, it's unfortunate, but I think, throughout your development process, it was reasonable to expect: - that not everyone is going to inspect every iteration of your library; and - that you will get (by far?) the most feedback (including critical, almost negative, feedback) on the version you submit for inclusion in boost.
Rewriting the library for the primary purpose of satisfying the boost development list is the wrong perspective to take. If you don't view the majority of the suggested modifications as positive directions to take your library in, then maybe boost is the wrong destination for xint. For what it's worth, I will vote to not accept, but I will try to summarize my reasons for doing so in a separate email.
I think a quick trip to wikipedia, as suggested by Robert, should convince you. Complexity theory is certainly a valuable tool for any programmer to have, so I'd suggest getting up to speed as soon as convenient. wikipedia articles should be sufficient.
Fair enough. Like I said, I suggest familiarizing yourself with the concepts then. [...]
Sorry if this wasn't clear, so let me rephrase. Which implementation function or block of code necessitates using COW to eliminate unnecessary copying? This doesn't require benchmarking, only code analysis.
You didn't really address my primary comment. You can put lipstick on a pig, but it's still a pig. You've just renamed make_unique into a constructor. Is that a fair analysis?
Sorry, is there any reason to set the boolean force_thread_safety *explicitly* to false? I suspect the answer is "no", so I would consider this an inferior design to just having the regular copy constructor augmented with another constructor that takes a special empty tag struct whose only purpose is to ensure the copy has unique storage. But ultimately it's a moot point, as it's just a hack to obscure the make_unique member function into something else, and make_unique would be preferable still. And even *that* is somewhat of a moot point, as I don't think COW should be used for the base integer class in the first place :/ - Jeff

On Sat, 05 Mar 2011 13:41:16 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
The performance numbers indicated that it provided a significant improvement. Those numbers will presumably change when I've implemented the improvements that have been suggested recently, but in the unlikely case that they still show it to be a significant improvement, I'd be very reluctant to remove it.
I'll be sure to report the results, either way.
I naively assumed that those who showed such passion against the earlier iterations would at least look at the later one to see if their pet peeve had been addressed.
For what it's worth, I will vote to not accept, but I will try to summarize my reasons for doing so in a separate email.
In any case, thank you for your comments. I'm a little puzzled why you would vote to reject it though, instead of a conditional acceptance. You seem to have clear conditions you want resolved.
That was one of the sources of my informal knowledge on the subject, and at least the last time I looked, it explained nothing that I could see about why O(n/2) == O(n).
I don't have that information at present. I'll certainly let you know if I discover specific places.
Yes. The complaint, as I recall, wasn't that the make_unique function existed, but that it was part of the interface and meant to be used or at least usable by client code. I didn't understand it either.
Duplicating the code of that constructor function, with what I think would be a single-line change, would be a superior design? I'm not trying to be a pain, I'm honestly baffled at that assertion. -- Chad Nelson Oak Circle Software, Inc. * * *

On 3/5/2011 5:55 PM, Chad Nelson wrote:
Looking forward to it. It would also help to track the number of allocations via the allocator or defining global new to ensure we get the allocation statistics one expects. [...]
Yes, however, in a nutshell, I feel there is so much on the "TODO" list that it precludes a full endorsement at present. I'm only one person, though. [...]
Right there in the definition: http://en.wikipedia.org/wiki/Big_oh#Formal_definition To paraphrase: f(x) = O(g(x)) as x -> inf iff there exists an M and x0 s.t. |f(x)| <= M * |g(x)| for all x > x0 So k*n = O(n) for any positive constant k (say, 1/2, or 2) simply by setting M in the above to k (and, symmetrically, n = O(k*n) by setting M to 1/k). Basically, multiplicative constants don't matter as far as complexity is concerned, and neither do lower-order additive terms. Informally, what complexity estimates tell you is the relative growth in the running time of an algorithm as you increase the problem size. [...]
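Applied to the squaring count from earlier, for example:

\[
\frac{n(n+1)}{2} \;=\; \frac{n^2}{2} + \frac{n}{2} \;\le\; n^2 \quad\text{for all } n \ge 1,
\]

so taking M = 1 and x0 = 1 in the definition shows that n(n+1)/2 = O(n^2).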
To aid in your discovery, find any algorithm you're currently using that takes advantage of COW, and remove the "attach" and "detach" and similar COW operations and (roughly) replace them with copy and/or move operations. If you can't do that without invoking an unnecessary allocation, let me know.
So you didn't really faithfully address the complaint. The interface still has the make_unique functionality; you just put lipstick on it ;) More importantly though... You (or anyone else!) don't necessarily have to accept anyone's or everyone's suggestion. If you feel you have a compelling case to reject a particular request or do things a certain way, make your case, and your case and the opposing one(s) will be evaluated on their merits. Initially, I, too, may have thought the make_unique member function exposed too much of the implementation, but it makes sense to provide it in the event that you want to force an object to have sole ownership of its data for thread safety purposes. For contentious features, some kind of justification or rationale in the documentation can go a long way.
In the interest of list traffic, I'm going to drop this line of discussion, if you don't mind. It's not important enough to belabor (sp?). - Jeff

On Sat, 05 Mar 2011 19:15:02 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
Noted. [...]
I'd gotten the idea of it, but so many papers seem to include complex formulas in the big-O notation that I thought it was standard to be as precise as possible.
It's an important ability, if the need for COW is proven. Or to phrase it differently, without the lipstick, it would just be a pig. ;-)
If I had it to do over again, I'd lurk around this list for a year or so and figure out which of the posters are most worth listening to, and *then* ask for the first preliminary review. I got a lot of feedback that made no logical sense to me, but that (I thought) came from On High and had to be implemented if I wanted the library to be accepted. I originally assumed that I must not understand the concepts involved, but looking back on it, I think in many cases the person offering the opinion didn't have the specific experience with writing that kind of library to realize why their suggestion wasn't necessarily a good idea.
In the interests of maximizing development time, I'll accept that. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
Nevertheless, that's the way of Big-O notation. This will probably prove useful: <http://en.wikipedia.org/wiki/Big_O_notation>. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

On 3/3/2011 6:57 AM, Chad Nelson wrote:
I tried to find the test program in the sandbox, following the link given here: http://lists.boost.org/Archives/boost/2010/05/165858.php but failed :( Help? Thanks, - Jeff

Vladimir Prus wrote:
3. Please explicitly state in your review whether the library should be accepted.
There are many documentation issues to be addressed. There are design issues to address. My vote is a conditional yes. Details below.
- What is your evaluation of the design?
I'm concerned about the application of Boost.Move and the continued reliance on COW. I don't know Boost.Move, so I cannot comment on whether it was employed correctly or completely, but enough concern has been expressed that I'm left uncomfortable. It seems that COW has encouraged more use of by-value semantics than would normally be the case. Performance tuning with COW disabled should reveal a lot. If restoring COW is still beneficial after addressing move semantics and other performance tuning, then it may be acceptable. At any rate, such tuning should be done before acceptance because it may affect the interface. Various functions copy arguments quite inappropriately. Consider operator -=(). The argument should be a reference to const. It is only used to get the quantity to subtract from *this. A copy is not needed. This argument copying makes COW look more necessary than it otherwise would be. Such changes are necessary before acceptance because they affect the decision to keep COW. I like the direction Mathias Gaunard wants to go, but there's also great value in simply declaring an xint::integer and running with it, especially for those less comfortable with the more generic approach to things. A distinct advantage of Mathias' approach would appear to be the ability to parallelize computations. Perhaps the two directions can be included in one library. IOW, xint::integer might be a type that the library supports, while permitting clients to use the algorithms with arbitrary types. More concretely, perhaps operator +(xint::integer, xint::integer) can forward to the generic xint::add() algorithm thereby hiding the use of the algorithm behind the nice operator interface. That approach could be included in a second version of the library after its release, so I don't see it as preventing acceptance.
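A minimal sketch of that forwarding idea, with names following the suggestion above rather than anything the library actually provides (definitions elided):

    namespace xint {

    template <typename Int>
    Int add(const Int& a, const Int& b);           // generic algorithm, usable with other types

    class integer { /* digits, sign, ... */ };

    inline integer operator+(const integer& a, const integer& b) {
        return add(a, b);                          // friendly operator forwards to the algorithm
    }

    } // namespace xint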
- What is your evaluation of the implementation?
I have problems with some names (called out in my documentation comments below). The implementation focuses on portability, which is of paramount concern in a Boost library. I fully expect that it will need to focus on performance soon after its initial release, however. It will be a ripe target for comparisons with faster C libraries, so it will need to be fast to live up to its potential. Custom assembly language, etc., will not detract from portability given conditional compilation and fallback on the current general-purpose code. It will just make the implementation uglier. The integer_t(const charT * str, std::size_t base = 10) and similar constructors seem out of place. Free functions that return an integer_t seem more appropriate and would be in keeping with the strtoX() functions. There are several functions with bool parameters. These are obscure when used at the call site. Better is to use enumerated types. For example, for the integer_t constructor with a bool parameter named force_thread_safety, that parameter can be of the following enumerated type: enum storage_management { force_unique_storage, use_shared_storage }; Obviously, that latter change belies the use of COW and, ideally, COW will be eliminated from the implementation. Still, the parameter type is closely related to what the argument controls, whereas "force_thread_safety" points to a side effect of the implementation detail selected. That extra argument will be moot if COW is removed, so this comment only applies if COW is retained. There are a number of implementation details brought into the public access section of integer_t, such as xint::detail::integer_t_data<BOOST_XINT_APARAMS>::data. Those details should be in the private access section. All references to integer_t<BOOST_XINT_APARAMS> in the integer_t class template definition should be replaced with a typedef to improve clarity in the source. Is there another purpose for xint::binary_t than as a type from which to construct an integer_t? If not, there's no need for the converting constructor to be explicit. Why is _make_unique() named as it is? The leading underscore suggests private/implementation detail, but it is clearly public, so the underscore is just odd and confusing. Why is factorial() a member function when abs(), square(), and others are not? If such functions need access to internals, that suggests the interface is deficient, as others writing algorithms will not have the luxury of making their functions members. There is a great deal of duplication among the string-based constructors. Those functions should reuse a common implementation to avoid code bloat. The use of imperative programming to control what should be distinguished at compile time is disturbing. The runtime overhead may be provably insignificant, but I'd like to see that proved and documented. For example, most functions contain "if (Nothrow) { ...} else { ... }". That means there is a great deal of code carried in either case that never executes. The branches also mean there is more opportunity for bugs to creep into one variant and not the other. In an application that only uses integer_t<options::nothrow> instances, there is a lot of code in those else branches that isn't needed. Far better would be for no_throw_integer to derive from integer_t. Changing this aspect, if performance and binary size tests show it to be important, may affect the set of types or interfaces in the library, and must therefore be done before acceptance.
Code like the following appears in most member functions. Each such case should be factored out into a member function:

    if (Threadsafe == true) { r._make_unique(); data.make_unique(); }

is_odd() and similar member functions are unnecessarily complicated. Here's a simplification:

    template<BOOST_XINT_CLASS_APARAMS>
    bool integer_t<BOOST_XINT_APARAMS>::is_odd() const {
        if (Nothrow && is_nan()) return false;
        return data.is_odd();
    }
- What is your evaluation of the documentation?
The documentation is the weakest part of this library. There is no helpful flow introducing the concepts in the library. If one clicks through all of the links on the Main Page, one discovers a good deal, but too much of it comes out of order or simply requires user-directed discovery. For example, one of the early links is to the RSA Encryption example, but it relies on an understanding of many parts of the library not previously explained. To be acceptable, the documentation should have a sequence of topics/pages to read that logically presents the capabilities of the library. The following are my comments regarding the various pages I studied. __________ Main Page The page titles use title case, but the section headings do not. This looks inconsistent when title case links appear within the sentence case sections. _____ Why would I use it? The bullets would be better introduced with these shorter, bold-faced keys: Portable, Fast, Provides needed features, Open source, Closed source and commercial software friendly. That eliminates the redundant "Because it's" and otherwise shortens the key points of each bullet. Bullet 1 is a little verbose. Suggestion: Written in modern C++ with many different operating systems, compilers, and processors in mind, XInt automatically adapts to the most efficient implementation available. Bullet 2 could be a little less chatty. Suggestion: Portability is favored over performance, so there's no assembly language or other non-portable optimizations, yet performance is still high. Bullet 2 should probably link to performance comparison data (which I don't find in the documentation). Bullet 3 deviates from the preceding text by using a staccato style with sentence fragments, and self-describing the documentation as "complete and carefully maintained" is excessive. Suggestion: XInt provides fixed and variable length integer types, modular arithmetic, cryptographically secure prime number generation, bit manipulation, and exception- or error-code-based error reporting, all within a friendly and intuitive interface. XInt's code is thread safe, but that can be disabled when not needed to maximize performance. Bullets 4 and 5 are unnecessary. Those "features" are common to all of Boost code. _____ How do I use it? Para 1 is too chatty. Suggestion: Just include <boost/xint/xint.hpp>, declare a variable of type boost::xint::integer, and use that variable as you would an int, but without worrying about overflowing the range! Refer to A Very Simple Example to see this in action. If you need the more advanced features, refer to the reference section. Why "Stand-Alone Examples" versus "Examples"? There are no other examples listed, so I think the latter is more appropriate. The items linked in Detailed Usage Information should probably include explanatory text. For example, why do I need to know about the "Not-a-Number" value? If I want to understand how thread safety is implemented in XInt, is "The threadsafe Template Parameter" what I should read? The normal approach to Boost documentation is to provide some random access to topical sections and to provide a means for sequential navigation through the documentation. The lack of the latter makes the XInt documentation less consistent and less accessible. The reference section, mentioned in para 1 of "How do I use it?", is not called out in the apparent table of contents at the bottom of the page. If someone skims the Main Page, it would be easy to miss that link. You've done a very nice job of formatting Doxygen's output.
__________ A Very Simple Example Sample program output would be a useful addition. Otherwise, the example is nice. This example should be linked from the first paragraph of the "How do I use it?" section. __________ Fibonacci Numbers While this doesn't illustrate much about the use of xint::integer, it is a nice way to show the value of a variable-length integer type. However, this example is marred by not showing the output. Obviously, it is necessary to know the size of unsigned long on the system producing the output, so the program should report the size and maximum value for unsigned long. __________ RSA Encryption There should be links to the documentation for fixed-length integers and the "secure" option in the introductory text. xopt::secure, in the example, is not linked to its documentation. The example also uses xopt::negative_modulus which isn't mentioned in the introduction or comments, and is not linked to that option's documentation. The first BOOST_STATIC_ASSERT disagrees with the comment. The latter says "at least" while the former uses >. For those unfamiliar with RSA encryption, some explanation of the algorithm, especially as it applies to the computation of SizeT and Bits, would be helpful. It would be better to make the string formal parameters be references to const. The code indentation makes the access sections hard to spot. I found it surprising that the private constructor was implemented in the class definition while all other member functions were not. This was not even apparent at first because the function body was just an extension of the continued initializer list. The lack of any differentiation in naming among formal parameters, data members, and local variables, makes the code a bit harder to follow. While you may choose another approach, I like using a leading underscore for formal parameters and a trailing underscore for data members. That makes the source of each variable easy to spot when it appears in an expression. This was driven home when I saw calculate_blocksize(n) at the end of the public constructor. I looked through the ctor body to find n's declaration and was surprised to find it to be a data member. While on the matter of naming variables, "n," "d," and "e" are not descriptive names for data members. Contrast those names with "blocksize." I'll guess that "n" is for "numerator," "d" is for "denominator," and "e" is for "exponent," but the longer names would remove all doubt. xint::binary_t and to_binary() are used in number_to_binary_string() and binary_string_to_number() with no explanation. In the public constructor, if you test whether str.good() after the extractions into recordedbits, c1, c2, and c3, then there's no need to initialize those variables and you'd be sure the input was well formed. encrypt() and decrypt() use xint::powmod() without explanation. BTW, following the powmod link, I found some issues with that function's documentation. The description references "e" and "m" but the arguments are named "exponent" and "modulus." The complexity description is confusing. What are n-sub-n and n-sub-e? I find the inclusion of a testing argument quite odd. Far better would be for powmod(n, exp, mod) to call powmod_impl(n, exp, mod, false), which leaves room to call an undocumented powmod_impl() with true. What's worse is that, in the Remarks, you backhandedly describe the no_monty argument. (If you keep that remark, s/internally, if/internally if/.) 
generate() uses xint::strong_random_generator without its mention in the introduction. Add to that the use of the callback function, for which main() passes ::callback(), which is not explained, and this lack of description is disturbing. generate() uses invmod() without explanation. I also followed the invmod() link and found the description lacking. No doubt if I understood the mathematics involved, I'd understand the description. As it is, I have no idea how n and modulus are used to produce the result. The last complaint I can levy against this otherwise very nice example, is that you don't call attention to the value of XInt. For those that have implemented such things with built-in types, this would, I'm sure, be plain. However, your example is to illustrate the value of XInt and therefore you should illustrate alternatives or difficulties overcome. __________ Fixed-Length Integers This mentions "the integer_t template." This is the first I've noticed your naming convention. Apparently, you're using "_t" as a suffix to mean a class template. That suffix, in my experience, always denotes "type" and usually "typedef." I've never seen it used to indicate "template." My guess -- not having looked at enough of the documentation to know -- is that you wanted to reserve "integer" as a typedef for some specialization of your integer_t class template. I'd rather you used "basic_integer" as the class template name. There's at least precedent for the "basic_" prefix to indicate the class template that should be specialized to get something useful and basic_string/string exactly parallels basic_integer/integer. s/memory-efficient/memory efficient/ "They also silently discard the upper bits on any value that is too big for them, which can bite the unwary" should be made more prominent. I'd suggest a warning: WARNING: Fixed-length integers silently discard the most significant bits of any values that overflow the allocated memory. _____ Note Para 2 should start with, "By default, when calculating,..." or, "When not using the negative_modulus option with a signed, fixed-length integer...." __________ The Not-a-Number Value Bullet 2 is not quite right, I think: s/tries to throw/would otherwise throw/ Last para should probably be, "If you attempt to use a Not-a-Number in any other way, the result is another Not-a-Number, if the result is a Nothrow integer_t, or a not_a_number exception." If that is not correct or complete, please make it so, but don't leave things at "you will get a special value indicating failure." __________ Zero and "Negative Zero" Para 1, the last two sentences should be merged. Avoid starting sentences with conjunctions. Para 2, needs a little rework. Suggestion: The XInt library has limited support for "negative zero" for use in a planned unlimited-precision floating point library to be built on it...." Para 3, "the sign function" should not all be a link. Only "sign" should be a link. Para 3, s/parameter true/parameter set to true/ __________ Generating Prime Numbers Why isn't this in the Stand-Alone Examples section? This would be helpful before looking at the RSA Encryption example. Indeed, it would have explained the use of strong_random_generator and the callback function in the RSA example. As before, sample output would be helpful. __________ Exceptions and the nothrow_integer type The title of this section is not in title case. Para 1, Sent 2, avoid starting a sentence with a conjunction; you missed the inefficiency concern of exceptions, too. 
Suggestion: Exceptions may make your code harder to follow or write, and can certainly affect efficiency. Para 2 references "nothrow_integer" and "xint::integer", suggesting that the former is in the global namespace. Reading this page gives me the impression that the "Nothrow variants of integer_t" mentioned in the The Not-a-Number Value page are really "nothrow_integer." If so, please be consistent and link to this page from the The Not-a-Number Value page. Para 2 uses that vague "special value indicating failure" which is "most often" NaN. This needs to be documented more specifically and clearly. __________ The on_exception Function And -fno-exceptions Support Para 2, s/record any error/record an error/ There are two features of the library confusingly conflated in one page. Suggestion: "The on_exception Function "The XInt library calls a function whenever it encounters an error that would normally throw an exception. By calling [on_exception], you can customize the behavior in such cases. The installed handler function is called with the file name and line number at which the error occurred, and the exception object that would normally be thrown. If the handler returns, then XInt will deal with the error normally. That means it will throw the exception or call abort() (see [GCC's -fno-exceptions Support] for details). "GCC's -fno-exceptions Support "The XInt library includes support for GCC's -fno-exceptions option. When compiled with that option, a program cannot use try, throw, or catch statements. XInt will call abort() instead of throwing an exception to indicate errors unless a user-supplied handler is installed via [The on_exception Function]. If the handler returns, XInt will call abort() anyway. (That means the handler can be used to report the error before XInt calls abort().) "It is possible to use setjmp()/longjmp() to recover from errors when using this feature, but note that doing so will often result in memory leaks because no destructors or other cleanup code will run. "To enable this feature, define the preprocessor macro, BOOST_XINT_NO_EXCEPTIONS, before including any XInt library header." __________ The XInt Random Number System The names of the various XInt types mentioned on this page should be linked to their documentation. The link titled "the Prime Numbers page" is actually a reference to the Generating Prime Numbers example. __________ The threadsafe Template Parameter This page discusses "threadsafe" and "copy_on_write", so the title is misleading. Para 2 could be simplified. Suggestion: If an application never uses an integer object or its copies outside the thread that created them, it is safe to use in a multithreaded application. __________ Copying and Argument-Passing This is a great description of the subject and provides a proper context for what's in The threadsafe Template Parameter. I suggest that the content of the latter be moved to this page. __________ A Note on Algorithmic Complexity The contents of this page need to be more visible. I'd hate to suggest that you link it from each function description with a complexity expression, but at the very least, the Main Page should probably contain this text since you don't know how early users will encounter such expressions following the other links. (See my comments above regarding confusing complexity expressions, which this text would partly answer.) __________ The Self-Testing Program This should probably be called the "library validation program."
However, the question is whether or when a user would run this program. Certainly there's value in the test program to show how to use the library, provided the code is well commented, but as a validation suite, I'd expect Boost's testing to be sufficient. __________ Rationales This should be called "Rationale." _____ Then why didn't you follow those exactly? What is intended by "those" and "them?" Presumably, this content is meant to answer part of the first question, "Why did you do X in the interface?" It would be far better included in that page. Still, the use of "those" and "them" is confusing. If I understand your intent, this would work better: "Unfortunately, the XInt library was designed before I knew about n1744 and I adapted the library after the fact. This has led to the following inconsistencies: " - There are no mathematical primitive functions for long long, as suggested in n1744, partly because long long is not yet standard C++ but mostly because I don't see the need. The conversion constructors are efficient enough for smaller values that there wouldn't be much noticeable benefit from those functions. " - There is no multiplication scheme with better performance than the "basecase" mentioned in §3.4 of n1692, primarily because of lack of time to research them all. Hopefully, few will use the library with numbers large enough to find the lack of more exotic algorithms overwhelming. I plan to add at least one alternative scheme in the future." _____ Why use copy-on-write? I've heard that causes problems with multi-threaded code. This is a better explanation than you provide in The threadsafe Template Parameter. _____ What's with the nothrow_integer type? Last para is phrased awkwardly. Suggestion: This design means only nothrow_integer code must check for the Not-a-Number value, so any speed penalty from doing so only applies to code using the nothrow_integer type. This also preserves the no-exceptions behavior...." _____ I've got a great idea! Why don't you do X? As phrased, this was probably good while you were developing XInt for Boost inclusion. Now that you're setting up XInt as a Boost library, this is no longer appropriate. The correct response is to submit a Trac ticket requesting the feature, possibly after asking for and discussing the feature on the Boost Users or Developers mailing lists. __________ Revision History I don't see the value of this in a Boost library's documentation. If XInt is accepted, such history will be part of Trac and will be excerpted into the Boost release notes. __________ n/a s/--/—/ throughout __________ Class List Not all classes have a description. Even the exception classes would do well to have a brief description. _____ divide_by_zero It appears you've written this class with a single constructor. I think providing a default constructor and one taking a non-default string can allow for more efficient code. If there are many cases of using the default value in the existing ctor, the compiler will insert string construction code throughout the library and, therefore, the application. With a default ctor that constructs the std::invalid_argument base object with "divide by zero error", the compiler may choose to not inline that ctor so each use will refer to a single implementation of the string construction code. _____ integer_t "This class implements the standard arbitrary-length integer type" suggests std::integer_t rather than boost::xint::integer_t. That is, the use of "standard" is confusing.
The description is incomplete in that it says that "most of [the template] parameters must be specified via helper classes." There are two problems here. First, the declaration shown is "template <...>" so the list of template parameters is hidden. Second, what are the template parameters that are not "specified via helper classes defined in the boost::xint::options namespace?" In the string-based constructor descriptions, s/base-/base /. _____ integer_t(const integer_t<...> & b, bool force_thread_safety = false) The description is "This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts." Doxygen offers no guarantee about the order in which member functions are listed. Indeed, in this case, the constructor that appears above this one is the default constructor. Even were the documentation order as expected, a user that wishes to understand this particular constructor must find the other to which you refer in order to get a proper description. Please duplicate all such descriptions rather than rely on such indirection. The name of the formal parameter is baffling. I often use "that" or "rhs" for copy constructors (though I also use a leading underscore), but even without going in those directions, how about "i?" _____ integer_t(BOOST_XINT_RV_REF(type) b) There is no description for this constructor. (The parameter name comment applies here, too, of course.) _____ integer_t::hex_digits(), is_even(), is_odd(), sign() The description states that, "The nothrow version returns zero instead of throwing." There's no mention of any exception being thrown. Under what circumstances might that happen? Does "the nothrow version" mean when integer_t is instantiated with xint::options::nothrow? It would be better to say so explicitly rather than leave the reader to wonder if "nothrow" might mean an empty throw specification. __________ boost::xint::options Namespace The classes with a brief description starting with "Make" should use "Makes" to be consistent with the rest. (See fixedlength, for example.) _____ negative_absolute This name is odd. I'd have expected "absolute_values" or something of that sort. It looks like you tried to create a family of options all named with the "negative_" prefix. I think negative_handling<option> would be a better way to handle that. _____ negative_exception The brief description is, I think, wrong: s/unsigned/negative/. _____ negative_zero "negative_to_zero" would be more descriptive. _____ nothrow Using this option doesn't disable exceptions from the library so much as disable exceptions when using an instance specified with that option. Therefore, change the brief description to something like, "Disables exceptions." _____ threadsafe The short description breaks the pattern. It should be more like, "Makes an integer type that is safe to use in multiple threads." _____ Detailed Description Why is integer_t's allocator being described in the namespace documentation?
- What is your evaluation of the potential usefulness of the library?
Extremely useful.
- Did you try to use the library? With what compiler? Did you have any problems?
No.
- How much effort did you put into your evaluation?
I spent several hours digging through the documentation and looking through the code.
- Are you knowledgeable about the problem domain?
No. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

(I apologize for getting to this message out of order, but it had a lot of information and I wanted to give it the attention it deserved.) On Thu, 3 Mar 2011 12:33:16 -0500 "Stewart, Robert" <Robert.Stewart@sig.com> wrote:
Thank you for your review. I'll address your comments below.
Unless I'm forgetting something (which is entirely possible, given my mental state over the last few days), no one has provided any concrete suggestions or criticisms about my use of Boost.Move. The copy-on-write stuff is a different story, of course.
I fully agree, since the review has turned up at least one issue that will likely make a big difference to the performance numbers.
Already on the to-do list.
Once the internals are stable, I plan to provide something like that. Until then, it would simply make some improvements impossible without breaking existing client code.
And require time. XInt is still very young.
I tried various forms of that in earlier incarnations. The results varied between aesthetically displeasing and downright ugly. Since I had to_string and to_binary, a set of to_integer functions would have been ideal. But to *what* integer? The type is a template with many options. The best I could come up with would have required client code like this: value = to_integer<integer>(...) Making those constructors eliminated the need to specify a type parameter to a function, allowing this instead: value = integer(...) Why force the person using the library to type both every time?
There are several functions with bool parameters. These are obscure when used at the call site. Better is to use enumerated types. [...]
Noted. If COW survives, I'll do that.
If they were, I'd have to make a lot of other functions in the public interface into friends. A publicly-available, but obviously-for-internal-use-only _data function seems the lesser evil.
Is that possible? I'm pretty sure I tried it, in an attempt to de-uglify the documentation, and the compiler wouldn't accept any form of it. The BOOST_XINT_APARAMS macro, and similar ones, were the best compromise I could come up with. If you're referring to using a #define instead, as I've done with some of the internal template classes, that would cause Doxygen to show useless information.
Changed for the next release.
Like _data, it is meant for use by library functions only, as a less-evil alternative to making all of them friend functions.
For essentially the same reason that I'm using constructors instead of a to_integer function: anyone using them would have to type out the returned integer's type anyway. From the heading for that section of functions: Static Member Functions These are functions that return an integer_t, but don't take one as a parameter. To use them as free functions, you would have to specify the return type anyway, so they're static member functions of the type instead. It's the difference between... value = integer::factorial(...) ...and... value = factorial<integer>(...) The same amount of typing in this case, but I felt that the former syntax was preferable.
I'm sorry, I don't understand. The parts that can be factored out have been, to the BOOST_XINT_RAWINT::from_string functions, where two of the three overloads call the third to do the common work.
The runtime overhead should be nonexistent -- see the next point.
In the source, yes. In the executable code, I expect the compiler will eliminate code that is trivially proven to be unused (via "dead code elimination"). Every compiler I've ever looked at the assembly output from, up to and including the $20 Mix Power C compiler I started learning on in the late eighties, has had that capability; I doubt any compiler could be sold today without it.
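As a tiny, self-contained illustration of that claim (not XInt code; the function and values here are made up): when the branch condition is a compile-time constant for a given instantiation, mainstream optimizers are expected to drop the untaken branch entirely.

    template <bool Nothrow>
    int clamp_or_throw(int value) {
        if (Nothrow) {
            return value < 0 ? 0 : value;      // error-value style
        } else {
            if (value < 0) throw "negative";   // exception style
            return value;
        }
    }

    int main() {
        // Nothrow is the constant true in this instantiation, so an optimizing
        // compiler can remove the throwing branch as dead code.
        return clamp_or_throw<true>(-1);       // returns 0
    }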
The branches also mean there is more opportunity for bugs to creep into one variant and not the other.
That's why I've arranged them the way I have -- so I can see all the code needed for each variant in one place, and not miss updating one because I forgot to look for it.
I had that in previous iterations, and was told that since they were so similar, I had to make them all a single class if I wanted the library to be accepted.
If it were made into a member function, it couldn't be trivially detected as dead code and eliminated in variants that don't use it, and some compilers might not be smart enough to follow the function call. That's a feature, not a bug. :-)
Agreed, for is_even and is_odd. They've been changed. I didn't see any others in a quick scan of the code.
Unfortunately, the majority of concepts are either trivial to any reader with a grade-school education (addition, multiplication, etcetera), or far too complex to provide any sort of tutorial for in a library (like prime number theory). If you can point out any in between, I'll be happy to document them; I'm too close to the subject to necessarily know what's not trivial.
If you'll provide a point-by-point outline of what you'd like to see, I'll be happy to write it.
The following are my comments regarding the various pages I studied. [...]
Thank you. I note that many of these comments are simply stylistic choices, which are as personal as a developer's choice of text editor. I'll comment on a few of your comments below. Please take the others as noted and, if they are clear improvements over what's already there, acted on.
You don't find it because it's not there, deliberately. :-) As I said earlier today, there isn't much point to comparing a child's speed to that of an Olympic gold-medalist.
I spend a great deal of time on the documentation, and not to put too fine a point on it, it's an order of magnitude better than that of even a few accepted Boost libraries (not naming any names, I don't have the time to contribute patches to them so I don't think complaining about them is fair). "Complete and carefully maintained" is, in my biased opinion, accurate, despite the current limitations you've identified (and which I'll be remedying).
Bullets 4 and 5 are unnecessary. Those "features" are common to all of Boost code.
Unnecessary once it's an official Boost library, but until then, sometimes you've got to toot your own horn to show people that you've got one.
[...] You've done a very nice job of formatting Doxygen's output.
Thanks, I've tried to pay careful attention to it. In the RSA example...
The code indentation makes the access sections hard to spot.
I'm not sure what you mean by "the access sections," could you clarify?
Unfortunately, those are the variable names traditionally used in most RSA descriptions, so changing them would make it harder for someone to follow the encryption side of it. I've added comments explaining their purpose, as well as other general improvements to that example, most of which were based on your comments. If you're curious, 'n' is used as the modulus for the public and private keys; 'd' is the decryption (i.e. private) key component; and 'e' is the encryption (i.e. public) component of the key.
While that's true, XInt's nothrow_integer type does nothing about the efficiency problem in the current iteration. It simply operates by catching such exceptions internally. In a later version, if there's enough call for it, I'll change that.
nothrow_integer is an example of a Nothrow variant, but not the only possible example.
That wording comes from Doxygen, when you tell it that a function is an overload via the \overload command.
Sorry, but I'm confused yet again. All of the constructors already include unique descriptions. The \overload command is there only to draw attention to the fact that there are other functions that share that name.
Not-a-number, for instance. And any object can throw an out-of-memory exception. -- Chad Nelson Oak Circle Software, Inc. * * *

On Sat, 5 Mar 2011 13:12:34 -0500 Gordon Woodhull <gordon@woodhull.com> wrote:
You would for the type, except that I'm fairly certain everyone will typedef it to a simple name, or use the predefined xint::integer and xint::nothrow_integer types. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
Hah! I was just planning to search gmane for my post thinking it was lost.
It may be that too much else has precluded a serious look. Anyway, when you remove the excess copying and disable COW to focus on performance tuning, that should reveal a lot about using move semantics. You might even ask Ion Gaztañaga to look over the code at that point.
That is certainly an issue. auto obviates the problem, of course. You could also pass the instance as a non-const reference argument to allow for type deduction.
I prefer to ensure that clients cannot misuse a class even if that means using friendship (which can be constrained with, for example, the Advocate idiom).
I don't know what's in BOOST_XINT_APARAMS, of course, but I cannot imagine a template parameter list hidden behind such a macro would preclude creating a typedef.
I disagree with your assignment of relative evil to those things. Being able to misuse a class is worse than using friendship. However, in this case, it seems that the ability to dissociate an instance's memory is an important feature that will arise in other algorithms and even, in some cases, in client code. If this feature remains necessary -- presumably because of the retention of COW -- then this function should be public, even if it requires care to use reasonably.
They all have the same structure, including try blocks, conditionals, etc. There are certain differences, to be sure, but those can be factored out into more specific overloaded member functions. The result is that they can be implemented using a member function template, as an example.
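A minimal sketch of the member-function-template idea for the string-based constructors (the class and helper names here are illustrative only, not XInt's): the public overloads delegate to a single template that holds the common parsing logic.

    #include <cstddef>
    #include <string>

    class sketch_integer {
    public:
        sketch_integer(const char* s, std::size_t base = 10)        { init(s, base); }
        sketch_integer(const wchar_t* s, std::size_t base = 10)     { init(s, base); }
        sketch_integer(const std::string& s, std::size_t base = 10) { init(s.c_str(), base); }

    private:
        // The common work is written exactly once.
        template <typename CharT>
        void init(const CharT* s, std::size_t base) {
            for (; *s; ++s) {
                // accumulate the digit *s in the given base (omitted)
            }
            (void)base;
        }
    };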
Well, whatever you had before and whatever the reasons for it, the result doesn't strike me as likely to be better. I'm in no position to judge between the two seeing only this version, of course.
I don't think an inline function will thwart dead code elimination. I really hate to see such duplication, but you have to maintain it.
I'd expect you to show how xint::integer can be used in a normal calculation, then lead the reader through some variations of the options and their effect on usage. I'd expect you to motivate reasons for the various tools through more specific examples, even partial ones, leading through the existing examples, of course. The point is that you don't walk a reader through the documentation in steps, introducing features little by little.
If you'll provide a point-by-point outline of what you'd like to see, I'll be happy to write it.
I might manage the time to do so, but I must point out that it is your library and not mine.
This reply discourages my desire to offer further help. The many questions about behavior raised by the reviews and discussion should clearly show that the documentation is not yet complete.
public, protected, and private
Then I suggest not using \overload.
The entire description I saw was, "This is an overloaded..."
The point is that the documentation should not leave such questions open. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

On Sat, 5 Mar 2011 14:15:45 -0500 "Stewart, Robert" <Robert.Stewart@sig.com> wrote:
Sorry again, but I knew if I didn't either make the changes you suggested immediately, or at least enter them into my issue tracker, some of them would get lost. And I kept getting distracted answering other messages... I'd actually been working through your message since a few hours after you sent it.
If he's interested, certainly.
Would it? You'd still have to specify what integer type you want the to_integer function to return, wouldn't you?
You could also pass the instance as a non-const reference argument to allow for type deduction.
Possible, though it would require at least two lines instead of one (creating the integer object, then passing it to the to_integer function). Painful, from an aesthetic point of view.
In recent years, I've come to the conclusion that the Python philosophy (<http://stackoverflow.com/questions/70528/why-are-pythons-private-methods-not-actually-private/70555#70555>) has a point in that respect: regardless of public or private, if someone *really* wants to get to your class's data, he will find a way to do so. So "private" is really just a convention, a warning not to use something, probably because no attempt will be made to preserve compatibility with it in future versions. I definitely agree with making it hard to *accidentally* misuse a class. But trying to make it impossible is probably misguided. Fences and big warning signs around high-voltage power stations are good; hermetically sealed domes are likely overkill.
A lot of opaque Boost.Parameter stuff. Or an ellipsis, when compiling the documentation.
but I cannot imagine a template parameter list hidden behind such a macro would preclude creating a typedef.
I haven't tried it recently, but *can* you typedef a template? Not a concrete class made from a template, but the template itself, with parameters? If so, I'd dearly love to know the syntax. That's what I was trying to find a way to do, and why I finally had to settle on BOOST_XINT_APARAMS.
Then it seems that there's some disagreement in the Boost community, because I added the underscore because I was explicitly told that it *shouldn't* be part of the documented interface.
I've put it on the to-do list, I'll see if I can come up with something.
Certainly. I said that because you know what you'd like to see, and I can only guess at it. Anything I come up with without guidance will obviously be different from what you'd intended, maybe enough that it wouldn't suit your intent at all.
I'm sorry, it wasn't my intent to disparage your contribution. Your suggestions are good, and I've implemented almost all of them, or noted them for later implementation. I simply feel that "complete" is accurate, especially once the improvements are finished. And given the time I spend on it, I believe "carefully maintained" is justified as well.
The many questions about behavior raised by the reviews and discussion should clearly show that the documentation is not yet complete.
There's always room for improvement, even on "complete" documentation. :-)
Ah, okay. Fixed.
Maybe it was the move constructor. In any case, they all have unique descriptions now. Again, thank you for your comments. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
I was thinking of the type on the left hand side, which you weren't showing. Your point is valid. However, what you've named "to_integer" could be named "integer_cast" and it would look quite reasonable: value = integer_cast<integer>(...)
In your example above, value already exists, so you get this: to_integer<integer>(value, ...) Obviously, in the case of creating a new instance, it doesn't fare so well: integer value(...); vs. integer value; value = to_integer<integer>(...); Ultimately, I think the constructor is appropriate, but you might consider the function-style cast, too.
Calling a public member function is not accidentally misusing a class. Casting away constness, or some other Machiavellian tactic, is abuse and there's no need to defend against that.
I'm confused. integer_t<BOOST_XINT_APARAMS> cannot be a class template as it is used as a parameter type and a return type. It is a specialization of a class template, so it is a class. Therefore, you can create a typedef for it.
Now there's a shock!
because I added the underscore because I was explicitly told that it *shouldn't* be part of the documented interface.
If it is a necessary feature, then it must be exposed and documented, even if needed only rarely. Warn users away all you like, of course.
We apparently have a different notion of what "complete" means. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

On Mon, 7 Mar 2011 09:43:46 -0500 "Stewart, Robert" <Robert.Stewart@sig.com> wrote:
That would still require you to specify both integer_cast and the type, which is what I was trying to get away from.
That might be a useful addition in a later version. I'll keep it in mind.
A public member function that starts with an inconvenient underscore, and has code deliberately hiding it from the documentation generator, should be a strong indication that it wasn't meant for general use, even to developers not familiar with that Python function-naming idiom.
xint::integer is a class, but according to the compiler, xint::integer_t<BOOST_XINT_APARAMS> is still a class template, and it won't let me make a typedef of it. When I try adding...

    typedef integer_t<BOOST_XINT_APARAMS> T;

...just below the class, GCC gives me twenty-four errors. The first few complain about invalid template arguments and unknown variables A0, A1, A2, etcetera (the contents of BOOST_XINT_APARAMS), the rest are mostly incomprehensible to a quick glance. If I try to satisfy those first few complaints like this...

    template<BOOST_XINT_CLASS_APARAMS>
    typedef integer_t<BOOST_XINT_APARAMS> T;

...I get "error: template declaration of ‘typedef’", followed by another eleven incomprehensible errors. Am I missing a syntactic trick that would let this work?
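For what it's worth, a minimal sketch of where such a typedef can live (the two-parameter template below is only a stand-in for the real BOOST_XINT_APARAMS machinery): inside the class template body the parameters are in scope, so a member typedef naming the current specialization is legal; at namespace scope, outside the class, they are not, which matches the errors described above.

    template <typename A0, typename A1>   // stand-in for BOOST_XINT_CLASS_APARAMS
    class integer_t {
    public:
        // Legal: names the current specialization.  (Inside the class body the
        // injected class name 'integer_t' alone would also work.)
        typedef integer_t<A0, A1> self_type;

        self_type& operator+=(const self_type& other);  // clearer than repeating
                                                        // integer_t<A0, A1>
    };

    // typedef integer_t<A0, A1> T;  // error: A0 and A1 are not declared here

    int main() {
        typedef integer_t<int, int> concrete;  // a concrete specialization can,
        (void)sizeof(concrete);                // of course, be typedef'd anywhere
        return 0;
    }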
:-)
The functionality is exposed, and documented, via the extra Boolean parameter to the copy constructor.
Apparently so. I'm willing to agree to disagree on it. -- Chad Nelson Oak Circle Software, Inc. * * *

AMDG On 03/03/2011 09:33 AM, Stewart, Robert wrote:
The use of imperative programming to control what should be distinguished at compile time is disturbing. The runtime overhead may be provably insignificant, but I'd like to see that proved and documented. For example, most functions contain "if (Nothrow) { ... } else { ... }". That means there is a great deal of code carried in either case that never executes. The branches also mean there is more opportunity for bugs to creep into one variant and not the other. In an application that only uses integer_t<options::nothrow> instances, there is a lot of code in those else branches that isn't needed. Far better would be for no_throw_integer to derive from integer_t. Changing this aspect, if performance and binary size tests show it to be important, may affect the set of types or interfaces in the library, and must therefore be done before acceptance.
Frankly, I don't see why this is important. I don't really care how the library selects the correct implementation. The interface is fine from a user's point of view. In Christ, Steven Watanabe

Steven Watanabe wrote:
Curious. I thought I listed quite a number of reasons that are important, such as reducing the likelihood of bugs. Note two additional reasons: the current author may not be the only maintainer over the life of the library, and library clients may find themselves stepping through the code when debugging problems. Both make simpler, less redundant code desirable. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

This is just a quick question, not meant to be a full review. The documentation says that the complexity of multiplication is O(n^2). But as far as I know, there are asymptotically better algorithms, such as the Karatsuba method O(n^1.585) or FFT O(n log(n) log(log(n))), etc.: http://en.wikipedia.org/wiki/Multiplication_algorithm Is there any reason that these methods aren't used in XInt? For large numbers, they work much faster than the textbook multiplication, and they can be implemented perfectly portably in C++. Kazuhiro Inaba.
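(For readers who have not seen it, below is a minimal, self-contained illustration of the Karatsuba identity on built-in types; the numbers are arbitrary. Real implementations apply the identity recursively to multi-limb numbers, which is where the asymptotic gain appears.)

    #include <cassert>
    #include <stdint.h>

    int main() {
        const uint64_t B = 1u << 16;                // split point
        uint64_t x = 0x12345678u, y = 0x9abcdef0u;
        uint64_t x1 = x / B, x0 = x % B;            // x = x1*B + x0
        uint64_t y1 = y / B, y0 = y % B;            // y = y1*B + y0

        uint64_t z2 = x1 * y1;                      // three multiplications...
        uint64_t z0 = x0 * y0;
        uint64_t z1 = (x1 + x0) * (y1 + y0) - z2 - z0;

        assert(x * y == z2 * B * B + z1 * B + z0);  // ...instead of four
        return 0;
    }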

On Fri, 04 Mar 2011 02:42:32 +0900 "k.inaba" <kiki@kmonos.net> wrote:
From <http://www.oakcircle.com/xint_docs/r_interface_design_only.html>: It does not, at present, implement a multiplication scheme with better performance than the "basecase" mentioned in section 3.4 of n1692, primarily because I haven't had the time to research all of them yet. I suspect that most people using the library will usually be using numbers below the threshold where the more exotic algorithms are faster, but I plan to add at least one of them in the future, maybe several.
They only become faster for *really* large numbers, much larger than I generally work with -- on the order of tens of thousands of bits, if I remember correctly. For the more usual case of 4Kbit numbers at most, they have noticeably worse performance than the algorithm that's in there now. I've experimented with both Karatsuba and FFT, but after discovering that limitation, I abandoned them to concentrate on getting the rest of the library working right. As mentioned, I'll look at them again in the future. -- Chad Nelson Oak Circle Software, Inc. * * *

On 3/3/2011 10:12 PM, Chad Nelson wrote:
The wikipedia article on Karatsuba's method suggests a switch from the grade-school algorithm to Karatsuba at around 320 to 640 bits: http://en.wikipedia.org/wiki/Karatsuba#Efficiency_analysis I don't consider an implementation of Karatsuba's method or any other asymptotically faster method to be necessary at present, but it might be surprising at what point the asymptotically faster methods are practically faster. - Jeff

On Thu, 03 Mar 2011 22:23:11 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
Curious. That's not my experience, though the implementation I was trying to get working may well have been buggy.
It might indeed. I'll take another look at that when I have the time, though it might be a few weeks by the look of things. -- Chad Nelson Oak Circle Software, Inc. * * *

Hi, This is not a review, just some thoughts. The Big Int library I would like to have in Boost should:

* Separate data representation from algorithms. Algorithms should work on concepts, not on a representation.
* Provide specific data representations that are models of fixed and unlimited Big Int concepts.
* Ensure optional features such as COW add no overhead when the user doesn't require them.
* Make use of move semantics when available, by using Boost.Move.
* Avoid virtual functions, as they are not needed in this domain. Instead the library should use the CRTP, which is well adapted to this kind of abstraction.
* Provide expression templates so the user can get the best performance she can expect, or at least show that it is open to being enhanced with expression templates, with some basic examples.
* Have clear documentation, including a good tutorial, examples, and a reference document with something more than the C++ syntax.
* Prove that the user can build other features on top of it in a non-intrusive and efficient way.
* Not reinvent the wheel.

Best, Vicente

On Mar 3, 2011, at 3:58 PM, Vicente Botet wrote:
There are no virtual functions in integer_t; I think Vicente meant virtual inheritance of integer_t_data. Virtual inheritance adds a little bit of overhead to member access - usually an offset stored in a virtual function table (which wouldn't otherwise be needed here). CRTP avoids this nicely. It is more efficient because the cast involved knows the offset statically, whereas virtual inheritance only knows the offset at runtime. Trying to help bring this back to a technical discussion, Gordon
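A minimal CRTP sketch of the point above (hypothetical names, not XInt's actual classes): the base reaches the derived object through a static_cast whose offset is known at compile time, so no virtual-table lookup is involved.

    #include <cstddef>
    #include <vector>

    template <typename Derived>
    class integer_data_base {
    public:
        std::size_t limb_count() const {
            // Resolved entirely at compile time; no runtime offset lookup.
            return static_cast<const Derived&>(*this).data_.size();
        }
    };

    class my_integer : public integer_data_base<my_integer> {
        friend class integer_data_base<my_integer>;
        std::vector<unsigned> data_;
    public:
        my_integer() : data_(1, 0u) {}
    };

    int main() {
        my_integer i;
        return static_cast<int>(i.limb_count());  // 1
    }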

On Sat, 5 Mar 2011 12:01:57 -0500 Gordon Woodhull <gordon@woodhull.com> wrote:
Ah, I wasn't aware of the overhead. Virtual inheritance was the first thing I saw that took care of the problem, and I didn't consider CRTP at the time I originally wrote that. Thanks for suggesting it. -- Chad Nelson Oak Circle Software, Inc. * * *

Hi, This is my little review of XInt. Thanks for creating this library.
- What is your evaluation of the design? - What is your evaluation of the implementation?
I didn't look in detail at the design nor at the implementation. Reasons are time constraints, and the fact that this is already discussed in much detail on this mailing list, and the fact that I don't know the internals of any integer library I'm using. So I cannot compare.
- What is your evaluation of the documentation?
The documentation is good. It seems complete and the texts are very readable. It contains enough samples to start using it. It contains enough reasoning. For an integer library, which you can normally start to use immediately, it is definitely good enough. Is it generated by Doxygen?
- What is your evaluation of the potential usefulness of the library?
Very useful.
- Did you try to use the library? With what compiler? Did you have any problems?
Yes; neither MSVC Express 2005 nor gcc 4.5.0 on MinGW gave me any problems.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
A few hours.
- Are you knowledgeable about the problem domain?
Yes, from a user perspective. I'm using ttmath's large doubles in Boost.Geometry. Together with Bruno I did write a numeric adaptor (which is probably obsolete now) using CLN and GMP. I have doubts about the inclusion of random generators within XInt. I consider it out of scope. About Boost.Random: the documentation states "But, also as of Boost 1.43, it requires a compiled library", but I don't have this impression; I always use Boost.Random header-only. This just came up while writing this review:
The only competing library that matters, so far as I can see, is GMP.
I don't think this is the case. ttmath is really competing: header-only, templated, etc. I've mentioned ttmath various times on this mailing list (e.g. here <http://lists.boost.org/Archives/boost/2010/02/161943.php>). There is also CLN ( http://www.ginac.de/CLN/ ). I created a little test program to check how easy it is to use, and how fast it is. I'm quite fond of ttmath and I do hope that it is sent for review to Boost at some point, so I compared with that. And, to compare more, I also compared with CLN. My little test program is pasted here: http://codepad.org/3cSn9GkZ It does some operations (*, +, >, =) and nothing more, and it measures the duration. I will not give exact duration numbers because of easy misinterpretation: I tested only two compilers, this is not the full behaviour, etc. etc. My summary is that XInt seems slightly slower than ttmath for me on gcc, and is about the same speed on MSVC 2005. So, for a brand new library written in five months (I'm referring to other mails now), I think the speed is acceptable or good. The CLN comparison I did only on gcc, and CLN turns out to be faster than both ttmath and XInt. So CLN's speed is competing, but CLN is not header-only and it has the wrong license (GPL), so in that sense it is not a real competitor. About my vote, this is difficult. I would really like ttmath to be part of Boost. It has a BSD license. It is header-only. Its floating point numbers are really good and, according to my own test, better (in precision) than either CLN or GMP. But ttmath has never been submitted until now and I don't know if it is still planned. Besides that, it is not uncommon within Boost to offer two similar libraries. xint::integer is one thing, but xdouble::double is missing. Is any floating point support planned? If not, is it not inconvenient to have two completely different libraries for integer and FP? I really need a perfect Boost FP big number library... So I didn't decide about my definitive vote yet. Regards, Barend -- Barend Gehrels http://about.me/barendgehrels

On Sat, 05 Mar 2011 18:06:40 +0100 Barend Gehrels <barend@xs4all.nl> wrote:
This is my little review of XInt. Thanks for creating this library.
And thank you for reviewing it.
Yes, it's Doxygen. Sorry for the lack of attribution, that has already been corrected for the next release.
I've doubts about the inclusion of random generators within XInt. I consider it as out of scope.
It *is* out of scope. But they're only included for convenience, and for use in the examples; they're not actually used in or by the library itself unless the person using it explicitly feeds one of them to it.
From <http://www.boost.org/doc/libs/1_46_0/more/getting_started/unix-variants.html> (and also on the Windows version of the same page): "Boost.Random has a binary component which is only needed if you're using random_device."
Apologies, I didn't join the mailing list until late March of last year, so I never saw that. I wondered why, when I asked for a preliminary review of XInt, someone made a comment about large-integer libraries all showing up at once. I hadn't seen ttmath before. From a quick glance, it looks very nice. I notice several differences between it and XInt, such as:

* It stores numbers on the stack -- good for speed, bad only because the stack might impose an upward boundary on the number of numbers you can store.
* It has a compile-time-set limit on the magnitude of numbers, which could be considered a plus (since logic errors in client code can consume a LOT of memory under XInt before you can shut the program down).
* It has a floating-point component, which XInt presently lacks.
* The stream support isn't quite as complete as XInt's.
* It uses assembly language... good for speed, for a number of reasons, but bad for portability to other hardware platforms without a pure C++ fallback.
* From the FAQ page, it looks like there's no documentation at present.

I didn't delve into the code, so I don't know what "extras" it includes beyond a basic math class. I can't really compare it apples-to-apples to XInt. It has a different focus. Within that focus, I have no doubt that it's better than XInt... just from the prose descriptions, I can tell that it would beat XInt's current implementation in speed handily. But XInt has portability, documentation, and probably "extras" (like primality testing). No clear winner this round, try again after both have had another year to mature. :-)
There is also CLN ( http://www.ginac.de/CLN/ )
Hobbled by GPL licensing, and not header-only, but includes floating-point support and a different set of "extras" (XInt doesn't have things such as complex-number support built in), as well as implementing better large-number multiplication algorithms. In a head-to-head comparison, it probably beats XInt in almost every category. The licensing problem is a killer though.
XInt is only *slightly* slower than a library that uses assembly?! :-D
I see no conflict between them. ttmath is obviously better at some things, XInt at others. Though I think either of them could equal the other with a little more work.
I'd planned on it (as mentioned in the second paragraph here: <http://www.oakcircle.com/xint_docs/zero.html>), either myself or helping a GSoC project aimed that way. But if the author is willing to propose ttmath to Boost, I wouldn't be at all disappointed to concentrate on the integer side of things. That's where my main fascination lies.
So I didn't decide about my definitive vote yet.
Regardless of your final decision, thanks for reviewing it. But of course, I hope you decide in XInt's favor. :-) -- Chad Nelson Oak Circle Software, Inc. * * *

Hi, This is my second part of the review, now including a vote. Chad, thanks for all your answers on the first part. On 5-3-2011 22:26, Chad Nelson wrote:
Sure, we cannot read all mails. I didn't catch all mails about XInt and therefore probably didn't put ttmath into those threads.
That is indeed the case. I didn't write this yesterday, but if you select its template parameter too small, the results fail. That might be prevented (I don't know), but the code as I presented it just gives the wrong results. So in that sense, XInt behaves much better.
That is indeed a drawback.
Good to know.
Yep, I definitely hope that there will be a FP counterpart too, written either by you or with your help, of by the ttmath author, or both of you. But the scope of this review is integer.
And now my final decision: I vote YES for inclusion of XInt into Boost. The doubts I had yesterday were mainly about ttmath, but voting no because of a library which might some day be reviewed, or maybe never, and which is not better in all aspects, now seems senseless to me. I think XInt is a very useful library. The author has indicated that he will solve some issues, such as virtual inheritance -> CRTP, and maybe COW, as described in many other mails. I think the performance, currently acceptable, will profit from that (to some extent). In summary, I think XInt will be a valuable addition to the Boost collection. So thanks again for writing this library and submitting it for review. -- Barend Gehrels http://about.me/barendgehrels

On Sun, 06 Mar 2011 15:22:52 +0100 Barend Gehrels <barend@xs4all.nl> wrote:
This is my second part of the review, now including a vote. Chad, thanks for all your answers on the first part.
My pleasure! [...]
And thank you again for the review, and the comments. -- Chad Nelson Oak Circle Software, Inc. * * *

On 02.03.2011 13:58, Vladimir Prus wrote:

I believe that an arbitrary-precision arithmetic library is very useful. But I don't think that in its current state XInt is appropriate for Boost. Below is the list of features which I believe should be improved. The list is sorted by importance.

1. Parametrize with container type.
Currently the library uses several options like fixedlength, secure, threadsafe, etc. I am sure it would be better to add a template parameter "container" to integer_t. The container will take care of how to store the data. So one can have:
* integer_t<std::vector> (for the generic implementation),
* integer_t<boost::array> (for fixed-length),
* integer_t<vector_fixed_capacity>,
* integer_t<very_secure_vector> (for very secure arithmetic),
* integer_t<copy_on_write_vector> (if one wants the copy-on-write optimisation),
* etc.
This solution requires the container interface to be extended through global functions (see below), but I think that in the end we get a very extensible and easy-to-use integer_t. For example, xint::push_back is used in operator+ and operator- when we need to add a most significant digit:
* for std::vector it does a simple push_back;
* for boost::array it may do nothing (silent overflow behavior) or throw an exception (if we don't want overflow to happen).
xint::trim is used when we need to trim most significant zeros:
* for std::vector it does a resize;
* for boost::array it does nothing.
Another option could be wrapping containers into some entity with an interface suitable for integer_t's needs. One more thought about the "secure" option: there are no standard containers with this option. I think either both standard containers and integer_t should have such an option, or neither should. This is yet another argument for container parametrization. (A rough sketch of this idea follows this message.)

2. Signed zero.
I'm sure that long integer arithmetic should behave as closely to native integer numbers as possible. Native integer types don't have a negative zero. As the documentation mentions, this is done "for use in a planned future unlimited-precision floating point library". Although the only function that behaves differently on negative and positive zero is sign(true), I still believe that this is redundant, and the planned floating-point library should use its own sign bit.

3. Alternative representation of negative numbers.
Currently integer_t supports bit arithmetic, but its results differ from those of native integer types:

    #include <iostream>

    template <typename Integer>
    void f() {
        Integer a = 1;
        Integer b = -1;
        std::cout << a << " " << b << " " << (a ^ b) << std::endl;
    }

f<boost::xint::integer>() prints "1 -1 0", but f<int>() prints "1 -1 -2".

I propose the following solution. Currently integer_t stores a sign bit and value bits (the absolute value). Instead of the sign bit, let integer_t hold a bit "inverse_value" with the following meaning: a true inverse_value means that all "value" bits are inverted, even bits higher than the most significant value bit. For example, say we want to represent -3 (decimal): the binary representation of -3 is "...111...101" (an unlimited number of 1s in the high bits); this corresponds to setting the inverse_value bit to 1 and inverting all stored value bits, giving "...000...010", or simply "10". So:
* for a negative number, inverse_value is set to 1 and the container stores its bitwise NOT;
* for a positive number, inverse_value is set to 0 and the container stores its value as-is.
The implementation of addition and subtraction for this representation is no more complicated than for the current one. But it allows bit arithmetic to work in the same fashion as it works with native integer types.

4. About option naming.
The option name "threadsafe" is misleading, as it implies something about concurrency. Instead it just means disabling copy-on-write, so I think it's better to rename it to something like "no_copy_on_write" or "copy_on_write<false>" or something like that. xint::nothrow_integer really is "xint::integer or nan". Maybe a name that reflects the "integer or nan" representation would be better?
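A minimal sketch of the container-parametrization idea above, taking the container as a plain type parameter for simplicity. The digit type and the push_back/trim overloads here are illustrative assumptions, not code from XInt:

    #include <cstddef>
    #include <vector>
    #include <boost/array.hpp>
    #include <boost/cstdint.hpp>

    namespace xint {

    typedef boost::uint32_t digit_t;             // illustrative digit type

    // Customization point: append a new most-significant digit (a carry).
    template <typename Alloc>
    void push_back(std::vector<digit_t, Alloc>& c, digit_t d) {
        c.push_back(d);                          // dynamic storage simply grows
    }
    template <std::size_t N>
    void push_back(boost::array<digit_t, N>&, digit_t) {
        // fixed storage: silently drop the carry, or throw here if
        // overflow is supposed to be an error
    }

    // Customization point: drop most-significant zero digits.
    template <typename Alloc>
    void trim(std::vector<digit_t, Alloc>& c) {
        while (c.size() > 1 && c.back() == 0) c.pop_back();
    }
    template <std::size_t N>
    void trim(boost::array<digit_t, N>&) {}      // fixed storage: nothing to do

    // The integer itself only knows it holds "a container of digits"; the
    // arithmetic algorithms call push_back/trim and never care how the
    // storage behaves.
    template <typename Container>
    class integer_t {
        Container digits_;
        bool      negative_;
        // operator+, operator-, ... written in terms of the customization
        // points above
    };

    } // namespace xint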

[Summary is given in the response to 3., in the beginning, for those who don't wish to read the entire post.] On 3/2/2011 2:58 AM, Vladimir Prus wrote:
No, XInt in its current incarnation should not be accepted. Yes, it would be great for boost to formally endorse a bigint library. However, it's not like the XInt library will be unavailable for others to use if not accepted into boost. It seems like a lot of people's major reason for accepting XInt is that "we desperately need a bigint library in boost". Huh? Maybe I'm making a straw man argument here, but just because something isn't in boost, doesn't make it unusable (although I concede it makes it, perhaps, slightly less usable). Based on my evaluation below, XInt has quite a lot of changes that I believe it needs to go through before I would consider it acceptable for inclusion in a boost release. These include some breaking interface changes. I would like to see the implementation of said changes, or rationale for rejecting said changes, before I vote for even conditional acceptance. I understand, however, that my contribution to the boost community outside of occasional participation on the developers mailing list is scant, so I wouldn't believe it unfair to weight my vote appropriately, as the review manager sees fit.
4. The general review checklist is provided below:
- What is your evaluation of the design?
As far as the interface of xint::integer_t *specifically* is concerned, it is dominated by the expected constructors and expected operator member functions and operator free functions. There isn't really much to evaluate here. A few of the non-operator functions kind of stick out, though:

- boost::xint::integer_t<>::integer_t( const integer_t< other > & b, bool force_thread_safety = false ): This awkwardly exposes the COW implementation details. If this functionality stays, this constructor should be removed and the _make_unique member function should be documented as part of the public interface.
- is_odd, is_even, sign, hex_digits, is_nan: These are all member functions with no same-named free function... which makes me wonder why these were chosen to be member functions while the likes of getbit, setbit, is_prime, etc. were chosen to be strictly free functions.
- pow2, factorial, nan: Same deal. I don't see a problem with making these free functions and specifying the desired return value as an explicit template parameter. It will look the same but make the interface more uniform:

        typedef integer_t< ... > integer_type;
        integer_type x = integer_type::pow2(n);

  vs

        typedef integer_t< ... > integer_type;
        integer_type x = pow2< integer_type >(n);

- random_by_size, random_prime: These shouldn't be part of the interface of xint at all; more on this below.

The Boost.Parameter interface to the xint::integer_t template is fine, although Boost.Parameter *might* be overkill. I was trying to find in the source code where the template parameters fed to the xint::integer_t template are interpreted, but failed. I can't seem to find where xint::detail::integer_t_data is defined, nor any of the other base templates for xint::integer_t. I would think they'd be in one of the first 5 includes (not including exceptions.hpp and random.hpp) of detail/internals.hpp. This gives me the impression that the implementation, from a source-file perspective, is badly designed, but I might just be dense. In any case, as far as I know (that is, Chad hasn't given me a specific example), XInt doesn't use the parameter deduction feature of Boost.Parameter, so one could consider using a Boost.MPL map to pack template parameters together. This gives a slightly clunkier interface, but it avoids the need for the user to set BOOST_PARAMETER_MAX_ARITY, so there's definitely a tradeoff.

The random number stuff and the cryptography-specific stuff should not be included within XInt. These should be left to other libraries (Boost.Random in the former case). I think the scope of a bigint library is very clearly to provide arithmetic algorithms and data structures on which the algorithms operate. This does not include cryptographic functionality.

As I've noted before, I'm disappointed that the algorithms Chad's used to implement the arithmetic operations are hidden behind the xint::integer_t class. Given the richness of these algorithms, I would consider it a priority to expose them to the public through a generic interface. Much has already been said about this both in the last week and even earlier, and if Chad would like some help in this regard, I would be willing to assist. Part (albeit a small part) of my reason for not accepting XInt is, by Chad's own admission, that he wants the interface to settle down before separating the algorithms out (if I'm misinterpreting your comments, Chad, please correct me), suggesting the interface hasn't been finalized yet.
Also, I agree with Robert Stewart that the integer_t template should be renamed basic_integer, a la basic_string. The fixed-size integers leave something to be desired, but as Chad himself has emphasized, they are second-class citizens. So-called second-class citizens probably should not be included within the library, and if they are, not without huge warnings saying something like "EXPERIMENTAL" or "SUBJECT TO CHANGE". I think a reasonable road map, given Chad's priority to address arbitrarily-sized integers first, is to scrap the fixed-size integers for the moment (let's concentrate on one thing at a time!), and factor out the arithmetic algorithms in such a way that they work on fixed-size integer concepts (including a 2's complement implementation) equally well. At some later point, then, include a policy-based data structure for fixed-size integers that addresses everyone's various desires in how a fixed-size integer class should behave.
- What is your evaluation of the implementation?
Chad already knows my feelings on COW, but to be explicit: I'm not necessarily against use of COW in general, but I have yet to be given a use case where COW would outperform move emulation in theory. And I don't fully trust the benchmarks that Chad has provided comparing COW with move emulation, given his admission of possible errors. For example, I believe after the first set of benchmarks, it was discovered that some COW idioms were still being used internally when COW was turned off and move emulation enabled; I could be remembering wrong, though. In any case, aside from whether COW is "good" or "bad", if one wants COW, it would probably be best built on top of a non-COW basic integer class. It is disturbing to me that the COW infrastructure still exists (and even more disturbing that it is still used internally!) when COW is supposedly disabled. Based on others' reports on the implementation, there appears to be some amount of bloat, with certain pieces of the data structure existing but remaining unused, depending on which policies are enabled. I consider this a weakness of the implementation. Multiplication and square roots (and possibly some other algorithms) could likely be drastically improved from a complexity standpoint. I know the reasons for not doing this are covered in the documentation, but it is still a weakness in the implementation nonetheless.
- What is your evaluation of the documentation?
I have never liked the documentation that Doxygen produces as much as the documentation in most Boost libraries (presumably generated with BoostBook (sp?)), but aside from that, the documentation is incomplete in some places, as I and others have noted already. The "fast" claim on the primary documentation page is unsubstantiated, and, according to Chad, somewhat unsubstantiatable (assuming that's a word). I don't think any claims about performance should be made *at all* without some kind of benchmark or statistics to justify it. Any library will be fast on a sufficiently fast (perhaps idealized) computer. My evaluation of the documentation is the briefest, as I expect it will change at least as much as the interface and implementation.
- What is your evaluation of the potential usefulness of the library?
Oh, don't get me wrong, very useful indeed, and the library does exactly what it says it does. I just disagree with how it effects that.
- Did you try to use the library? With what compiler? Did you have any problems?
No, I did not.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I've probably paged through all of the documentation by this point, accessed several of the source files through the documentation interface, and discussed the various aspects of XInt at great lengths on the boost developers mailing list. So I believe this is a relatively in-depth study.
- Are you knowledgeable about the problem domain?
Strictly as far as the arithmetic algorithms, yes, I think so. As far as the cryptographic applications, not so much. - Jeff

On Sun, 06 Mar 2011 13:44:37 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote: Thank you for the review.
I have to respectfully disagree. One company I work with is willing to use Boost itself, but not libraries from the Boost sandbox. I myself am leery of anything from the sandbox, for interface stability reasons... if upgrading a library may mean that I suddenly have compiler errors in previously stable and working code, and have to spend more time correcting it, I'd prefer to look elsewhere. XInt's existing interface could be frozen at its current state with no problem. Even the removal of the copy-on-write stuff would only result in one function having an unused Boolean parameter. Any additional functionality can easily be added in a way that doesn't break existing client code.
Because n1692, the paper presented to the C++ Standards Committee in 2004 that I based XInt's interface on, has them there. I saw no point in cluttering the interface by adding free functions that just call those, although if that would satisfy this objection, it can easily be done.
I could easily argue either side of that one. When I originally wrote that code, the current method seemed ever-so-slightly more appealing; now I think the way that you're advocating might be. If anyone else is reading this, please weigh in: do you prefer the current placement of these functions as static members, or moving them to free functions? The sole difference is a slight change in the syntax used to call them, they'd be equally efficient either way.
No surprise, the code is pretty confusing. It's through integer_t's base classes, such as nan_functions, in detail/options.hpp.
I can't seem to find where xint::detail::integer_t_data is defined,
Same file, detail/options.hpp.
It's actually the *last* include, the lonely one at the bottom of internals.hpp, because one of its classes needs the get_nan function defined in that file.
integer_t_data should clear that up. If I understood your explanation of what you meant, I don't think the class could accept an allocator in any arbitrary position without it, though I could well be wrong.
One I would by far prefer not to make. Clunky interfaces are what I was trying to get away from with this library.
It is. Of the two random-number classes in the code, one is just a wrapper around a fast Boost.Random class that provides a default value for convenience, the other implements the same abilities as Boost.Random's random_device class in a form that's slightly easier to work with in the examples (Boost.Random couldn't handle that at all under Windows when I originally wrote it). Client code must specify a random number generator for the few (two, I think) functions that can use one, and they work equally well with Boost.Random's classes. I'm not wedded to the default_random_generator. It's no longer needed by any code in the library, and could be deleted without any other changes. But the strong_random_generator is significantly more convenient for the examples, and somewhat more convenient for casual client code use, as it doesn't require a compiled library to use as Boost.Random's does. The random number functions in the library are there to make it easier to generate large random numbers, which of course Boost.Random can't natively support. That's a legitimate mathematical purpose, one that would have to be reimplemented by a significant number of programs using the library if it were removed.
This might be pedantic, but there really *isn't* any cryptographic functionality in the library. There are only mathematical functions. is_prime (or its new name, is_probably_prime) is most useful in the field of cryptography, because huge prime numbers are of particular interest in that field, but prime numbers have a lot of uses outside of cryptography as well. is_prime is no more a cryptographic function than pow or gcd is. If I had to remove functions that are primarily interesting to cryptography, powmod would have to be on the list too, and it's of even more general use than is_prime.
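Chad's point that these are general-purpose math routines can be illustrated with a sketch of the simplest probabilistic primality check, a Fermat test (a weaker cousin of the tests real libraries use). The Integer type and the powmod(base, exponent, modulus) signature are assumptions here, not XInt's documented API:

    // Sketch only: any big-integer type with comparison operators and a
    // powmod(base, exponent, modulus) free function will do.
    template <typename Integer>
    bool fermat_check(const Integer& n, const Integer& witness) {
        // Fermat's little theorem: for prime n and witness not divisible
        // by n, witness^(n-1) == 1 (mod n). A composite n usually fails.
        return powmod(witness, n - 1, n) == 1;
    }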
Thank you. I may take you up on that at some point, or at least ask your opinions on various options, but I think I see how to do it in general.
I may have misspoken and said interface at some point, but my intention was to let the *internals* settle down before providing access to them. The justification is that opening up access to them now would tie my hands, because then I couldn't make changes to them without breaking client code, which is the last thing I ever want to do.
Also, I agree with Robert Stewart that the integer_t template should be renamed basic_integer, a la basic_string.
As do I, and it's already high on the to-do list.
Eh? Only the internals of them are subject to change. The current interface can be preserved without any loss of functionality, and added onto without breaking anything.
I've located two: abs and negate. I believe those were the functions that originally prompted me to use it. I'm sorry that I didn't mention them before; I knew I had a reason for going to all the trouble of using it, but I didn't recall what that reason was until I saw those while working with the code today. Think about it. Both involve changes to the sign bit only, and one only sometimes needs to alter it. It would be foolish to deep-copy a ten-thousand-bit or larger number just to flip the sign bit. Even worse, not to do anything to it at all, in the case of abs and already-positive numbers. They may seem like minor-league players, and completely unworthy of having such a huge influence on the design, but they're used heavily in many low-level parts of the library (where, I might add, it couldn't easily be layered on as an optional afterthought). Addition and subtraction make heavy use of them, and they're integral parts of division, the GCD algorithm, the pow and powmod functions, and if memory serves a lot of other functions (via unary operator-) that I can't easily locate by a search. Move emulation does little for those, because in many cases you're *not* moving the magnitude. You're just passing that same magnitude around with possibly-different signs. I haven't delved into the functions that use them, but as I recall, at least some of them need multiple representations of a number (like the original and absolute values) in different parts. CoW provides a provable benefit there. I haven't yet proven that CoW is justified. I'm not quite to the point I can test it yet, and it may well turn out that it provides only a negligible benefit in real-world uses. But at least in theory, there's a very good reason for it, with or without move emulation.
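As an illustration of the point (not XInt's actual internals): a sign/magnitude integer whose magnitude sits behind a shared pointer makes abs and negate practically free, and only deep-copies when a mutating operation finds the block shared.

    #include <vector>
    #include <boost/shared_ptr.hpp>
    #include <boost/make_shared.hpp>

    // Illustration only. abs() and negate() touch only the sign and never
    // copy the (possibly ten-thousand-bit) magnitude.
    class big_int {
        typedef std::vector<unsigned int> magnitude_t;
        boost::shared_ptr<const magnitude_t> mag_;   // shared, read-only digits
        bool negative_;

        big_int(const boost::shared_ptr<const magnitude_t>& m, bool neg)
            : mag_(m), negative_(neg) {}
    public:
        big_int() : mag_(boost::make_shared<magnitude_t>()), negative_(false) {}

        friend big_int abs(const big_int& x) {
            return big_int(x.mag_, false);           // shares x's digit block
        }
        friend big_int negate(const big_int& x) {
            return big_int(x.mag_, !x.negative_);    // shares x's digit block
        }
        // A mutating operation would deep-copy the magnitude first if
        // mag_ is not unique; that is the copy-on-write step.
    };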
You make it sound like I was trying to hide it. The behavior is clearly explained on the "Copying and Argument-Passing" page of the documentation. It should probably be on the "threadsafe vs. copy_on_write" page too, but it didn't occur to me. The only concrete complaint that people could give me about CoW was that code using it couldn't be made thread-safe, which is true. So I made a way to instruct the library to force all integer objects to have unique storage before they're returned to user code, and made it the default to prevent unpleasant client-code surprises. The internal code is single-threaded, and completely under my control in any case if I find it useful to make it multithreaded someday, so that answered the thread-safety complaint. The only other complaints people had were that they didn't like it, which is not a valid reason to change it, and that move-emulation makes it irrelevant, which it doesn't in this case.
I indicated that it was unused once, early in the review, but I was wrong. It's always used at present.
Yes, though an easily corrected one. There are more important changes to make, but I'll get to them as soon as I can.
As I've said before, I'm not claiming that it's "faster" or "fastest," which are legitimately questionable claims. I'm saying that it's fast, as a way to conversationally lead in to the explanation that portability is more important, while reassuring the reader that speed is still an important design criterion. In any case, thank you for the review, and, though I disagree with many of them, for the comments as well. -- Chad Nelson Oak Circle Software, Inc. * * *

On 3/6/2011 5:01 PM, Peter Dimov wrote:
It *can* be made thread-safe if you use atomic operations (as far as I know; someone correct me if I'm wrong here). However, that likely introduces overhead even when you only need single-threaded usage. If you don't use atomic operations, then you have problems when multiple threads try to fiddle with the reference count, leading to the potential for dangling memory blocks and/or double deletions. COW can introduce a performance hit with every in-place modifying operation, as you have to ensure sole ownership (so it requires a possible branch; you can't count the deep copying, of course, as that would happen anyway without COW). I don't know how many operations in *this* library fit that category, however; enough to justify the existence of a _make_unique member function? I would also expect this overhead to be negligible relative to the usual arithmetic operations once your integers reach a certain size... - Jeff

Jeffrey Lee Hellrung, Jr. wrote:
Well, it doesn't really matter whether it introduces overhead compared to unsafe CoW if we operate under the constraint that the library must be thread safe (which, quite frankly, it must be in this day and age). The competition there is only between CoW with atomic inc/decrements and deep copy. Move, by the way, eliminates an atomic increment and decrement in the same cases in which it would eliminate a deep copy. I probably should note that expression templates require CoW, as far as I can see; if you return an expression from a*b instead of computing the product, the expression needs to keep a and b alive. Having it store copies would defeat the purpose. Storing references wouldn't work well in generic code such as a forwarding function. Even something as simple as auto x = f() + y; would be a subtle bug. I wouldn't worry much about the branch in this context. :-) CoW aside, I do worry about the apparent overtemplatization of the library (possibly as a result of Boost feedback) and the calls that it isn't templatized enough. There are many potential users of an unlimited precision integer library who do not care in the slightest about implementation details and just need an integer class that provides the appropriate operations. For the people who do care, the library should provide free function templates that operate on a well-defined Integer concept which is "operationless" on its own and only stores bits (in chunks of some element_type perhaps). The supplied 'integer' class would be our best effort of implementing this concept in the manner we think is best for the users who want us to make this decision for them. The rest should be able to define their own integer classes and get the operations "for free".
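To spell out the trap Peter describes, here is a tiny hypothetical expression type (not XInt, not Proto) whose nodes store references; the temporary returned by f() is destroyed before anyone can evaluate the expression.

    // Hypothetical types, purely for illustration.
    struct big_int { /* digits... */ };

    struct add_expr {
        const big_int& lhs;   // references into the operands, not copies
        const big_int& rhs;
        add_expr(const big_int& a, const big_int& b) : lhs(a), rhs(b) {}
    };

    inline add_expr operator+(const big_int& a, const big_int& b) {
        return add_expr(a, b);
    }

    big_int f();              // returns a temporary big_int

    void subtle_bug(const big_int& y) {
        auto x = f() + y;     // x.lhs now dangles: f()'s result died at the ';'
        // evaluating x later would read destroyed storage
    }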

Chad Nelson wrote:
If that's the case - and if I were you - I would seriously consider withdrawing the library now. (This is not a "no" vote.) By trying to satisfy all people's wishes, you've made the library very complex, introducing policies and Boost.Parameter use. An architecture such as the one above would have allowed you to counter every "but I need an integer that does X instead" objection with "write your own, using example 2a as a starting point; it's easy" and would have kept the basic integer class clean. If the library enters Boost now, you may never be able to get back to this point, because for backward compatibility you'd be forced to maintain the current design. But the decision is yours and if the choice is between getting the current library into Boost and abandoning it in frustration - because having to throw a few months of work away is not very motivating - then it's obviously better to get it into Boost because it certainly looks good enough.

On 7 Mar 2011, at 06:34, Joel Falcou wrote:
So users of a expression-template based xint would need to know about proto::deep_copy? Also, I would expect these problems to become more serious in C++0x code, because users will probably write expressions like: auto i = x + y; Which would be fine for built-in types, and introduce bugs if x or y was an xint::integer, using proto. Chris

On 07/03/11 11:48, Christopher Jefferson wrote:
auto i = x + y; does what it needs to do: build an expression type. The only problem arises if x and y have a shorter life span than i, which is usually not the case in random code. When does it happen? When someone does something like:

    auto f( X const& x ) -> decltype( x + x*declval<X>() )
    {
        auto that = x + x*X();
        return that;
    }

In this case, the problem arises because X() won't be there outside of f(). Note however that:

    auto f( X const& x ) -> decltype( x + x*x )
    {
        auto that = x + x*x;
        return that;
    }

does work as intended, by passing along the locally constructed AST to build a larger one at the calling site. Now, my guess is that const-ref lifespan propagation still applies in 0x, so doing something along the lines of:

    auto f( X const& x ) -> decltype( x + x*declval<X const>() )
    {
        X const local;
        auto that = x + x*local;
        return that;
    }

will surely work (more or less, untested code). And if not, that is the only case where some deep_copy *may* be needed. All in all, in this particular case, I wonder if a C++0x-enabled Proto won't just be able to have rvalue-ref expression nodes that let it fly properly. Applying 0x logic to 03-based code is not very correct. I'll keep this issue in mind when we look at porting Proto to 0x, and in 0x, Proto will surely be a totally different beast inside.

On 07.03.2011 13:44, Joel Falcou wrote:
Nope, won't work. Lifespan extension extends the life of temporary objects to the life of the concrete reference they're bound to. 'local' is not a temporary, and the concrete reference it is bound to is the argument of the * operator in x*local, which is gone before 'that' is even initialized. The problem with the whole thing is that the lifetime becomes complicated, so it's very hard for a C++ programmer to see if a piece of code is correct. Sebastian

On 3/7/2011 7:44 PM, Joel Falcou wrote:
... and because ...
No, it doesn't. The intermediate temporary nodes will also be stored by reference, and those immediately go out of scope. "that" is already invalid when you try to return it. It's even more invalid (ok, equally invalid) when f returns.
It remains to be seen what, if anything, can be done to make expression templates and "auto" play nicely in C++0x. But I suspect the answer is: nothing. Joel, I'm honored by your enthusiasm for Proto, but in this case, I think it's misplaced. For a simple value type like an infinite precision integer, expression templates are just plain dangerous. I could imagine a design that uses a) the small object optimization for ints below a certain threshold, and b) a ref-counted handle/body idiom for everything else (like xpressive's regex objects), where the body could be either i) a simple big-int terminal or ii) a type-erased expression template that keeps its referenced bodies alive until an evaluation is needed, and so avoid unnecessary allocations for intermediate expressions. It must be done such that adding two big-ints yields a big-int. "auto i = x + y;" simply MUST work, always, though it need not actually do any addition. But I haven't thought it all through, so that could all be bollocks. :-) -- Eric Niebler BoostPro Computing http://www.boostpro.com

On 07/03/11 14:16, Eric Niebler wrote:
Again, the question is what people expect auto to do in this case.
I actually disagree. For me it depends on the underlying representation. If the integer rep itself is static and doesn't incur any allocation, then yeah, let's play with ref-count or NRVO all along. I still think that for a dynamically sized large int, it is basically a subset of the "ET for matrix" problem. And even for statically sized large integers, what about pattern matching for local optimization? Then again, I can be the one dealing bollocks atm.
I agree but in this case, you still use "expression template" in a type erased context.
Yeah, I wonder if the expression template stuff using a recursive variant in the old vault isn't what we need. It is basically that: ET with type erasure that keeps everything proper.

on 07.03.2011 at 13:48 Christopher Jefferson wrote :
So users of a expression-template based xint would need to know about proto::deep_copy?
Also, I would expect these problems to become more serious in C++0x code, because users will probably write expressions like:
auto i = x + y;
Which would be fine for built-in types, and introduce bugs if x or y was an xint::integer, using proto.
Could we prevent this misuse by making the copy constructor of the expression class private (possibly never implementing it)? -- Pavel P.S. if you notice a grammar mistake or weird phrasing in my message please point it out
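A sketch of Pavel's suggestion, using hypothetical types (neither XInt nor Proto): under the C++03/0x rules discussed here, a private copy constructor makes "auto i = x + y;" a compile error while "big_int i = x + y;" keeps working, because the latter only binds a reference to the temporary expression.

    class big_int;

    class add_expr {
        const big_int& lhs_;
        const big_int& rhs_;

        // Private copy operations: client code cannot capture the
        // expression itself, so "auto i = x + y;" fails to compile.
        add_expr(const add_expr& o) : lhs_(o.lhs_), rhs_(o.rhs_) {}
        add_expr& operator=(const add_expr&);   // declared, never used

        // operator+ needs friendship: returning by value copy-initializes
        // from a temporary, which requires access to the copy constructor.
        friend add_expr operator+(const big_int&, const big_int&);
    public:
        add_expr(const big_int& a, const big_int& b) : lhs_(a), rhs_(b) {}
    };

    class big_int {
    public:
        big_int() {}
        big_int(const add_expr&) { /* evaluate the expression into *this */ }
    };

    inline add_expr operator+(const big_int& a, const big_int& b) {
        return add_expr(a, b);
    }

    // usage:
    //   big_int x, y;
    //   big_int ok  = x + y;   // fine: evaluated via big_int(const add_expr&)
    //   auto    bad = x + y;   // error: add_expr's copy constructor is private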

On 8 Mar 2011, at 09:36, Sebastian Redl wrote:
That was my first thought. I'm not particularly knowledgeable about boost::proto, so I don't know how much disabling the copy constructor would limit the functionality. If it didn't create too many problems, it is probably the way to go, possibly with a 'clone' function if that would be useful for people who really want to make a copy, and know what they are doing. Chris

On Mon, 7 Mar 2011 03:01:51 +0200 "Peter Dimov" <pdimov@pdimov.com> wrote:
Because the copy-on-write functionality is entirely internal to the class. Nothing outside it can tell whether it's safe to modify an integer or not, in the presence of multithreading. Even the library itself can't tell unless I add locking for each integer object, which would slow things down and introduce a compiled dependency for little benefit. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
Unless you allow non-const references or iterators to your shared data, code outside can't modify it without you being aware of it, and CoW is transparent (assuming atomic reference counting). You only need your integers to be "as thread safe as an int" (or, equivalently, as a deep copy integer). This means that:
(1) in a function that reads an integer instance you can assume that no other thread writes it;
(2) in a function that writes an integer instance you can assume that no other thread reads or writes it;
(3) in a function that writes an integer no other thread reads the data, because you've just made the data unique, and for it to become non-unique another thread must copy the instance, see (1);
(4) in a function that reads an integer no other thread writes the data, for similar reasons.

On Mon, 7 Mar 2011 18:43:53 +0200 "Peter Dimov" <pdimov@pdimov.com> wrote:
That might be assuming too much, for the moment. I'm not sure of the current status of Boost.Atomic, but I know enough about the topic that I don't trust my ability to implement it myself properly. Once that library is official, I'll be happy to include and use it. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
Chad, This is not true; I made a concrete complaint about why I don't like COW or move-emulation for performance reasons. " Pass by reference is what real programmers who deal with and maintain real code do. Rather than the false dichotomy between COW and move, let's answer the more fundamental question: why either? Just use expression templates. Don't bother to use proto for your expression templates, by the way; it is very easy to do yourself in this simple case. This whole argument misses the point of optimizing code that uses a bigint class. You want to reuse the memory allocated in a bigint variable over and over instead of reallocating every time. Whether you make one unneeded allocation or two is a less important consideration than whether you make one or zero, since 2/1 is a lot smaller than 1/0. If a temporary is generated by the user's code they have already pessimized their code. I am sorry, but you cannot avoid allocation with a return-by-value semantic; only expression templates allow you to reuse the storage on the left hand side of the assignment operator. Return by value is broken whether you use COW or MOVE, so how about we focus on fixing it with a nice expression template based operator syntax using pass by reference, and documentation that tells the user to avoid generating temporaries if they want to write code that will run as fast as possible. If they want to write expressions that generate temporaries it will be *no slower* than the current proposal, but at least they will have the option of writing code that is fast. " Regards, Luke

Chad Nelson wrote:
I don't think that this is enough to prevent surprises. If someone does thread t( f, x ); where x is an integer, the thread would receive a shared copy of x. The user would need to explicitly call the copy constructor with a second argument of true to avoid that. Just use detail::atomic_count until C++0x atomics arrive in numbers. :-)
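For illustration, a minimal sketch of the kind of refcounted digit block Peter is describing, built on boost::detail::atomic_count; the digit_block layout is an assumption, not XInt's actual structure.

    #include <cstddef>
    #include <boost/detail/atomic_count.hpp>

    // Sketch only: a shared digit block whose reference count is atomic,
    // so concurrent copies and destructions cannot corrupt the count.
    struct digit_block {
        boost::detail::atomic_count refs;
        std::size_t length;
        // the digits themselves would follow...
        digit_block() : refs(1), length(0) {}
    };

    inline void add_ref(digit_block* p) { ++p->refs; }

    inline void release(digit_block* p) {
        if (--p->refs == 0)      // the atomic decrement returns the new count
            delete p;
    }

    // A write operation would first check whether refs == 1 and, if not,
    // deep-copy the block before modifying it: the shared_ptr-style
    // discipline Peter outlines in his follow-up.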

On Mon, 7 Mar 2011 04:37:40 +0200 "Peter Dimov" <pdimov@pdimov.com> wrote:
You'd think so, but so long as that thread class takes the integer parameter as either a constant reference or by value, or is the only thread accessing it if it's passed by non-constant reference, that's not the case. The copy constructor will automatically deep-copy the object, unless the client code explicitly tells it to use external copy-on-write -- that's the only time the second argument to the copy constructor would come into play.
Just use detail::atomic_count until C++0x atomics arrive in numbers. :-)
Not sure where I'd find detail::atomic_count. Has Boost.Atomic been accepted? I don't see it in the 1.46 docs. -- Chad Nelson Oak Circle Software, Inc. * * *

On Monday, 7 March 2011 at 02:21 -0500, Chad Nelson wrote:
I'm not sure I understand why you say that t does not get a shared copy. Anyway, what seemed obvious to me with the current implementation is that the block below can either leak or double-destruct x's data. Am I wrong?

    void f (integer) {}   // by value
    ...
    {
        integer x = 42;
        async ( f, x );
        async ( f, x );
    }
    ...

Ivan

On Mon, 07 Mar 2011 09:45:02 +0100 Ivan Le Lann <ivan.lelann@free.fr> wrote:
It depends on the implementation of async. If async takes x as a const reference, then it *would* be shared between them, but presumably they would finish with it before 'f' exits... otherwise, passing a local to them is a client-code design bug. If it's a non-const reference, and both calls make changes to it, that's another client-code bug that the library can't do much about. If async passes them by value, then there's no trouble. Unless you explicitly tell the library to use CoW on external objects, it will deep-copy them, so each call to async will get its own storage. So if the client code is written to avoid bugs, the way that it would have to be with any kind of object, XInt will do the right thing. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
Not sure where I'd find detail::atomic_count.
It's in boost/detail/atomic_count.hpp.
Has Boost.Atomic been accepted? I don't see it in the 1.46 docs.
Not yet; once it's accepted it would probably be a better choice.

On Mon, 7 Mar 2011 19:22:44 +0200 "Peter Dimov" <pdimov@pdimov.com> wrote:
Not sure where I'd find detail::atomic_count.
It's in boost/detail/atomic_count.hpp.
Thanks.
Has Boost.Atomic been accepted? I don't see it in the 1.46 docs.
Not yet; once it's accepted it would probably be a better choice.
Probably so. We'll see how things turn out, testing may well prove that CoW isn't of any benefit. -- Chad Nelson Oak Circle Software, Inc. * * *

On Monday, 7 March 2011 at 04:37 +0200, Peter Dimov wrote:
For the platform I'm currently working on, that would mean arithmetic full of spinlocks... Useless sarcasm aside, isn't that an abuse of atomics? Especially if you can stress the frontend to avoid them. Ivan

On Sun, 6 Mar 2011 17:43:15 -0800 "Simonson, Lucanus J" <lucanus.j.simonson@intel.com> wrote:
Sorry, but that isn't really a complaint about CoW. It's a complaint about pass-by-value, an admitted mistake (which I'm correcting now) which CoW makes cheaper.
I looked into it briefly yesterday (I've used the concept behind it before, but I didn't realize that the name referred to it), and that's almost certainly the way to go in the long run. Apologies if I missed answering your message; I did see it, but some may have slipped through unanswered.
That requires writing at least a minimal memory manager. As I understand it, the managers that come with compiler libraries are optimized for that sort of thing these days, since C++ is geared toward so many small heap-allocated objects. I'd rather let the compiler's library deal with it than write more non-core code for something that the built-in code will almost certainly be able to do better most of the time.
I'm not familiar with a use of expression templates that would allow that. Can you point me to something explaining it? -- Chad Nelson Oak Circle Software, Inc. * * *

On 3/6/2011 9:07 PM, Chad Nelson wrote:
I don't think that's what Luke's talking about. He's saying, if x has some value, say [x0, x1, x2], and you assign some expression to it, you want the result of that expression to directly replace the digits [x0, x1, x2], rather than allocating a new block, computing the result in this new block, and swapping this block in place of the block holding [x0, x1, x2].
Expression templates would allow you to tell the compiler, when it sees x = a + b, to generate code that replaces x's digits with the sum of a's and b's digits, one at a time. Under ideal circumstances, if x already has a large enough block, no new memory needs to be allocated. With no expression templates, with or without COW or move emulation, a temporary result holding the sum of a and b would be created, necessitating an allocation, which would then (at best) be moved into x. google "c++ expression templates" and/or look at Boost.Proto and libraries that use Boost.Proto. - Jeff

On Sun, 06 Mar 2011 22:28:46 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
Okay, I don't see the mechanism, but more on that below.
How would the expression's code know anything about x, or have access to its allocated memory? If the template is operator+, you'd have to pass x to it somehow. If it's not, operator+ would have to allocate some memory for a temporary, wouldn't it?
google "c++ expression templates" and/or look at Boost.Proto and libraries that use Boost.Proto.
I haven't looked at Boost.Proto yet, but I did look at a couple hits from Google on the subject in the last couple days. Maybe I'm missing it, but I didn't see anything that would allow that. -- Chad Nelson Oak Circle Software, Inc. * * *

x just has an operator= taking an integer_expression<X> type which contains the whole a+b abstract syntax tree. From there, there are countless techniques to iterate over the AST, transform it, or evaluate it as code.
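A bare-bones sketch of that mechanism, with hypothetical types rather than XInt's or Proto's: operator+ merely records its operands, and the integer's operator= evaluates the sum digit by digit straight into its own storage, so a sufficiently large left-hand side means no allocation at all.

    #include <cstddef>
    #include <vector>
    #include <boost/cstdint.hpp>

    typedef boost::uint32_t digit;

    class big_int;

    struct add_expr {                 // records the operands of a + b
        const big_int& a;
        const big_int& b;
        add_expr(const big_int& x, const big_int& y) : a(x), b(y) {}
    };

    class big_int {
        std::vector<digit> digits_;
    public:
        big_int() : digits_(1, 0u) {}

        // Evaluate a + b directly into our own storage; if digits_ already
        // has enough capacity, no allocation happens.
        big_int& operator=(const add_expr& e) {
            const std::vector<digit>& x = e.a.digits_;
            const std::vector<digit>& y = e.b.digits_;
            std::size_t n = x.size() > y.size() ? x.size() : y.size();
            digits_.resize(n + 1);                  // reuses existing capacity
            boost::uint64_t carry = 0;
            for (std::size_t i = 0; i < n; ++i) {
                carry += (i < x.size() ? x[i] : 0u);
                carry += (i < y.size() ? y[i] : 0u);
                digits_[i] = static_cast<digit>(carry);   // low 32 bits
                carry >>= 32;
            }
            digits_[n] = static_cast<digit>(carry);
            return *this;
        }
    };

    inline add_expr operator+(const big_int& a, const big_int& b) {
        return add_expr(a, b);
    }

    // usage:  x = a + b;   // no temporary big_int is ever created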

Joel Falcou wrote:
There surely are, but a large integer library doesn't seem like a trivial application. Even simple things like x = a*b + c; x += a*b*c; don't seem entirely straightforward when the goal is to eliminate all temporaries. So rewriting everything to use expression templates (properly) could be quite an amount of work.

Peter Dimov wrote:
How about not so simple?

    x_num = (x11 * dy1 * dx2 - x21 * dy2 * dx1 + y21 * dx1 * dx2 - y11 * dx1 * dx2);
    x_den = (dy1 * dx2 - dy2 * dx1);
    x = x_num / x_den;

I originally wrote it as one line. This is a real use case for big int from my library. The problem we are faced with is we can only eliminate allocation for one temporary in an expression by reusing the lhs of the assignment. A compound expression, like the ones above, requires several temporaries. It is sort of unreasonable to ask the user to rewrite their expressions as one line per operation to avoid temporaries. It defeats the purpose of operator syntax. I suggest that when the expression is evaluated at runtime, storage for temporaries be allocated on the stack. You will want a threshold of say 4K on the size of the temporary in the stack (otherwise allocate on the heap). This optimization would be much easier to implement *if* the algorithms were generic and agnostic to whether the buffer holding the big int value were a member of some integer_x object. Now, taking things just a little further, what if there were a point of customization for each algorithm so that the user could override the algorithms at compile time with their own implementation? That could give us the option to override with gmp's C interface at the low level but get our clever expression templates with stack allocation optimization on top, instead of gmp's C++ interface which allocates for temporaries in compound expressions. (Assuming gmp's C interface would let us manage the memory ourselves; I haven't looked that closely at it since I use the C++ interface.) It is pretty important to me that a boost big int library be better than GMP in at least one way. Better expression templates and usage of C++ to define the C++ interface is the *only* area where a boost library can deliver on that, and that is exactly the area where boost excels. For the algorithms it is fine to provide a default boost-licensed implementation, but also provide hooks to override it with GMP or something else that is SSE accelerated and asymptotically optimal and actively maintained to be up to date with hardware changes. Also, it is just plain better design to decouple the interface from the implementation of the algorithms. Eventually we could standardize such an interface and let compiler authors implement the algorithms. For the gnu stl they could just plug in gmp. Standardization is a good reason not to use proto, since the expression template system would need to be rewritten or proto itself would need to become standardized. Personally, I don't see why this library should have any dependencies at all. Regards, Luke
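To make the stack-temporaries suggestion concrete, here is a rough sketch of one way it might look (hypothetical, not code from XInt or Polygon): an evaluator-local arena with a 4K on-stack buffer and a heap fallback for oversized intermediates.

    #include <cstddef>
    #include <new>
    #include <vector>

    class scratch_arena {
        enum { stack_bytes = 4096 };              // the suggested threshold
        union {                                   // the union forces a usable alignment
            unsigned char bytes[stack_bytes];
            long double   align_;
        } buf_;
        std::size_t        used_;
        std::vector<void*> heap_blocks_;          // overflow allocations
    public:
        scratch_arena() : used_(0) {}
        ~scratch_arena() {
            for (std::size_t i = 0; i < heap_blocks_.size(); ++i)
                ::operator delete(heap_blocks_[i]);
        }

        // Hand out n bytes of scratch space for an intermediate result.
        void* allocate(std::size_t n) {
            n = (n + 15) & ~std::size_t(15);      // keep carved-out blocks aligned
            if (used_ + n <= stack_bytes) {       // fits: no heap traffic at all
                void* p = buf_.bytes + used_;
                used_ += n;
                return p;
            }
            void* p = ::operator new(n);          // too big: fall back to the heap
            heap_blocks_.push_back(p);
            return p;
        }
    };

An expression evaluator would create one of these as a local variable, carve the temporaries of a compound expression (x11*dy1*dx2 and friends) out of it, and write only the final result into the left-hand side of the assignment.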

Hi, I would like to comment only on the issue with Expression Template (ET). On Mon, Mar 7, 2011 at 4:29 PM, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
As XInt is a numerical library and you are its author, you may be more familiar with ET usage in Boost.uBLAS than in Boost.Proto (the latter is very generic). Eliminating temporaries from multiple huge matrix additions and from multiple "big integer" additions seems like a similar problem to me. The uBLAS library also takes advantage of splitting the interface from the internal data structure (of matrix elements), since you can easily think of several internal storage schemes (dense, triangular, sparse, etc.), and many of them, once implemented, can easily be passed to an external linear algebra library (BLAS, LAPACK, etc.). Best regards, -- Ryo IGARASHI, Ph.D. rigarash@gmail.com

On Mon, 7 Mar 2011 17:04:27 +0900 Ryo IGARASHI <rigarash@gmail.com> wrote:
Thanks. I haven't yet had a need to use Boost.uBLAS, so I'm no more familiar with it than Boost.Proto, but I'll certainly check both out. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
On Sun, 6 Mar 2011 17:43:15 -0800 "Simonson, Lucanus J" <lucanus.j.simonson@intel.com> wrote:
The operator+, for example, returns an expression template object that holds references to all the arguments of the original function and type information that it was operator+. If any of the arguments are themselves expressions it holds references to their expression type, not their result. An overloaded operator= on your data type allows you to accept an expression type instead of the result of the expression. In the implementation of operator= you call a member function of the expression template and pass the left hand side of the assignment operator in by reference to use as the storage for the result.

    vector<long long> x1;
    vector<long long> y1;
    initialize_with_values(x1, 1000000); // size of x1 is 1000000
    initialize_with_values(y1, 1000000); // size of y1 is 1000000
    integer_x result;
    integer_x arg1;
    integer_x arg2;
    for(int i = 0; i < 1000000; ++i) {
        arg1 = x1[i];          // allocates only in the first iteration
        arg2 = y1[i];          // allocates only in the first iteration
        result = arg1 * arg2;  // allocates only in the first iteration,
                               // reuses storage in result thereafter
    }

I can do O(N) big int operations with only O(1) allocations using a nice type safe operator syntax with a well designed expression template system. Obviously, allocations would dominate the computation of 128 bit multiplication results, so this is a significant optimization. If someone is just doing O(1) big int operations then they don't care about performance. The common case isn't the one that is written the most times, it is the one that is executed the most times. Regards, Luke

On Mon, 7 Mar 2011 08:46:44 -0800 "Simonson, Lucanus J" <lucanus.j.simonson@intel.com> wrote:
Thanks. I think I see it now, and I'm playing with the concept to see whether it would be a good fit. Early comments from people who've used it suggest that it might not be. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
Ok, who were those people and what did they say about expression templates? I know there were some people that were concerned about compile time. If that is the concern then we have a classic trade off between compile time and run time. When we were having the "how I code" thread recently I listed my priorities in relative order. Compile time didn't even make the list and runtime was #2. Different people have different priorities, but I have never heard of someone who valued compile time over run time. If someone values compile time over conciseness, readability and equational reasoning afforded by operator syntax then they might opt to abandon operators entirely to get fast code that passes storage by reference for the result of every operation into a three argument function. Such people would probably be satisfied with the low level interface you already plan, and won't pay for the operator syntax they don't use. Regards, Luke

Frank Mori Hess wrote:
My monitor is lucky I wasn't drinking coffee when I read that. I meant in C++. Obviously if someone wanted a slow big int they could use java or perl and be quite happy with their compile time.

On Tue, 8 Mar 2011 10:01:13 -0800 "Simonson, Lucanus J" <lucanus.j.simonson@intel.com> wrote:
Eric Niebler: "[...] Joel, I'm honored by your enthusiasm for Proto, but in this case, I think it's misplaced. For a simple value type like an infinite precision integer, expression templates are just plain dangerous." Peter Dimov: "[...] a large integer library doesn't seem like a trivial application. Even simple things [...] don't seem entirely straightforward when the goal is to eliminate all temporaries. So rewriting everything to use expression templates (properly) could be quite an amount of work." Steven Watanabe: "Expression templates are a powerful tool. However, I'm not convinced that they're appropriate for this library. Although they allow some clever optimizations, the extra complexity can cause confusion. This is a fairly basic library which we can expect to be used by people who don't know the language well enough to cope with the problems easily."
[...] Different people have different priorities, but I have never heard of someone who valued compile time over run time. [...]
During the initial development of a program, when I have to compile every few minutes to test things, I certainly do. But the comments I was talking about, listed above, weren't focused on that aspect. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad Nelson wrote:
Perhaps Eric can elaborate on this. Is the usage of expression templates by GMP dangerous? I already said Proto shouldn't be used for the expression templates and, if asked my opinion on how to implement the expression template syntax, I would steer you away from SFINAE or even tag-dispatching-based designs and toward function overloading, because I'm not proposing a big int concept, just an expression template system that handles only the library-provided big int type and expressions thereof. I have designed several expression template systems and I think I know what is dangerous and how to avoid it. I am more than willing to help you. I respect Eric's opinion, and if he knows why a simple expression template system like the one used in valarray or GMP itself is a bad idea, I'm willing to listen.
Yes, it would be quite an amount of work. That, however, is a very poor rationale for not doing the work. You aren't on a schedule (as far as I know) and can take your time to get the interface right. You stand to learn a lot from the experience and I think it would be effort well spent. It would be less work than I did for Polygon's redesigned interface before I brought it for review. I worked on that library for three years before I brought it for review and rewrote the interfaces twice. I have already proposed a fairly simple way to eliminate all temporaries using expression templates without complicated template metaprogramming. No need to create a complex system of rearranging expressions, no trying to implement a compiler that runs at compile time on expressions, just alloca if less than 4K.
No one has ever complained that the expression templates in Polygon make the library hard to use, and I have many users of different experience levels. I have never had a problem with GMP's expression templates. Let me restate that clearly: unlike problems that were easy for me to deal with, GMP's expression templates posed no problems for me at all. I'm not proposing that the user be given the ability to extend the expression template system, so there is no need for them to understand or even be aware of it. My Polygon users often have no idea how the library works, but are quite happy with the productive operator syntax (in many cases). The lower the experience level, generally, the happier they are to ignore the implementation details of the library and just use it to get their job done. Regards, Luke

On 3/6/2011 4:25 PM, Chad Nelson wrote:
The removal of the COW stuff would result in a huge interface change, as it implies a particular usage of xint::integer_t objects. I think whether you're willing to freeze the interface or not should have little bearing on acceptance, in this case. Given the work that still should be done (in my very subjective opinion), I don't think XInt should be accepted as a boost library, and this has the added benefit that you *don't* have to freeze the interface if and when you find better alternatives.
I see. But you understand that their presence as member functions makes one wonder about the asymmetry, right? I don't see why is_odd is any more fundamental than getbit; indeed, getbit would seem to be more fundamental than is_odd or is_even, no? [...]
Found it. [...]
I see now. It doesn't look like you're using the keyword and tag structure outlined here: http://www.boost.org/doc/libs/1_46_0/libs/parameter/doc/html/index.html#para... Could you explain why not? Using Boost.Parameter-style keywords and tags might obviate the need for most of the parameter deduction infrastructure you have at line 220 in options.hpp.
I only mentioned it as an option ;) I don't particularly prefer one or the other.
Are these localized to the examples, then, and hence not included by default?
Perhaps you should suggest adding strong_random_generator to Boost.Random library, then.
Okay, I understand. Do you basically just need to fill a range of digits with random numbers? Perhaps this is something that Boost.Random can make easier (although it isn't too hard at present).
Good point. For whatever reason, to me, is_prime just sticks out as a function that's more appropriate in an example, given its specific use, but consider others' comments as well. [...]
Okay, I may have misread or misremembered. Apologies. [...]
I guess I kind of expected a breaking change if you added a 2's-complement-based implementation, which would seem reasonable. I haven't looked in detail at the fixed integer stuff anyway, as I gathered it wasn't a focus of the library. Maybe that suggested to me that the interface was still in some amount of flux. Perhaps the fixed integer stuff *should* actually be separated into a different submodule altogether? I don't know, I seem to remember that's how you had it originally, but there were arguments in favor of merging everything into one integer_t class...?
Great example. Abs and negate can just return appropriate views layered on top of the object at *less* cost than COW. So, yes, it would be foolish to deep-copy if you're never going to change either the copy or the original, but deep-copying in the absence of COW is far from necessary. [...]
I obviously didn't read this closely enough, hence the surprise.
It's not thread-safe without atomic operations (maybe adds overhead in single-threaded environments?); it adds space and time overhead (probably negligible except on small integers); it complicates the implementation (subjective); as far as I've seen, it is *still* unnecessary, in the sense that it doesn't give any real performance benefits; and it encourages a style of programming incompatible with the standard containers (entirely subjective whether that's good or bad, I admit). Further, even if COW has some benefits over a non-COW implementation, it should probably be built as another layer on top of a non-COW implementation, as you can't go the other way around. COW shouldn't be used at the lowest implementation level. [...]
Simply put, it is a vacuous claim without further justification. - Jeff

On Sun, 06 Mar 2011 18:35:47 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
It might be a change to how people use it, but that isn't a change to the interface. Code that relies on the copy-on-write behavior may not work as fast without it, but it *would* still compile and run with no problem. That's an acceptable cost, in my opinion.
It's only fundamental because that paper specified it. I'm not particularly attached to that decision; my only purpose in using that interface was to have drop-in compatibility with any code that might have already been written to use it. I have yet to hear of any such code.
As I recall, I tried that. I don't remember why I decided to change it, just that there was a compelling reason. I believe it had something to do with the fact that most of the parameters are simple Booleans -- their mere existence conveys all the information they need; forcing someone to specify true or false as well seemed pretty inelegant when there was an easy way to avoid it.
It might, but why would I want to? That's a powerful feature, and requiring a slight change to a few pieces of code that might use a different BOOST_PARAMETER_MAX_ARITY setting is a small price to pay for it.
It can and probably should be. It isn't at present, because they used to be needed as defaults and I hadn't thought to remove them yet. I'll put it on the to-do list.
It already has random_device for that purpose. I understand why Steve Watanabe chose to implement it in a compiled library instead of the header-only one, but it makes it harder to use it in another library's example code.
That, setting the highest requested bit (often desired to get the exact size the developer wants), and clearing any bits above that, are the most important parts. Fiddly bit twiddling that you or I might be comfortable with, but most casual developers aren't.
Perhaps this is something that Boost.Random can make easier (although it isn't too hard at present).
It could possibly be modified to help with the first step, not really with the rest unless Steve wants to add XInt-specific functions.
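For reference, a minimal sketch of the bit fiddling being described, using a plain built-in unsigned integer rather than an XInt integer (the helper name is made up purely for illustration):

#include <boost/cstdint.hpp>

// Sketch only: given a raw random value, clear everything above the requested
// size and set the highest requested bit, so the result is exactly `bits` bits
// wide.  Assumes 1 <= bits <= 64.
boost::uint64_t force_exact_bits(boost::uint64_t raw, unsigned bits) {
    boost::uint64_t mask = (bits == 64) ? ~boost::uint64_t(0)
                                        : (boost::uint64_t(1) << bits) - 1;
    raw &= mask;                                // clear any bits above the requested size
    raw |= boost::uint64_t(1) << (bits - 1);    // set the highest requested bit
    return raw;
}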
As I envision it, two's-complement would be an added option. The interface wouldn't need to change at all for existing code.
Exactly.
If you go back and look at some of the earlier versions (I think you can get them through the sandbox, otherwise I can provide them from my own repository), you'll see that I've tried that too. I don't remember why I abandoned the idea, but I do recall that the results weren't pretty, which would certainly have contributed to the decision.
If you can show me one place where the existing design isn't thread-safe when used with the threadsafe option, I'll eat my keyboard, without salt. I put a lot of careful thought into it, and barring an implementation bug like me forgetting to call _make_unique in a function I should, I'm willing to swear that it's bulletproof.
it adds space and time overhead (probably negligible except on small integers);
True enough.
it complicates the implementation (subjective);
True, but so long as I (or other developers hacking around on XInt, rather than just using it) are the only ones dealing with that, I don't think that should matter.
as far as I've seen, it is *still* unnecessary, in the sense that it doesn't give any real performance benefits;
That's still undecided.
If you count std::string as a container, then it's not incompatible with all implementations of them. :-)
Possibly. If CoW survives the tests after I've finished updating the code, we can debate that.
I've just realized why I keep butting heads with people here over that line. Most of you guys probably work for companies that are big enough that you never have to step outside of coding, or you're employed in another field and program as a hobby. But for many years now I've worked for a (very) small software company. We've never had more than three people on the payroll at once, and most of the odd jobs fall to me -- including marketing. It didn't come naturally to me; I was dragged into learning it kicking and screaming, and I don't claim to be particularly good at it, but I've done it long enough that it's second nature now.

Before you say that marketing doesn't belong in the documentation for a Boost library, step back and think about it. When a developer needs a library, what does he do? If he's like most, he has three steps:

* He compiles a list of the libraries that sound like they might do what he's looking for.
* If there are more than a handful, he then glances at each one quickly and throws out the ones that obviously aren't suitable. (This might be combined with the first step.)
* He settles down to seriously evaluate the handful that survived that winnowing.

It's essentially the same as hiring people for a job. If you're a business owner and you advertise a position, you collect resumes. If you get more than a handful, you quickly glance at each one looking for excuses to throw it away -- this guy doesn't know Java, that one claims BASIC as his sole development qualification, etcetera -- getting ever pickier, until you're down to a manageable level. Then you interview those that survived. Candidates will stand or fall on their own merits in the third step; the trick is surviving to get that far.

That whole introductory section is aimed at doing just that. The keywords in it will catch the attention of search engines, which will then be more likely to mention it to people searching for such libraries. The first thing those people will do is glance through the index page of the documentation, which will immediately show them what the library is intended for, and hopefully put it on their list. During the winnowing, it tells them everything they need at a glance; if it's not suitable, they can quickly drop it, but if it is, it gives them enough information to keep it in the running. The rest of the documentation is for the third step.

My goals are to make the library both portable and fast, with portability taking precedence where there's a conflict. That specific line is a non-technical way to convey that goal to my target audience: developers looking for a quick and easy way to do large-integer calculations in a C++ program. Mostly casual developers, because that's the vast majority of developers out there, but also professionals like myself who need it for commercial projects.

Call it fluff if you'd like, but as much as we might wish otherwise, marketing is necessary. Especially in a field like XInt's, where there are tons of low-quality hobby libraries and only a few free, high-quality ones. I want it to be used, and with that section it has a fighting chance. -- Chad Nelson Oak Circle Software, Inc. * * *

2011/3/7 Chad Nelson <chad.thecomfychair@gmail.com>:
Nope, I just looked at the paper, they are free standing functions there. And they are named without the is_ prefix:

const int sign( const integer & );
const bool even( const integer & );
const bool odd( const integer & );
const bool getbit( const integer &, unsigned int );
...

BTW there is *no* nan and is_nan in this paper. Please be more precise. Regards, Joachim -- Interval Container Library [Boost.Icl] http://www.joachim-faulhaber.de

On Tue, 8 Mar 2011 22:57:19 +0100 Joachim Faulhaber <afojgo@googlemail.com> wrote:
<sigh> My statement was true when I originally wrote the code. I didn't realize that the later changes took it so far away from the original design. Thanks for the correction. -- Chad Nelson Oak Circle Software, Inc. * * *

2011/3/9 Chad Nelson <chad.thecomfychair@gmail.com>:
you are rehabilitated ;-) There are 3 papers N1692, N2020, N2143 on the topic and in the *latest* paper http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2143.pdf the functions is_odd(), is_even(), sign() are member functions just as in your implementation! So you adapted to the upcoming standard here. It would be better for you to add a reference to the latest paper to your docs and refer to it next time :) Cheers, Joachim -- Interval Container Library [Boost.Icl] http://www.joachim-faulhaber.de

On Wed, 9 Mar 2011 17:18:10 +0100 Joachim Faulhaber <afojgo@googlemail.com> wrote:
That thump that you just heard was the sound of my jaw hitting the ground. I didn't know about the later papers, so I can't claim any credit for that. (I can't download either of the new-to-me ones at the moment either, apparently due to a server glitch.) If it weren't for the date on the page that lists them, I might think that Mr. Kronenburg took inspiration from XInt! :-)
It would be better for you to add a reference to the latest paper to your docs and refer to it next time :)
I've already added it to the docs. Thanks for letting me know about them. -- Chad Nelson Oak Circle Software, Inc. * * *

2011/3/9 Chad Nelson <chad.thecomfychair@gmail.com>:
I would really like to confirm that, but mail to M. Kronenburg is not delivered to the mail address attached to his papers. Does someone know a working e-mail address of him? Thanks, Joachim -- Interval Container Library [Boost.Icl] http://www.joachim-faulhaber.de

On 3/6/2011 4:25 PM, Chad Nelson wrote: [...]
Another concern with COW in general is iterator invalidation. Either your iterator gets "fatter" and slower (e.g., by being implemented as a pointer to the object + offset), or you expand the class of operations that invalidate iterators beyond just resizing operations to *any* modification operation (compared to a class which does not use COW). I don't know if this will be a significant concern if (or once) xint::integer_t gets some kind of proxy objects into its internals (e.g., iterators), but...all things being equal, I prefer "slim" iterators and fewer iterator invalidation operations. Also, some of us have probably been "poisoned" by Herb Sutter: http://www.gotw.ca/publications/optimizations.htm If I'm reading the summary of his test harness correctly, fully 1/3 of the copies were never modified, which one might expect to give COW a huge advantage. Turns out, it only gave a 5 - 10% speed up over non-COW, and the thread-safe COW (all 3 implementations) netted a performance penalty. Besides, even if it turns out that COW gives some kind of benefit in some use cases, I believe it should be built on top of a non-COW base integer class. You can always add a COW layer on top of a base class, but you can't remove it. - Jeff

On Sun, 06 Mar 2011 22:17:55 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
The iterator would have to hold a reference to the CoW object, to keep it from being changed or deallocated while the iterator is supposed to be valid, but other than that, the implementation that I see in my head wouldn't need to be any different, or be invalidated any more than without. Holding that reference ensures that nothing else will attempt to change the magnitude in any way. And if it's a non-const iterator, it would point to a unique instance anyway, ensured by whatever function provided the iterator.
A 5 - 10% speed improvement is nothing to sneeze at. And I'm not using thread-safe CoW, my design model doesn't need it.
A distinct possibility, which we can debate if testing proves that CoW is still beneficial. -- Chad Nelson Oak Circle Software, Inc. * * *

on 07.03.2011 at 9:17 Jeffrey Lee Hellrung, Jr. wrote :
Had the COW been atomically ref-counted, you would NEVER notice it's there. COW is entirely an internal detail which is not shown through the interface (not counting the ensure_detached feature), nor through the iterator interface. Think of it: if an operation requires reallocation of the storage, then the iterators are invalidated anyway. OTOH, if an operation does not need a reallocation, a COW object is left referring to the same data and the iterators are still valid. Moreover, COW allows a plain user to treat objects of such a class as plain values and not worry whether making some unmodified copies is somewhat expensive (which is actually relatively cheap with COW). This is VERY convenient and lets one concentrate on, e.g., the algorithm rather than on the subtle details of whether to make a reference to an object or just copy it. -- Pavel P.S. If you notice a grammar mistake or weird phrasing in my message, please point it out.

Jeffrey Lee Hellrung, Jr. wrote:
Another concern with COW in general is iterator invalidation.
This is the only concern. The moment you allow non-const iterators into your potentially shared representation, you have to think very carefully about specifying when these are invalidated (the correct answer is "on any operation"). std::string has this problem; since CoW is deemed an implementation detail there, obtaining a non-const iterator or reference into a string disables sharing for this string forever (because you can't track the lifetime of the iterator or reference).

Chad Nelson wrote:
That hasn't been accepted into the standard, so it can be improved. You might be the one to influence its improvement.
I'd rather algorithms be separate. Even if you find that an efficiency problem, it should merely be seen as a catalyst to expand the integer_t interface to regain the lost efficiency.
You can do the same, I should think, without COW. All you need is to make your algorithms less naive. The algorithms can manage the sign in a separate variable and refer to an existing magnitude or make a copy, as needed, unless I'm much mistaken. That is a little less convenient, but should obviate COW. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

On Mon, 7 Mar 2011 09:08:33 -0500 "Stewart, Robert" <Robert.Stewart@sig.com> wrote: [...]
That's three votes in favor of changing it, counting my own altered opinion. I'll put it on the list.
Certainly it could be done without CoW. In an earlier iteration, I had a special subclass for that specific purpose, which kept a reference to an existing integer and a separate negate-the-sign Boolean. But if you work out the implementation, you'll see that it's essentially a poor-man's version of CoW anyway. Arguably less efficient, and definitely less aesthetically appealing, so I replaced it with a true CoW implementation. -- Chad Nelson Oak Circle Software, Inc. * * *
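For illustration, a minimal sketch of that kind of non-owning "reference plus separate sign" helper, with purely hypothetical types (this is not the actual XInt subclass, just the general idea under discussion):

#include <vector>

// Hypothetical: treat a magnitude as nothing more than a vector of digits.
typedef std::vector<unsigned int> magnitude_t;

// A lightweight view that refers to an existing magnitude and carries its own
// sign, so negate()/abs() need not copy any digits.  The caller must keep the
// referenced magnitude alive for as long as the view is used.
struct signed_view {
    const magnitude_t* digits;  // not owned
    bool negative;

    signed_view(const magnitude_t& m, bool neg) : digits(&m), negative(neg) {}
    signed_view negated() const { return signed_view(*digits, !negative); }
    signed_view absolute() const { return signed_view(*digits, false); }
};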

Chad Nelson wrote:
The difference, as I see it, is that what I suggested is only used when it is needed and there is no reference counting involved. With COW, you force the reference counting on all internals code. There may be something about the algorithms I don't know, particularly since I haven't spent any time studying their code, but I think this will obviate the need for COW in the internals. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

On Mon, 7 Mar 2011 11:44:21 -0500 "Stewart, Robert" <Robert.Stewart@sig.com> wrote:
I still haven't proven that CoW will remain beneficial once I've made some of the suggested changes. If it is, I'm not sure that removing it would be a good idea. If not, I'll likely dig up that earlier subclass. -- Chad Nelson Oak Circle Software, Inc. * * *

On Sun, Mar 6, 2011 at 16:25, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
"Guideline: Where possible, prefer writing functions as nonmember nonfriends." ~ http://www.gotw.ca/gotw/084.htm Another way of looking at it: The more (interesting) functions you can implement as nonmember nonfriends, the harder it is for people to claim that they can't implement their own pet function with the interface you provided.

Hello All,

Here is my formal review of Boost.Xint.

DISCLAIMER
--------------------------------------------------------------------
Chad Nelson is the review manager of my Boost.Locale library, which is going to be reviewed in April. So my judgment may be affected by this fact, as I see Chad Nelson in exactly the same position that I will be in very soon.
--------------------------------------------------------------------
Please explicitly state in your review whether the library should be accepted
YES. However, I expect that the algorithms' performance will be improved ASAP.
- What is your evaluation of the design?
Pros:
-----
Even though I do not really like heavily "templatized" libraries, I like the overall interface design. Simple and quite straightforward. After actually looking at the library in depth, I think that having COW and (default) threadsafe options is a brilliant idea. It gives **very good** control over what you actually need, allowing you to use COW integers where performance is critical and normal ones elsewhere. I found it very convenient, configurable and powerful. All important operations are implemented, with a quite intuitive interface (after reading the examples... documentation notes are given below).

Cons/Problems:
--------------
Points I think are wrong:

- Move semantics: they are not clear, and not documented!

boost::xint::integer a, b;
a = 100;
b = boost::move(a);                       // expected to act like swap(a,b), but acts like a = b
std::cout << a << " " << b << std::endl;  // expect 0 100, get 100 100

An integer is a container and should behave like a container in a move context, and like a primitive type - mostly for optimization.

- Copy between different types (bug or feature?):

integer<opt::secure> a;
integer<opt::threadsafe> b; a = b;  // works
integer<opt::threadsafe> b(a);      // works
integer<opt::threadsafe> b = a;     // Does not!

It seems that the copy constructor is explicit without any reason.
- What is your evaluation of the implementation?
I can't judge the code, as I am not so familiar with such complicated meta-programming, but I have the following notes:

Style:
------
1. The coding is too dense. Complex code requires many more comments, especially when it is not trivial.
2. There are lots of template classes and subclasses and other relations. I think the internals should be documented, otherwise in a year or two nobody will understand what is going on here. So a good white paper about the internals should be written.

Random number generation under Linux
-------------------------------------
It is just a bug... I had this one in my own code once, so it is not a negative point, just something small to fix. The random generator under Linux/Unix is opened as rng("/dev/urandom", std::ios::binary):

- By default lots of bytes are going to be read due to buffering - very inefficient. Reading 32 bytes would actually read about 4k because of the buffering, which has a huge impact on performance.
- Set an empty buffer on the stream so it does not do caching, or use the open(2) and read(2) calls directly (see the sketch below).
- There is no need for ios::binary (but it does not hurt) - under Unix all streams are the same.

Performance:
------------
This is the weakest point of this library; the performance of too many algorithms is very low. I compared it with the GMP/gmpxx interface and found the following:

Run Time
-----------------------------------------------
op       xint        gmpxx       x?
-----------------------------------------------
*        13.6 us     1.4 us      *9.7
/        52.8 us     1.6 us      *33
+        0.9 us      0.51 us     *1.76
powmod   165 ms      4 ms        *41
sqrt     520 us      2.3 us      *226

I can accept a difference of x2-x5, because GMP is heavily optimized (and written in C, and I have seen some code that a C compiler optimizes much better than C++). Some algorithms should be rewritten with the best known solutions. Nobody expects them to be as fast as heavily optimized tools written in assembly, but they should be comparable in big-O terms, otherwise users would prefer to wrap GMP with something else and use it. Finally, with all the COW/thread-safety/atomics, the actual algorithms were "forgotten".
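A rough sketch of the unbuffered approach described above -- my own illustration using open(2)/read(2) directly, not XInt's actual code:

#include <fcntl.h>
#include <unistd.h>
#include <cstddef>
#include <vector>

// Read exactly n random bytes from /dev/urandom without any stream buffering,
// so only the requested bytes are consumed from the device.
std::vector<unsigned char> read_urandom(std::size_t n) {
    std::vector<unsigned char> bytes(n);
    int fd = ::open("/dev/urandom", O_RDONLY);
    if (fd >= 0) {
        std::size_t got = 0;
        while (got < n) {
            ssize_t r = ::read(fd, &bytes[got], n - got);
            if (r <= 0) break;   // error: real code should report this
            got += static_cast<std::size_t>(r);
        }
        ::close(fd);
    }
    return bytes;
}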
- What is your evaluation of the documentation?
Generally it was fine, but not good enough. The documentation mostly missed the following parts:

1. It was not clear how to use options. Clear examples are required for each significant option.
2. It is not always clear what the default options are.
3. When dealing with cryptography, use of so-called secure memory is required, thus there must be a very clear example of how to use allocators with the xint class. It is very unclear what should be done. I got an answer by asking directly, but I think the documentation requires an explicit example of defining a new allocator with xint, for example one based on the VirtualAlloc and VirtualLock APIs.
- What is your evaluation of the potential usefulness of the library?
Very, very useful.
I had written several small benchmarks against GMP and tried some small examples with gcc 4.3 and 4.5. I hadn't seen any specific issues.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I've spent several hours reading the docs and interfaces and trying some code. I've read the parts I can actually understand, as this level of template meta-programming is a bit too high for me.
- Are you knowledgeable about the problem domain?
I have some basic knowledge of cryptography and am familiar with cryptographic algorithms and tools; I don't have deep knowledge of big-number arithmetic.

Additional notes and information:
---------------------------------
I compared it with the GNU GMP gmpxx interface:

- xint is much easier to use.
- xint is much easier to configure in security terms.
- xint has much more convenient interfaces than gmpxx when talking about more complex tasks.
- xint has significantly lower performance in too many critical algorithms, like sqrt and powmod - this should be fixed, as the difference approaches an order of magnitude.

Bottom Line and Vote:
---------------------
YES, it should be included in Boost.

However I have a request (almost a condition):
----------------------------------------------
The first problem that should be solved is the algorithms' complexities; I don't think it is anything too hard. Reading a few books and papers would give a straightforward direction to the solution. Nobody cares how beautiful the template metaprogramming is if a simple wrapper of GMP outperforms it by an order of magnitude!

It feels to me that most of the "buzz" around XInt had to do with various requests from 1001 users, each of whom thinks it can be done better their way. In the end this was good and created a very good interface, but the algorithms were forgotten, and when they are slow, "to COW or not to COW" does not really matter. Improve them! This is the most important part!

Why do I vote "yes" even though I think it is slow:

a) The interfaces are very good and useful; no more changes are really needed from the user's point of view.
b) The problem is solvable and is solely internal to xint, so it should not prevent merging it into the Boost tree.
c) I have seen much more critical issues in live Boost libraries, so I think this issue is not a showstopper for inclusion. (Anyway, nobody reviews the actual algorithm implementations!)

Additional Recommendations:
---------------------------
- Change the move semantics to be the same as the containers have.
- The documentation should be improved with many more examples.

Artyom Beilis.

On 7 Mar 2011, at 21:06, Artyom wrote:
While it will take a while for people to get used to move semantics, in general you shouldn't assume anything about a moved-from value. In particular if a "small-value" optimisation is added, I would expect to see the value sometimes change, and sometimes not, for both xint::integer and std::string. Chris

On Mon, Mar 7, 2011 at 13:06, Artyom <artyomtnk@yahoo.com> wrote:
I think that fact should be hidden as much as possible. What do you get from int a = 100; int b = boost::move(a);? I'd assume 100, 100, so that result from xint is a feature, not a bug. (Though obviously relying on that fact would be a terrible idea.) ~ Scott

On Mon, Mar 7, 2011 at 21:55, Artyom <artyomtnk@yahoo.com> wrote:
Actually, it's because performance matters that I'd expect it for 100. For such a small number (below 30 bits) I'd expect SBO, in which case the 100 100 still makes sense. I agree that for pow(100, 100), it shouldn't be 100...00 100...00, though :)
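A tiny illustration of why relying on a moved-from value is risky, using std::string rather than xint::integer (my own example; it assumes a C++0x compiler for std::move):

#include <iostream>
#include <string>
#include <utility>

int main() {
    std::string small_value = "hi";        // likely fits in the small-buffer optimisation
    std::string big_value(1000, 'x');      // almost certainly heap-allocated

    std::string a = std::move(small_value);  // the small buffer is typically copied
    std::string b = std::move(big_value);    // the heap buffer is typically stolen

    // Both moved-from values are unspecified; a common outcome is "2 0".
    std::cout << small_value.size() << ' ' << big_value.size() << '\n';
}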

On Mon, 7 Mar 2011 13:06:10 -0800 (PST) Artyom <artyomtnk@yahoo.com> wrote:
Here is my formal review of Boost.Xint.
Thank you for the review, and the comments.
Definitely. I've already made some changes that should help, for the next release.
That's because it's presently not used. The code is all there for it, but it's disabled because the future of Boost.Move was still up in the air when I wrote it. To turn it on, define BOOST_XINT_USE_MOVE before including any XInt headers. It will be on by default as soon as Boost.Move is in the Boost distribution, which I expect will be before XInt is ready to be added.
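Based purely on that description, usage would presumably look something like the following (an assumption on my part, not taken from the XInt docs; boost::move() must come from whichever Boost.Move snapshot is being built against, so its header is not shown):

// The macro has to be defined before any XInt header is included.
#define BOOST_XINT_USE_MOVE
#include <boost/xint/integer.hpp>
#include <iostream>

int main() {
    boost::xint::integer a = 100;
    boost::xint::integer b = boost::move(a);  // a is left in an unspecified state
    std::cout << b << '\n';                   // expected to print 100
}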
I haven't tried it, but that code should work the way you expect with the move code enabled.
The reason was that different integer types may well have different behaviors -- the behavior of the library when assigning a negative value, for example, differs depending on the options specified. Making that constructor explicit makes it harder for users to get blindsided by those differences.
Entirely possible. I try to use self-documenting code wherever I can, but sometimes I forget that what's obvious to me won't necessarily be obvious to others.
Once the internals are pretty much settled, that can be done.
Sheesh... I'd heard about that problem too, and I still forgot it when I wrote that code. Thanks for pointing it out.
A lot of that may be due to pass-by-value instead of const references, rather than the algorithms, which I've already corrected for the next release. I'll be testing that soon, maybe today, along with the CoW/move speeds.
Some algorithms should be rewritten with the best known solutions. [...]
Already on the list.
Finally, with all the COW/thread-safety/atomics, the actual algorithms were "forgotten".
Not entirely. Jeffrey Hellrung pointed out that the square-root algorithm I'm using is suboptimal, and John Bytheway introduced me to a new primality algorithm that I plan to add. But the algorithms used are really secondary to the design. Algorithms can be improved internally.
I'll report on the speed changes from the alterations I've already made as soon as I have them. I don't want to check any code into Subversion until the review is over (the reviews should all be of the same code), but I can make the updated code available earlier if anyone wants it.
I'll see what I can come up with. Thank you again. -- Chad Nelson Oak Circle Software, Inc. * * *

Yes, adding this define makes it work... So it would be good to have it documented :-) And this is a really nice feature. It seems that integer is capable of handling all possible styles: copy, copy-on-write, move. Way to go! +1
I am not so sure, because I've tried with COW turned on and off and the results were almost the same. AFAIK xint uses a "naive" multiplication algorithm and probably some others. To me it seems like an O(f(n)) issue, because the ratio between GMP and XInt gets lower as the numbers grow. I'd suggest taking a look at the algorithms that GMP and others use and re-implementing them.
John Bytheway introduced me to a new primality algorithm that I plan to add.
If you are talking about a deterministic primality check (AKS)... Don't bother too much; nobody uses them in practice and they have O(scary) complexity. It is something like O(n^6), where n is the number of digits... So don't bother. It is the last thing that users would actually need.
I'd suggest you create a small set of benchmarks of xint vs. GMP so you can see the progress. Regards Artyom

On Tue, 8 Mar 2011 12:13:22 -0800 (PST) Artyom <artyomtnk@yahoo.com> wrote:
Probably, but it should be a moot point. Boost.Move should make it into the official distribution before XInt, so I'll turn that on permanently.
:-)
Either way, the test results should definitively answer the question, once I can get to them.
AFAIK xint uses a "naive" multiplication algorithm and probably some others.
Yes, that's the first algorithm I'll be looking at, once I've finished the more urgent stuff.
[...] I'd suggest taking a look at the algorithms that GMP and others use and re-implementing them.
I'm leery of looking at any GMP code... it's probably pure paranoia, the GPL can't apply to just *looking* at code, but I'd rather stick to descriptions to ensure that my code doesn't resemble anyone else's and nobody can claim that I've copied from them.
Fortunately, it's also the last and lowest thing on my wish-list. ;-) But I'm curious about the algorithm, and implementing it (someday) is the best way I know of to understand it better. -- Chad Nelson Oak Circle Software, Inc. * * *

On Wed, Mar 9, 2011 at 07:19, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
Your paranoia may be justified: "The courts have made it clear that, under copyright law, proof of substantial similarity between your work and another work, along with proof of access to the other work, may be enough to prove infringement, even when you don’t realize that you’re making a copy." ~ <http://rosenlaw.com/lj8.htm>

Chad Nelson wrote:
I don't think it's that bad. Algorithms aren't subject to copyright, and if two pieces of code implement the same algorithm, they will necessarily display a similarity. This similarity doesn't count as infringement. They either have to share substantial copyrightable elements that are not intrinsic to the algorithm, or there has to be some other proof that one is a direct copy of the other. The "mental copying is infringement" precedents have likely been established by companies suing ex-workers, and I suspect that these ex-workers were probably unable to afford a proper defense, because it's much more likely for someone to remember an algorithm than it is to memorize lengthy code. Either way, I've never heard of a GPL copyright holder pursuing "mental copiers". Not that I'm a lawyer, or something. :-)

OMG? Free software is about learning from each other and collaboration, not about suing each other. Now... It is a legitimate action to look at other code and use its ideas. It is not OK to copy the code, but nothing prevents you from reimplementing it. For protecting ideas there are patents, and by the way their usage rights are explicitly transferred with GPLv3 code, so if you release a program under GPLv3 that implements patented ideas, you transfer the rights to use them! So stop telling "GPL-virus" bullshit, I'm sorry. Of course, if you are looking at proprietary code and have agreed to its harmful protect-anything-I-can license, it is another story. This does not work in the proprietary -> open-source direction, but it works fine in the open-source -> proprietary/other-license direction, as long as you do not copy the code. Of course, if you look at code and rewrite it line by line that is another story too, but I believe that is not the case here. Ideas and algorithms are not subject to copyright (they are subject to patents). And BTW, GMP is LGPL, not GPL, so it is perfectly friendly to closed-source software (not as friendly as Boost, but friendly). Artyom P.S.: I'm not a lawyer; don't take the above as legal advice.

On Wed, 9 Mar 2011 22:52:22 -0800 (PST) Artyom <artyomtnk@yahoo.com> wrote:
Which is fine, until someone decides that collaboration isn't as much fun as getting a big unearned paycheck at the expense of someone else.
If someone can show that I've looked at other code, and that mine is substantially similar (which may happen without any conscious intent whatsoever), some lawyer can twist it into a case that I owe someone else money for it. With the lawyer getting a big cut of it, of course.
You've never dealt with the US legal system, have you? :-) There are something like ten times the number of lawyers per capita in the US than anywhere else in the world, and a few of them are always sniffing around for anything they could possibly wring some money out of. I want my code to be so provably far out of lawyers' sights that they won't even think about it. -- Chad Nelson Oak Circle Software, Inc. * * *

On 3/9/2011 7:19 AM, Chad Nelson wrote:
For the Schönhage–Strassen multiplication algorithm, I found the following paper about the GMP implementation to be very illuminating (and doesn't have the errors that the wikipedia entry has!): http://www.loria.fr/~gaudry/publis/issac07.pdf I'm not a lawyer, but I would think it's fine to base an implementation off the description given in that paper. - Jeff

On Wed, 09 Mar 2011 11:33:38 -0800 "Jeffrey Lee Hellrung, Jr." <jhellrung@ucla.edu> wrote:
Noted, thanks.
I'm not a lawyer, but I would think it's fine to base an implementation off the description given in that paper.
Yes, I've never even heard a hint of anyone successfully claiming to own an algorithm. -- Chad Nelson Oak Circle Software, Inc. * * *

Dear Chad, Volodya, all, Please allow me to submit a fraction of a vote, as someone who does not foresee a personal need for big integers, but who has a general interest in data structures, algorithms, and abstraction. With apologies to those who are eager to get a big integer library into Boost ASAP, I have to vote No at this time, but with encouragement to Chad to resubmit when the library has gone through another round of revisions.
- What is your evaluation of the design?
The STL and its offspring are founded on the principle of separating data structures from algorithms. IMO this is important not just so that other representations of big integers could be used, but also for clarity of code.
- What is your evaluation of the implementation?
In addition to the extraction of algorithms, which will probably cause interface changes (or at least the exposure of a few more interfaces and concepts), the possible elimination of COW and introduction of expression templates both seem like big enough changes to merit a second review. I don't like the idea that the library has second-class fixed-size large integers which don't sound like they were fully thought out. But I suppose it is okay for a library to have experimental parts as long as the main thrust of it makes sense.
- What is your evaluation of the documentation?
I only took a brief look. I too am wary of documentation that makes big claims about speed and its own completeness -- I guess I don't like marketing-speak or any language that invites people to poke holes in it.
- What is your evaluation of the potential usefulness of the library?
Again, I have to emphasize that I am probably not a potential user of this library and hope my vote will be discounted accordingly. But I understand that huge integers are useful to a lot of people. ;-)
- Did you try to use the library?
No.
- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
Aside from tracking down what virtual inheritance was used for and closely following the discussion, not a lot of effort.
- Are you knowledgeable about the problem domain?
Not really. I attempted to write a bigint for an interview and learned how difficult it is to get right! Besides making the code the best it can be, I think a formal review really shows how well an author can cooperate with the community and how willing they are to change their code when better ideas come along. I am confident in Chad on both points, but I think enough changes are due that a second review would be productive, especially once there are container/algorithm interfaces. As an aside, I'm a little concerned to read twice this week about people finding their designs becoming muddled by peer review. That's not what should happen! It seems the review process can deter uncooperative authors, but I guess it is up to individuals to remain resilient against bad ideas. I hope that everyone will stand up for authors when it looks like they're getting railroaded into bad designs. Cheers, Gordon

On Wed, 9 Mar 2011 03:23:52 -0500 Gordon Woodhull <gordon@woodhull.com> wrote:
Thank you for the review and comments. Most of them don't require a reply, but at the end, you added:
When at least some of the "peers" seem to have a lot more knowledge in certain areas than you do, it's hard to properly evaluate their suggestions, especially when they're not explicitly phrased as suggestions. Brilliance and BS are indistinguishable to someone who doesn't have a complete grasp of all the issues involved. You can waste a lot of valuable programming time trying to understand a suggestion at a deep enough level to agree with or refute it, only to find that the original suggester had no idea what he was talking about. -- Chad Nelson Oak Circle Software, Inc. * * *

Chad, as an 'old guy' and nothing more than a lurking outsider, allow me to offer an alternative interpretation: You never really know a topic until you try to teach it - or, in this case, 'sell' a library to the group at boost.org. What you refer to as a 'waste', I would interpret as 'due diligence'. More than simply adding some library, IMHO you are, in a very real sense, in the process of establishing yourself as the "go to" person for large integers within the boost community. That's a lofty goal and will require that you NEVER stop learning. (A word of warning though: if you thought the last 5-6 months effort was a lot of work on your part, think about the next 5-6 *YEARS* if|when your library is accepted.) Just my $.002 Dick Bridges Western Digital

On Wed, 9 Mar 2011 08:21:49 -0800 "Dick Bridges" <Dick.Bridges@wdc.com> wrote:
I can attest to that.
If it means that my arrog^H^H^H^H^H *confidence* becomes warranted on the subject, that's probably a good thing. ;-) I do enjoy learning new programming techniques, despite any impression that my earlier... instability?... might have given. Programming has been my hobby since childhood, and making it the means of earning my living as well hasn't diminished my enjoyment of it.
The months of concentrated effort, to the exclusion of essentially everything else, were a lot of work. By comparison, maintaining and improving it should be a snap.
Just my $.002
And much appreciated. -- Chad Nelson Oak Circle Software, Inc. * * *

AMDG On 03/09/2011 12:23 AM, Gordon Woodhull wrote:
- What is your evaluation of the design?
The STL and its offspring are founded on the principle of separating data structures from algorithms. IMO this is important not just so that other representations of big integers could be used, but also for clarity of code.
IMHO, trying to make a bigint "just like the STL" is simply misguided. Addition, subtraction, multiplication, and division aren't algorithms that operate on a bigint. They're the basic operations that define an integer. Let's consider why we want to separate the data structures from the algorithms again. The basic reason is that this allows new data structures and new algorithms to be added independently. Does this apply here at all? I'm not seeing it: a) New data structures: The implementation of the operators depends on the details of how the numbers are represented. If you wanted a representation that differed from what the library provides in any significant way, you'd have to reimplement them anyway. b) New algorithms: Are the basic operations that are provided not enough for some algorithm that you care about? Concepts should be created by abstracting concrete code. If you try to define the interface by guessing at what some hypothetical user might want, you'll end up with something much more complex than it needs to be. In Christ, Steven Watanabe

Steven Watanabe wrote:
I'm not so sure about whether "division" is a basic operation with respect to an integer. An "efficient" implementation of "division" might be required as a building block for "efficient" implementations of other algorithms (like GCD or LCM), but I wouldn't say that "division" is required to "define" an integer. An additional complication with respect to division is the "remainder". At least some algorithms for "division" will compute the "remainder" at the same time as the "quotient". So should the division be defined to yield both "remainder" and "quotient" at the same time? And how should "division" be defined exactly for negative numbers? Should the "remainder" always be a positive number, or should the absolute value of the "remainder" be as small as possible, or should the sign of the numbers be mostly ignored for division (that's what C/C++ try to do)? The point I'm trying to convey here is that it's difficult to exactly determine the "fundamental" operations/algorithms ("fundamental" implies here that they work directly with the internal representation of the integer) that must be provided as basic building blocks for other operations/algorithms.
We should be aware that one part of the "big integer" library problem comes from a deficiency of C/C++ with respect to how signed and unsigned integral numbers have been abstracted. From my point of view, this is the underlying reason why GMP is forced to fall back to assembler code, in order to achieve reasonable performance. But the deficiency of C/C++ in this domain surfaces already much earlier, when working with negative numbers and remainders. This typically hits me when working with algorithms using an FFT and related data structures internally, since the nice periodicity features of the FFT data structures are very hard to map to fast and readable C/C++ code. The point how this all is related to the XInt review is that on the one hand, XInt can't be blamed if it tries to follow the example from C/C++ for consistency reasons, but on the other hand, C/C++ is probably a poor example in this specific domain. Even Ada, which nobody likes for good reasons unrelated to this domain, does a much better job than C/C++ here. Regards, Thomas

On Mar 12, 2011, at 7:37 AM, Thomas Klimpel wrote:
I'm not so sure about whether "division" is a basic operation with respect to an integer. An "efficient" implementation of "division" might be required as a building block for "efficient" implementations of other algorithms (like GCD or LCM), but I wouldn't say that "division" is required to "define" an integer.
An additional complication with respect to division is the "remainder". At least some algorithms for "division" will compute the "remainder" at the same time as the "quotient". So should the division be defined to yield both "remainder" and "quotient" at the same time? And how should "division" be defined exactly for negative numbers? Should the "remainder" always be a positive number, or should the absolute value of the "remainder" be as small as possible, or should the sign of the numbers be mostly ignored for division (that's what C/C++ try to do)?
I'm not entirely certain of this, but I think Thomas is alluding to C89's implementation-defined rounding behavior for division involving negative numbers, which C++98/2003 followed. Note that C99 tightened this up, requiring truncation toward zero behavior. The latest C++0x working draft follows C99 in this respect, also requiring truncation toward zero. I would hope (though haven't looked) that XInt similarly performs truncation toward zero. Regarding quotient / remainder computation, I would think that an appropriate overload of std::div() would suffice. I haven't looked to see if XInt provides such an overload.
The point how this all is related to the XInt review is that on the one hand, XInt can't be blamed if it tries to follow the example from C/C++ for consistency reasons, but on the other hand, C/C++ is probably a poor example in this specific domain. Even Ada, which nobody likes for good reasons unrelated to this domain, does a much better job than C/C++ here.
If XInt follows C99 / C++0x here, does that answer your complaint?

Kim Barrett wrote:
To be honest, I wasn't even aware of this, so thanks for this interesting information. My goal was just to show the difficulties when trying to define "division" for integers.
OK, that's at least a step forward from "implementation-defined", but the modulo operator ("%") is still awkward to use in practice (for negative numbers).
What would you use as return type for this overload? A std::pair, a boost::tuple, or a specially defined struct (std::div_t) like the other overloads from std?
The point how this all is related to the XInt review is that on the one hand, XInt can't be blamed if it tries to follow the example from C/C++ for consistency reasons, but on the other hand, C/C++ is probably a poor example in this specific domain. Even Ada, which nobody likes for good reasons unrelated to this domain, does a much better job than C/C++ here.
If XInt follows C99 / C++0x here, does that answer your complaint?
Probably not, because this was a complaint about C/C++, not about XInt. This complaint was in response to Steven asking "Are the basic operations that are provided not enough for some algorithm that you care about?". From my experience with C/C++, I could answer this question with a definitive "Yes, the basic operations provided by C/C++ for integer numbers are not enough for some algorithms that I care about!". I know that this is a review of XInt, but as I have much more experience with C/C++, it makes sense to answer this question for C/C++ instead, which XInt tries to follow for consistency reasons. I gave two specific examples (I could give more, but I guess nobody wants to hear them, because it is not realistic to expect that this deficiency of C/C++ will ever be fixed), one directly related to the current review, and the other related to my own daily work. Regards, Thomas

AMDG On 03/13/2011 07:45 AM, Thomas Klimpel wrote:
If you don't like it, then you can always fix up the result to be what you want. The cost of adjusting the result is going to be a lot less than the cost of the division itself. Not to mention that the existing behaviour is the most natural and efficient to implement for the signed-magnitude representation that the library uses.
It's already provided by the library. (Okay, it's called divide, not div and it returns integer<...>::divide_t)
I'm afraid that what you've said doesn't help me at all. You've mentioned that you don't like the way division is defined, and you've mentioned FFT's, but you haven't made any attempt to explain how this is a real problem for XInt. This isn't about implementability. An FFT can be implemented using only basic math operations. It's a matter of efficiency. And for this, you can't generalize from the builtin types to XInt because the performance characteristics are significantly different.
I gave two specific examples (I could give more, but I guess nobody wants to hear them, because it is not realistic to expect that this deficiency of C/C++ will ever be fixed), one directly related to the current review, and the other related to my own daily work.
In Christ, Steven Watanabe

Steven Watanabe wrote: Thomas Klimpel wrote:
That's definitively true. My current workaround for k = modulo(m,n); in places where I'm unable to work around the missing "modulo" functionality of C/C++ is to write k = ((m%n)+n)%n. Based on the information from Kim Barrett, k = m >= 0 ? m%n : (m%n) + n; should also work (that's a bit longer than the old form, but looks more efficient, and I could write a "modulo" template encapsulating this functionality for my personal use). If I'm not mistaken, this would be the most efficient workaround in the case of XInt.
That's definitively true. There are probably also use cases where this definition of division and modulo is the most suitable. It just happens that I regularly encounter use cases where one of the other possible definitions would have been more suitable.
I was delighted that you tried to answer this. I agree that the use cases of XInt and normal C/C++ integers are hardly comparable, so it's probably not possible to generalize from experiences with the builtin types to XInt. (Side note: I wasn't talking about implementing an FFT, but simply about working with the data structures that occur in this context. With this I just mean that the "zeroth" spectral order is stored at a[0], the first positive spectral order is stored at a[1], the first negative spectral order is stored at a[n-1], and similarly for the higher orders. In this context, the modulo operation where the "remainder" is always a positive number would allow me to compute the index when I have the spectral order, and the modulo operation where the absolute value of the "remainder" is as small as possible (and negative in the case where this condition is ambiguous) would allow me to compute the spectral order when I have the index.) The question with respect to XInt was whether the algorithms should be separated from the data structures, or whether the data structures should be an implementation detail hidden from the public interface. Even though I agree that the algorithms and data structures are not orthogonal to each other for a bigint library, I have my doubts that just saying addition, subtraction, multiplication and division are the fundamental operations that a bigint library needs to provide, and any other "relevant" algorithm can be ("efficiently"?) implemented in terms of these. This might not be relevant to XInt, because XInt provides more "fundamental operations" than just these "evocative" four. I can quite imagine that XInt indeed has managed to provide all important "fundamental operations", whichever these may be. Regards, Thomas
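As a concrete illustration of the two modulo flavours described in the side note above (my own sketch, with ordinary ints rather than XInt integers):

#include <cstddef>

// Spectral order -> array index: a modulo whose result is always non-negative.
std::size_t index_of_order(int k, int n) {
    int r = k % n;
    if (r < 0) r += n;
    return static_cast<std::size_t>(r);
}

// Array index -> spectral order: a modulo whose result has the smallest
// absolute value, choosing the negative one when the choice is ambiguous.
int order_of_index(std::size_t i, int n) {
    int k = static_cast<int>(i) % n;
    return (2 * k >= n) ? k - n : k;
}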

On 14/03/11 22:22, Thomas Klimpel wrote:
Be careful; not only does that perform two divisions (which is slow), it also gives the wrong answer when m and n are big enough.
I tend to use

r = m%n;
if (r < 0) r += n;

which the compiler is more likely to implement using a conditional move, rather than a branch (but it does have the issue that it's not an expression). I did some tests with:

int m1(int m, int n) { return ((m%n)+n)%n; }
int m2(int m, int n) { return m >= 0 ? m%n : (m%n) + n; }
int m3(int m, int n) { int r = m%n; if (r < 0) r += n; return r; }

icc 11.1 compiles m2 and m3 to essentially the same code. gcc 4.5.2's version of m2 is longer than m3 and contains a branch (-O3 in all cases). For both compilers m1 contains two divisions (there's no way to optimize it down to one). John Bytheway

John Bytheway wrote:
I have now implemented modulo as

template <class T>
T modulo(T m, const T& n) {
    m %= n;
    if (m < 0) m += n;
    return m;
}

and replaced 10 occurrences (all I could find) of (m%n+n)%n with it. Two of these occurrences had the form "(m%std::ptrdiff_t(n)+n)%n", which surprised me a bit. When I replaced them with "modulo(m, n)", the compiler refused to compile it. So I wrote "modulo(m, std::ptrdiff_t(n))" instead, and this compiled fine. Thinking a bit about this, I realized that the template does the right thing by refusing to compile this, whereas the "(m%n+n)%n" construct can lead to surprising results in case n is of type std::size_t. So I think I'm quite happy with my new template. Regards, Thomas

On Tue, Mar 15, 2011 at 11:48, Thomas Klimpel <Thomas.Klimpel@synopsys.com> wrote:
[...] Two of these occurrences had the form "(m%std::ptrdiff_t(n)+n)%n", which surprised me a bit. When I replaced them with "modulo(m, n)", the compiler refused to compile it. So I wrote "modulo(m, std::ptrdiff_t(n))" instead, and this compiled fine. Thinking a bit about this, I realized that the template does the right thing by refusing to compile this, whereas the "(m%n+n)%n" construct can lead to surprising results in case n is of type std::size_t.
Perhaps the real thing it's telling you is that you want something like this:

template <class T>
enable_if_c<numeric_limits<T>::is_signed, T>::type
modulo(T m, const T& n) { m %= n; if (m < 0) m += n; return m; }

template <class T>
enable_if_c<numeric_limits<T>::is_signed, T>::type
modulo(T m, const T& n) { m %= n; return m; }

Because if you really want modulo 4000000000, casting to ptr_diff_t will give very strange results on LP32... ~ Scott

On Tue, Mar 15, 2011 at 12:16, Scott McMurray <me22.ca+boost@gmail.com> wrote:
Uh, that second enable_if_c should of course be disable_if_c:

template <class T>
enable_if_c<numeric_limits<T>::is_signed, T>::type
modulo(T m, const T& n) { m %= n; if (m < 0) m += n; return m; }

template <class T>
disable_if_c<numeric_limits<T>::is_signed, T>::type
modulo(T m, const T& n) { m %= n; return m; }

Oops, ~ Scott

On Sun, Mar 13, 2011 at 07:45, Thomas Klimpel <Thomas.Klimpel@synopsys.com> wrote:
OK, that's at least a step forward from "implementation-defined", but the modulo operator ("%") is still awkward to use in practice (for negative numbers).
Yes, it's unfortunate that it's generally known as modulo, when it's really the remainder operator, not modulo at all. ~ Scott

Hi Steven, As I mentioned in my review, I'm neither an expert in big integers nor a potential user of the library. With that disclaimer aside, let me respond. On Mar 12, 2011, at 12:21 AM, Steven Watanabe wrote:
I'm not sure how you can say they aren't algorithms; I guess you mean that the algorithms shouldn't be exposed. I don't see anything mutually exclusive about algorithms and operators: there are algorithms that run on ranges of numeric data at one level, and there are classes that look like ints on another level. I tend to see all such problems as data + algos at one level, and then some syntactic sugar (whether proto or just classes with operators) at another level. This allows there to be more options in the data & algos than are needed in normal usage. From a quick glance at the code, it looks like this design is already present: all the work is actually done by free functions on "raw" integers, and then integer_t is a nice wrapper. Whether this needs to be in the public interface is debatable, and as I argued in my review, reviews certainly shouldn't create useless work which muddies the design. I hope that everyone who expressed this opinion wasn't just jumping on a bandwagon on hearing a good-sounding slogan. I don't think so.
I think it boils down to a few essential questions: Do people need to apply the +-*/ algorithms to data which is not in an integer_t data structure? It does seem that it might be easier to implement the fixed-size large integer if the algos were separate from the data, but it's not the only way. Are there other algorithms besides these 4 which need access to the binary data? Are there reasons to replace the +-*/ algorithms? Cheers, Gordon

Although there is work to do, I think this will be a useful Version 1. So I vote for acceptance. In view of the several skeletons on the Boost road to a BigInt, I think it would be unfortunate to risk losing this one. I would like to see a user base and applications, and more testing on more platforms.

A possible using-xint project (GSoC?) might be an implementation of reading floating point (which requires big(-ish) integers) per William D. Clinger:

http://www.cesura17.net/~will/Professional/Research/Papers/howtoread.pdf
http://portal.acm.org/citation.cfm?id=93557

and writing FP per Steele and White:

http://grouper.ieee.org/groups/754/email/pdfq3pavhBfih.pdf

I am sure that there are many others.

What is your evaluation of the design?
======================================
It seems to have most of the features needed, 'fixed' and 'elastic'. I am not qualified to judge the 'presentation' (or the implementation), but from the discussions I am absolutely clear that it is not at all simple. Some conflicts between runtime speed, space, expression templates, movability, compile time, COW, CRTP, thread-safety and others seem inevitable. A single 'ideal' version may even prove impossible. Macros to switch implementation are a messy solution but may be the only way. I do not believe we are anywhere near a C++ Standard candidate model, so this is not yet a consideration for me. I would not rule out Xint2, SlickerInt or even UltimateInt, because I believe the many differences of views expressed can only be resolved by seeing some more persuasive evidence. Since GMP is a 'gold standard' (but, for some, useless because of license limitations), I highly value correctness above speed or space (and the unfettered Boost License terms). Portability and correctness are of paramount importance for a Boost library.

<aside>
I feel this review shows that our process is seriously broken. IMO there has been too little recognition of the hard work put in by the author, and too much noisy asserting and hand-waving argument which is not sufficiently persuasive. Since there has already been considerable discussion about Xint some time ago, if people feel the design is fundamentally wrong, I feel they should put much more effort into at least sketching out their 'XintBetter' and working to provide evidence of the pros (and cons) of their suggested implementation. There are far too few reviews, especially from those with hard experience of using the library 'in anger'. The review process is taking place over far too short a time period, with no time to attempt to take suggestions on board and try to implement them. For a difficult issue like this, we are not going to get to a good (let alone the best) solution on the first iteration. Why has no reviewer actually tried to use it with uBLAS, ET...? There is much speculation, not all with the marvellous modesty of Eric Niebler: "But I haven't thought it all through, so that could all be bollocks. :-)". We could use a lot more of that attitude. We have too little evidence of portability - libraries probably need to be tested in the Boost Test System to get a good idea of all platforms and compilers. *We need this before review*, not just testing a single GCC and VS 2008.
</aside>
What is your evaluation of the implementation?
==============================================
As has been suggested already, I think the random and crypto applications are muddying the water at this stage. They could perhaps be moved to the /example folder for now? Or to Boost.Random?
What is your evaluation of the documentation?
=============================================
OK. Too chatty in places. Some of the "Why would I use it" material should go.

I *really* like the *full* use of Doxygen to document each function and class with parameter descriptions, returns, exceptions... (Too many otherwise well-documented Boost libraries do not use the Doxygen documentation system to the full: you get the structure and names, but finding out what they *do* is much more difficult, and too often impossible.)

Nevertheless, I would have much preferred to be able to get a PDF version (which implies using the Quickbook with Doxygen and AutoIndex toolchain - becoming a Boost 'standard'). If the library is accepted, I would encourage the author to use the existing material to create this. It would be a small extra task (re-)using everything done so far, but it would make the docs more widely available and more useful too (and help can be given).

I would also like more examples showing various features. These are often the quickest way for users to see what to do.

What is your evaluation of the potential usefulness of the library?
====================================================================
Essential - for some tasks. But so also is 'XReal' - an even bigger can of worms :-(

Did you try to use the library?
===============================
VS 2010. Ran the tests and an example OK. Tests look OK - though I am sure that more could be added, including more corner cases and perhaps spot values from other packages. More comment on the specific 'difficult' cases being tested would be helpful?

How much effort did you put into your evaluation?
=================================================
A quick try-out and read-through.

Are you knowledgeable about the problem domain?
===============================================
No.

Paul

Some minor comments.
====================
It is not possible (or even desirable) to try to *prevent* access to implementation details. Putting them in a /detail folder and documenting their status is sufficient at this stage.

1. _test: I'm not a fan (nor is it Boostish) of starting function names with _. Some people use the MS-style int m_x; for member data, and others use _x or x_ for member data. But that's a style issue.

2. A jamfile to build all the examples would be helpful?

3. Typo in a comment at line 29 of integer.hpp: "This class implements the standard aribitrary-length %integer type." ("aribitrary" should be "arbitrary".)

4. I tried to run the jamfile, but it failed, not finding boost/xint/integer.hpp.

5. I converted the msvc sln to VS 2010 (OK) and added a property page with an include directory for "boost-sandbox/xint", but still got trouble with

    #ifndef BOOST_INCLUDED_XINT_INTEGER_HPP
    #define BOOST_INCLUDED_XINT_INTEGER_HPP

These should perhaps be

    #include <boost/xint/detail/internals.hpp>
    #include <boost/xint/random.hpp>

6. Personally I'd like to see the output from the examples appended as a comment, like this:

    /*
    The number is: 72
    The huge number is: 123456789012345678901234567890123456789012345
    (That's a 147-bit number.)
    Press any key to continue . . .
    */

7. I tripped on /Za - MS language extensions are essential for random (I think). I added to the jamfile to ensure this (and quiet a load of silly warnings):

    project
      : requirements
        <toolset>msvc:<cxxflags>/wd4127 # expression is constant.
        -<toolset>msvc:<cxxflags>/Za # Ensure language extensions are enabled (essential only for random?)
      ;

8. I noted some includes that should perhaps be <boost/xint/???.hpp> or <boost/xint/detail/???.hpp>.

---
Paul A. Bristow
Prizet Farmhouse, Kendal LA8 8AB UK
+44 1539 561830 07714330204
pbristow@hetp.u-net.com

On Fri, 11 Mar 2011 10:12:57 -0000 "Paul A. Bristow" <pbristow@hetp.u-net.com> wrote:
Thank you for your review, and your comments.
So I'm not the only one who's had trouble with that. :-) Thanks for the references.
They could, but I don't think that's really appropriate. As I mentioned in another of my all-too-numerous messages this week, there really isn't any crypto code in the library at all. There are only mathematical functions. is_prime is the closest thing to a cryptographic function, and it has non-cryptographic applications as well.
> Or to Boost.Random?
Steven Watanabe has indicated that he'd rather not put any XInt-specific code into Boost.Random, and I agree. It's really more logically implemented in XInt, and it's just annoying enough to do right that I'd prefer not to force users of the library to reimplement it any time they need it.
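(For illustration only - a minimal sketch, not XInt's code, of the kind of thing involved: assembling an N-bit value from 32-bit draws of a standard generator, with the excess bits of the top block masked off so the result is uniform over [0, 2^N).)

    #include <cstdint>
    #include <random>
    #include <vector>

    // Fill 'nbits' of randomness, least-significant block first.
    std::vector<std::uint32_t> random_bits(std::size_t nbits, std::mt19937& gen)
    {
        std::size_t blocks = (nbits + 31) / 32;
        std::vector<std::uint32_t> digits(blocks);
        for (std::size_t i = 0; i < blocks; ++i)
            digits[i] = static_cast<std::uint32_t>(gen());
        std::size_t excess = blocks * 32 - nbits;          // unwanted high bits
        if (excess)
            digits.back() &= ~std::uint32_t(0) >> excess;  // clear them
        return digits;
    }

(Generating a value uniformly below an arbitrary bound, rather than below a power of two, additionally needs rejection sampling or similar - which is the part that is easy to get subtly wrong.)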
> I *really* like the *full* use of Doxygen [...] Nevertheless, I would have much preferred to be able to get a PDF version [...]
Once the documentation is closer to its final form, I would be open to that.
> I would also like more examples showing various features. These are often the quickest way for users to see what to do.
Already on the to-do list.
I'm not a fan of it either. I generally only use it to help indicate that even though a function or variable might be in the public interface, it's meant for use only by the library.
> 3. Typo in a comment at line 29 of integer.hpp: "This class implements the standard aribitrary-length %integer type."
Fixed, thanks.
Puzzling... those are standard include guard lines, they shouldn't be able to give anyone any trouble.
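(For reference, those are just the usual include-guard idiom, shown here with the macro name quoted in the review:)

    #ifndef BOOST_INCLUDED_XINT_INTEGER_HPP
    #define BOOST_INCLUDED_XINT_INTEGER_HPP

    // ... header contents ...

    #endif // BOOST_INCLUDED_XINT_INTEGER_HPP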
> 7. I tripped on /Za - MS language extensions are essential for random (I think).
The MS language extensions are actually only essential for the Windows API headers, which are necessary for the strong_random_generator class on that platform.
> I added to the jamfile to ensure this (and quiet a load of silly warnings) [...]
I've added a pragma in the code to quiet that specific warning within the XInt headers, and adopted your /Za-disabling line.
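(A sketch of the usual way to do this - not necessarily XInt's exact code - so the warning is silenced only inside the library's own headers:)

    #if defined(_MSC_VER)
    #  pragma warning(push)
    #  pragma warning(disable: 4127) // conditional expression is constant
    #endif

    // ... header code that triggers the warning ...

    #if defined(_MSC_VER)
    #  pragma warning(pop)
    #endif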
> 8. I noted some includes that should perhaps be <boost/xint/???.hpp> or <boost/xint/detail/???.hpp>
I'm not sure I understand; can you clarify that one?

Again, thank you for the review.

--
Chad Nelson
Oak Circle Software, Inc.

* * *

Not really about XInt, but responding to a side comment in Paul Bristow's review. On Mar 11, 2011, at 5:12 AM, Paul A. Bristow wrote:
The Grisu algorithm family doesn't require bignums. The results are better (as in "shorter") if there are more bits in the (fixed) integer representation than in the floating-point significand. The paper focuses on printing IEEE doubles using 64-bit integers.

According to the paper, Grisu2 produces the shortest output for approximately 99.9% of the inputs, and a longer but still correct output otherwise. Grisu3 (intended for use when shortest output is a requirement) produces the shortest result for 99.5% of IEEE doubles, providing a failure indication on the remainder so that one can fall back to some other algorithm (such as Steele & White's Dragon4, which does require bignums).

Printing Floating-Point Numbers Quickly and Accurately with Integers, Florian Loitsch, PLDI '10
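(A hedged sketch of the fallback strategy described above - grisu3() and dragon4() are placeholder names here, not calls to any real library:)

    #include <string>

    // Placeholder declarations; real implementations would come from a
    // Grisu/Dragon4 library.
    bool grisu3(double value, std::string& out);   // fast, may report failure
    std::string dragon4(double value);             // bignum-based, always exact

    std::string print_shortest(double value)
    {
        std::string out;
        if (grisu3(value, out))  // succeeds for ~99.5% of IEEE doubles
            return out;
        return dragon4(value);   // fall back to the slower exact algorithm
    }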

participants (39)
- Arash Partow
- Artyom
- Barend Gehrels
- Beman Dawes
- Chad Nelson
- Christopher Jefferson
- Dave Abrahams
- DE
- Dick Bridges
- Eric Niebler
- Frank Mori Hess
- Gordon Woodhull
- Ivan Le Lann
- Ivan Sorokin
- Jeffrey Lee Hellrung, Jr.
- Jeremy Maitin-Shepard
- Joachim Faulhaber
- Joel Falcou
- John Bytheway
- k.inaba
- Kim Barrett
- Marsh Ray
- Marshall Clow
- Mathias Gaunard
- Michael Caisse
- Paul A. Bristow
- Peter Dimov
- Phil Endecott
- Ryo IGARASHI
- SATAN66613
- Scott McMurray
- Sebastian Redl
- Simonson, Lucanus J
- Steven Watanabe
- Stewart, Robert
- Terry Golubiewski
- Thomas Klimpel
- Vicente Botet
- Vladimir Prus