
Here is my review for XInt. Currently I think the library should not be included in Boost, as a lot of problems are, in my opinion, still to be fixed and the design has to be rethought.

***********************************************************************

* What is your evaluation of the design?

Large integers are a bit like XML and Unicode: lots of people want to do such a library because it has a high impact of usefulness and, from afar, it doesn't sound *this* complicated. Alas, it is :/

The premises are OK. We have some large integer type that has proper value semantics and can be used in a variety of operations and functions. So far, so good. The main culprit is that the current library design is rather hardwired to work with a choice of implementation the author thought to be OK at some point, but which still has to be proven better than existing solutions.

The overall design lacks separation between the data and the algorithms, which makes the library difficult to extend: what if I want to take an arbitrary range of integers and look at them as a huge series of bits? What if I don't want the extra memory overhead, so I can mmap some files directly into memory (because reading a 10k-bit integer is a no-no)? Can I pass the integer type to a standard algorithm if I want to iterate over the bytes it contains, or copy them using boost::range copy?

On the data side, I think being forced to use a given representation is wrong. The ideal would be for the library to provide a fallback implementation but also adaptors to allow any kind of data to be viewed as a large integer. The definition of a LargeInteger concept would help define such an adapting interface and make the library more fluid. For example, what if I want to implement LCM?

* What is your evaluation of the implementation?

The implementation is poor for various reasons:

- The author uses COW for performance reasons. Alas, COW has been demonstrated again and again to be of little interest. Looking at the problem for ten seconds (types holding arbitrarily large amounts of data, to be chained in operations without copies) should have led to an expression-template-based solution: capture a whole chain of operations on values, compute the worst-case number of bits to store, allocate once, then compute in place. That prevents spurious allocations and makes the algorithms easier to write. Additionally, ET would help detect expression patterns for which a non-naive algorithm could be applied.

- Virtual inheritance: looking at integer_t shows: class integer_t: virtual public .... Why? It seems you're trying to implement some policy-based design, but I can't see why virtual inheritance is used. Do you get problems with multiple definitions in your mixins otherwise?

- There are a lot of unwarranted pass-by-value function prototypes, which seriously impact performance even with COW. Pass-by-reference beats pass-by-value plus COW. Moreover, the thread safety of the COW is dubious.

- The headers are obscenely big. It would be easier to follow the code if there were not these 2k+ line mammoth files.

* What is your evaluation of the documentation?

The documentation is OK, but it is rather hard to navigate as there are no real links between sections, forcing me to go back and forth between sections.

* What is your evaluation of the potential usefulness of the library?

The problem the library tries to solve is a very important one, as shown by the large number of similar projects. Having such a library is worthy of inclusion in Boost, but not in its present form and shape.

* Did you try to use the library? With what compiler? Did you have any problems?

I tried some examples from the doc on gcc 4.5 with no real problems.

* How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

* Are you knowledgeable about the problem domain?

I've been doing research and development on scientific computing for 9 years. If the specific problem of large integers is a bit of a fringe domain for me, I have been collaborating on such designs in some projects.

* Do you think the library should be accepted as a Boost library?

No. In its current form, the library *should not* be accepted into Boost. Performance claims are not backed up (and other reviewers show it can be far better), the design lacks genericity, and its implementation is, in my opinion, sub-optimal. To be accepted, the following points should be addressed:

- use expression templates to handle efficient combination of operations and potentially detect optimisable patterns. Proto is a shoo-in for this.

- the library entities should be made more generic by providing an extended-integer adaptor facility that makes it possible to substitute a custom integer representation (like having extended_integer<GMP>) for corner-case stuff. This adaptor would provide an interface able to turn these types into a type fulfilling the underlying LargeInteger concept the library needs. Being compatible with GMP should even be a goal and a selling point for the library.

- do real benchmarks and compare against existing state-of-the-art solutions; at least show benchmarks against GMP on realistic use cases.

- the algorithms should be more generic and not be tied to the concrete type of large integer used.

Frankly, licensing issues aside, a Proto-based library able to wrap GMP if needed, and with a Range-based interface, could already cover 80% of needs and provide correct speed. Then building on top of that to add new algorithms, new back-end representations and adaptors would fill the library out in a very smooth way.
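To make the adaptor/concept idea above concrete, here is a minimal sketch in C++ (every name in it is hypothetical; nothing comes from XInt): algorithms are written against a small traits-based LargeInteger interface, and any backend, whether the library's own storage, GMP, or a user type, plugs in by specialising the traits.

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Hypothetical LargeInteger concept, expressed as a traits class to be
    // specialised per backend: size(x), digit(x, i), set_digit(x, i, d), resize(x, n).
    template<class T>
    struct large_integer_traits;

    // Example backend: a plain vector of 32-bit digits, least significant first.
    struct simple_backend { std::vector<std::uint32_t> digits; };

    template<>
    struct large_integer_traits<simple_backend>
    {
        typedef std::uint32_t digit_type;
        static std::size_t size(simple_backend const& x) { return x.digits.size(); }
        static digit_type digit(simple_backend const& x, std::size_t i) { return x.digits[i]; }
        static void resize(simple_backend& x, std::size_t n) { x.digits.resize(n); }
        static void set_digit(simple_backend& x, std::size_t i, digit_type d) { x.digits[i] = d; }
    };

    // A generic algorithm written only against the traits: addition with carry.
    template<class Int>
    void add_in_place(Int& a, Int const& b)
    {
        typedef large_integer_traits<Int> tr;
        std::size_t n = std::max(tr::size(a), tr::size(b));
        tr::resize(a, n + 1);                                 // room for a final carry
        std::uint64_t carry = 0;
        for (std::size_t i = 0; i != n; ++i) {
            std::uint64_t s = carry + tr::digit(a, i)
                            + (i < tr::size(b) ? tr::digit(b, i) : 0u);
            tr::set_digit(a, i, static_cast<std::uint32_t>(s));  // keep the low 32 bits
            carry = s >> 32;
        }
        tr::set_digit(a, n, static_cast<std::uint32_t>(carry));
    }

An adaptor for XInt's own integer type or for GMP's mpz_t would then be just another specialisation of the same traits, and generic algorithms such as the addition above (or a GCD/LCM built on it) would work with either backend unchanged.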

On 4 Mar 2011, at 09:49, Joel Falcou wrote:
- use expression templates to handle efficient combination of operations and potentially detect optimisable patterns. Proto is a shoo-in for this.
Are you sure that the gains from proto would outweigh the losses? I just tried taking an example from the proto directory and added an invalid '+ 2.0' on the end of it. The resulting error message was 29k. The same error from xint was much shorter and understandable. Chris

On 04/03/11 12:09, Christopher Jefferson wrote:
On 4 Mar 2011, at 09:49, Joel Falcou wrote:
- use expression templates to handle efficient combination of operations and potentially detect optimisable patterns. Proto is a shoo-in for this.
Are you sure that the gains from proto would outweigh the losses?
I just tried taking an example from the proto directory and added an invalid '+ 2.0' on the end of it. The resulting error message was 29k.
The same error from xint was much shorter and understandable.
Proto examples are -wait for it- examples. Proper Proto-based code (and by proper I mean usable in the real world) has grammar checks that output a terse static assert in such cases. I'll redirect you to Eric Niebler's talk at last year's BoostCon, to his C++ Next article on the subject, and to the end of my own BoostCon talk, where we show how we handled it in our Proto-based libraries. Huge error messages coming from a library built with templates should be considered bugs and ought to be fixed. Short story: if done properly, yes, it will outweigh the losses.
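To make the "terse static assert" point concrete, here is a minimal sketch (the bigint tag, grammar, and evaluate names are all hypothetical, not XInt's or any real library's API) of how a Boost.Proto grammar check turns an invalid operand into one short diagnostic instead of a page-long instantiation backtrace.

    #include <boost/proto/proto.hpp>
    namespace proto = boost::proto;

    struct bigint_tag {};                                    // hypothetical terminal tag
    typedef proto::terminal<bigint_tag>::type bigint_term;   // expression-tree leaf

    // Grammar: only +, -, * over big-integer terminals are valid expressions.
    struct bigint_grammar
      : proto::or_<
            proto::terminal<bigint_tag>
          , proto::plus<bigint_grammar, bigint_grammar>
          , proto::minus<bigint_grammar, bigint_grammar>
          , proto::multiplies<bigint_grammar, bigint_grammar>
        >
    {};

    template<class Expr>
    void evaluate(Expr const&)
    {
        // One terse message instead of a multi-kilobyte instantiation backtrace.
        static_assert(proto::matches<Expr, bigint_grammar>::value,
                      "invalid operand in big-integer expression");
        // ... actual evaluation of the expression tree would go here ...
    }

    int main()
    {
        bigint_term a = {{}}, b = {{}};
        evaluate(a + b * a);    // matches the grammar: fine
        // evaluate(a + 2.0);   // would trigger only the static_assert above
    }

With this in place, tacking an invalid '+ 2.0' onto an expression fails on the single static_assert line rather than deep inside the evaluation machinery.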

On Fri, 04 Mar 2011 10:49:25 +0100 Joel Falcou <joel.falcou@lri.fr> wrote:
Here is my review for XInt. Currently I think the library should not be included in boost as a lot of problems are, in my opinion, still to be fixed and the design has to be rethought.
Thank you for the review. I'll answer some of your comments below.
The premises are OK. We have some large integer type that has proper value semantics and can be used in a variety of operations and functions. So far, so good. The main culprit is that the current library design is rather hardwired to work with a choice of implementation the author thought to be OK at some point, but which still has to be proven better than existing solutions.
What existing solution? So far as I can see, no one else has been willing to do the work needed to offer one, only criticize, condemn, and complain about the one I've offered.
The overall design lacks separation between the data and the algorithms, which makes the library difficult to extend: what if I want to take an arbitrary range of integers and look at them as a huge series of bits?
Why would you? Give me a viable use-case, and if it fits the library's design criteria, I'll be happy to try to make it happen.
What if I don't want the extra memory overhead, so I can mmap some files directly into memory (because reading a 10k-bit integer is a no-no)?
Use the to_binary function.
Can I pass the integer type to a standard algorithm if I want to iterate over the bytes it contains, or copy them using boost::range copy?
No, you cannot. Deliberately. If you were able to, then the interface would be permanently stuck at a version-1 state, because if I made any future changes to it, it would break existing client code. In my opinion, the freedom to improve the internal design without breaking client code is far more desirable than giving client code free rein to mess around in its internals.
On the data side, I think being forced to use a given representation is wrong. The ideal would be for the library to provide a fallback implementation but also adaptors to allow any kind of data to be viewed as a large integer.
Your opinion has been noted. The code is not ideal, but it works, it's available, and it can (and will, if not rejected) be improved. If you're waiting for the perfect library to come along before you'll accept it, you're going to be waiting forever.
The definition of a LargeInteger concept would help define such an adapting interface and make the library more fluid. For example, what if I want to implement LCM?
Point A, you don't have to, LCM is already in there. Point B, it can handle essentially any mathematical operation that can be expressed as a combination of existing operations. No one has yet provided any concrete example of a mathematical operation that *can't* be done using the operators and functions already available, and I suspect you'd have to get pretty esoteric to find one.
- There are a lot of unwarranted pass-by-value function prototypes, which seriously impact performance even with COW. Pass-by-reference beats pass-by-value plus COW.
I've already said that I plan to change this, several times.
Moreover, the thread safety of the COW is dubious.
Then you don't understand it.
- The headers are obscenely big. It would be easier to follow the code if there were not these 2k+ line mammoth files.
Yes, I could break it up into 2K one-line files. Somehow I think that would be even harder to follow.
In its current form, the library *should not* be accepted into Boost. Performance claims are not backed up (and other reviewers show it can be far better), the design lacks genericity, and its implementation is, in my opinion, sub-optimal.
Would it be easier on everyone if I just deleted that line, since most people are bikeshedding about it?
To be accepted, the following points should be addressed: [...]
- do real benchmarks and compare against existing state-of-the-art solutions; at least show benchmarks against GMP on realistic use cases.
GMP is a collaboration of dozens of developers, contains years of work, and includes low-level optimizations that go against XInt's design goals. XInt is the product of one developer over five months. Do you measure a child's running speed against that of an Olympic gold medalist? -- Chad Nelson Oak Circle Software, Inc. * * *

Oook... well, first I want to stress that I am not here to get personal with you, Chad. This is a review process, not a crusade. Here are comments on your comments. Consider unanswered comments as being accepted by me as valid.
Why would you? Give me a viable use-case, and if it fits the library's design criteria, I'll be happy to try to make it happen.
Well, let's say I just have some file format storing large integers this way, or I receive large integers over a network in unconnected packets. It's a variation of reading the int value from a string like "45748764465768754564634687314313270".
No, you cannot. Deliberately. If you were able to, then the interface would be permanently stuck at a version-1 state, because if I made any future changes to it, it would break existing client code. In my opinion, the freedom to improve the internal design without breaking client code is far more desirable than giving client code free rein to mess around in its internals.
No. You shield yourself from this by having proper iterator/range types that hide said details of your internals. Code works with std::vector iterators everywhere, and the iterator concept makes it possible for it to keep working even if whatever vendor decides to implement quantum information storage instead of a contiguous one.
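For illustration, a minimal sketch of the read-only digit range being described here (all names hypothetical; this is not XInt's interface): client code iterates over digits through ordinary iterators, so the internal storage can change without breaking callers.

    #include <cstdint>
    #include <vector>

    class big_integer                                    // stand-in for xint::integer_t
    {
        std::vector<std::uint32_t> digits_;              // internal storage, least significant first
    public:
        typedef std::vector<std::uint32_t>::const_iterator const_iterator;

        // Read-only access: exposes the digits, not the container itself.
        const_iterator begin() const { return digits_.begin(); }
        const_iterator end()   const { return digits_.end(); }
    };

    // Generic client code: works with anything modelling the same "range of
    // digits" idea, regardless of how the digits are actually stored.
    template<class DigitRange>
    unsigned count_set_bits(DigitRange const& r)
    {
        unsigned n = 0;
        for (typename DigitRange::const_iterator it = r.begin(); it != r.end(); ++it)
            for (std::uint32_t d = *it; d != 0; d >>= 1)
                n += d & 1u;
        return n;
    }

If the iterator type is later changed to something other than a vector iterator, client code written like count_set_bits keeps compiling as long as the iterator concept is honoured.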
Your opinion has been noted. The code is not ideal, but it works, it's available, and it can (and will, if not rejected) be improved. If you're waiting for the perfect library to come along before you'll accept it, you're going to be waiting forever.
I am not. I am pointing out that the current design still requires concept refinement, so you could: a) propose your current implementation as a basic, free-for-all solution; b) enable mapping to other representations with extension points.
Point A, you don't have to, LCM is already in there.
So I missed it. Sorry.
Point B, it can handle essentially any mathematical operation that can be expressed as a combination of existing operations. No one has yet provided any concrete example of a mathematical operation that *can't* be done using the operators and functions already available, and I suspect you'd have to get pretty esoteric to find one.
OK, so let me rephrase. What if I want to implement a function that has a non-trivial algorithm which is faster than the usual combination of operations?
Then you don't understand it.
Maybe, but I'll wager I do. Another concern is that sharing data like this between threads may lead to memory contention and/or false sharing if the data fits in incomplete cache lines.
Yes, I could break it up into 2K one-line files. Somehow I think that would be even harder to follow.
20 x 100 lines is OK in my book
Would it be easier on everyone if I just deleted that line, since most people are bikeshedding about it?
Yes.
GMP is a collaboration of dozens of developers, contains years of work, and includes low-level optimizations that go against XInt's design goals. XInt is the product of one developer over five months. Do you measure a child's running speed against that of an Olympic gold medalist?
No, I am not; you are, by actually putting the dreaded "it is fast" line in the documentation. Development is not a fire-and-forget process, even more so if you take on a task like this one where people have tons of diverging use cases and expectations. I am not slapping you with a large trout for fun; I am just trying to show you that, if made more generic, you can still have your current implementation and let those people with unusual needs tweak the library from the outside to fulfill them.

On Fri, 04 Mar 2011 16:19:45 +0100 Joel Falcou <joel.falcou@lri.fr> wrote:
Oook... well, first I want to stress that I am not here to get personal with you, Chad. This is a review process, not a crusade.
Yes, and I'm trying to keep that in mind. I apologize for any frustration that leaks out.
Why would you? Give me a viable use-case, and if it fits the library's design criteria, I'll be happy to try to make it happen.
Well, let's say I just have some file format storing large integers this way, or I receive large integers over a network in unconnected packets. It's a variation of reading the int value from a string like "45748764465768754564634687314313270".
And what interface do you envision to allow this?
No, you cannot. Deliberately. If you were able to, then the interface would be permanently stuck at a version-1 state, because if I made any future changes to it, it would break existing client code. In my opinion, the freedom to improve the internal design without breaking client code is far more desirable than giving client code free rein to mess around in its internals.
No. You shield yourself from this by having proper iterator/range types that hide said details of your internals. [...]
Then I misinterpreted your argument, for which I apologize. Yes, I can provide a (read-only) iterator interface to the digit_t values of an integer. I don't think that will limit future development much. I've put it on the to-do list. If you want more than read-only, then even if the library provides it, it will have to be marked as officially unsupported and subject to incompatible changes without warning in future versions.
[...] Point B, it can handle essentially any mathematical operation that can be expressed as a combination of existing operations. No one has yet provided any concrete example of a mathematical operation that *can't* be done using the operators and functions already available, and I suspect you'd have to get pretty esoteric to find one.
OK, so let me rephrase. What if i want to implement a function that has a non trivial algorithm which is faster than the usual combo of operations ?
If such a function isn't use-specific, I assume it would be more at home within the library than as an add-on, and I'd suggest providing a patch if you're willing and able to offer it to the public. If not, then you're right, the only way to do it well would be with read/write access to the internals.
Then you don't understand it.
Maybe, but I'll wager I do. Another concern is that sharing data like this between threads may lead to memory contention and/or false sharing if the data fits in incomplete cache lines.
It isn't meant to be shared between threads. The internal code is single-threaded, and if you enable this on the public interface, you explicitly *dis*able the library's thread safety.
GMP is a collaboration of dozens of developers, contains years of work, and includes low-level optimizations that go against XInt's design goals. XInt is the product of one developer over five months. Do you measure a child's running speed against that of an Olympic gold medalist?
No, I am not; you are, by actually putting the dreaded "it is fast" line in the documentation.
And most people who have commented on it are misinterpreting it as a claim that "it is faster than..." or "it is the fastest." It says neither. That line explains an important design criterion of the library.
Development is not a fire-and-forget process, even more so if you take on a task like this one where people have tons of diverging use cases and expectations. I am not slapping you with a large trout for fun; I am just trying to show you that, if made more generic, you can still have your current implementation and let those people with unusual needs tweak the library from the outside to fulfill them.
It's not an excuse, but I'm very short of sleep and more than a little frustrated. Please forgive the tone of my previous reply. -- Chad Nelson Oak Circle Software, Inc. * * *

On 04/03/11 18:55, Chad Nelson wrote:
Why would you? Give me a viable use-case, and if it fits the library's design criteria, I'll be happy to try to make it happen.
Well, let's say I just have some file format storing large integers this way, or I receive large integers over a network in unconnected packets. It's a variation of reading the int value from a string like "45748764465768754564634687314313270".
And what interface do you envision to allow this?
I have to give it some deep thought so it looks like a valid proposal. I'll come back to you on this.
If you want more than read-only, then even if the library provides it, it will have to be marked as officially unsupported and subject to incompatible changes without warning in future versions.
Read-only is all that is needed, IMHO.
If such a function isn't use-specific, I assume it would be more at home within the library than as an add-on, and I'd suggest providing a patch if you're willing and able to offer it to the public. If not, then you're right, the only way to do it well would be with read/write access to the internals.
Well, look at Spirit or other stuff like that. Having extension points or a proper process to add X or Y is a very fine addition in any lib.
It's not an excuse, but I'm very short of sleep and more than a little frustrated. Please forgive the tone of my previous reply.
Take a break this weekend, sleep well, and everything will be calmer and cleaner for arguing on Monday.

On 04/03/2011 18:55, Chad Nelson wrote:
It isn't meant to be shared between threads. The internal code is single-threaded, and if you enable this on the public interface, you explicitly *dis*able the library's thread safety.
Sharing data between threads is not the same as accessing the same data concurrently from multiple threads.

At Fri, 04 Mar 2011 19:20:43 +0100, Mathias Gaunard wrote:
On 04/03/2011 18:55, Chad Nelson wrote:
It isn't meant to be shared between threads. The internal code is single-threaded, and if you enable this on the public interface, you explicitly *dis*able the library's thread safety.
Sharing data between threads is not the same as accessing the same data concurrently from multiple threads.
In what way are those two things... two things? As far as I can tell they mean exactly the same. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On 04/03/2011 21:15, Dave Abrahams wrote:
At Fri, 04 Mar 2011 19:20:43 +0100, Mathias Gaunard wrote:
On 04/03/2011 18:55, Chad Nelson wrote:
It isn't meant to be shared between threads. The internal code is single-threaded, and if you enable this on the public interface, you explicitly *dis*able the library's thread safety.
Sharing data between threads is not the same as accessing the same data concurrently from multiple threads.
In what way are those two things... two things? As far as I can tell they mean exactly the same.
I guess that statement doesn't really say what I meant it to say ;). I meant to say that there is no need for the primitives of the resource to be robust with regards to concurrent execution for the resource to be shared across threads, since of course the threads can be synchronized at a higher level to ensure that concurrent execution of any of the primitives does not happen.
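As a minimal sketch of that higher-level synchronisation (the big_int type here is a hypothetical stand-in, not XInt): the type itself does nothing for thread safety, yet it can be shared across threads because every access goes through one mutex, so its primitives never execute concurrently.

    #include <mutex>
    #include <thread>

    struct big_int { long v; void add(long x) { v += x; } };  // not internally thread-safe

    big_int    shared_value = {0};        // shared across threads
    std::mutex shared_value_mutex;        // higher-level synchronisation

    void worker()
    {
        std::lock_guard<std::mutex> lock(shared_value_mutex);
        shared_value.add(42);             // primitives never run concurrently
    }

    int main()
    {
        std::thread t1(worker), t2(worker);
        t1.join();
        t2.join();
    }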

At Fri, 04 Mar 2011 23:09:12 +0100, Mathias Gaunard wrote:
On 04/03/2011 21:15, Dave Abrahams wrote:
At Fri, 04 Mar 2011 19:20:43 +0100, Mathias Gaunard wrote:
On 04/03/2011 18:55, Chad Nelson wrote:
It isn't meant to be shared between threads. The internal code is single-threaded, and if you enable this on the public interface, you explicitly *dis*able the library's thread safety.
Sharing data between threads is not the same as accessing the same data concurrently from multiple threads.
In what way are those two things... two things? As far as I can tell they mean exactly the same.
I guess that statement doesn't really say what I meant it to say ;).
I meant to say that there is no need for the primitives of the resource to be robust with regards to concurrent execution for the resource to be shared across threads, since of course the threads can be synchronized at a higher level to ensure that concurrent execution of any of the primitives does not happen.
+1, thanks for clarifying. -- Dave Abrahams BoostPro Computing http://www.boostpro.com

On Friday, March 04, 2011 18:19:45 Joel Falcou wrote:
No. You shield yourself from this by having proper iterator/range types that hide said details of your internals. Code works with std::vector iterators everywhere, and the iterator concept makes it possible for it to keep working even if whatever vendor decides to implement quantum information storage instead of a contiguous one.
I believe that std::vector is required, these days, to implement contiguous storage. Presumably, to take some flexibility from you ;-) - Volodya

On 25/03/11 09:02, Vladimir Prus wrote:
I believe that std::vector is required, these days, to implement contiguous storage. Presumably, to take some flexibility from you ;-)
Well, yeah, it is now required. Let's pretend s/vector/std::something_else :p — the iterator/range abstraction is still valid.

By the way, I have an idea which would easily allow exposing the internals of XInt to satisfy the demands of some reviewers. Say we have a set of different backends; let's call the current backend "sign_magnitude_beta". Then everything can be exposed in a namespace hierarchy like this: xint::backend::sign_magnitude_beta::some_entity or xint::sign_magnitude_beta::backend::some_entity. Then, when Chad decides to switch to a different (e.g. faster) backend, which may be called "sign_magnitude_v1", we get xint::backend::sign_magnitude_v1 _along_ with that "sign_magnitude_beta" on which some user code may be dependent. This way the chance that the XInt lib breaks compatibility is almost zero, and simultaneously it doesn't limit the development in any way. Eventually, in some years, the old (expired) backends may be removed. -- Pavel P.S. If you notice a grammar mistake or weird phrasing in my message, please point it out.
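A minimal sketch of the layout Pavel describes (all names hypothetical): each backend lives in its own namespace and the front-end type is parameterised on it, so code pinned to an old backend keeps compiling when a newer one becomes the default.

    namespace xint {
    namespace backend {
        namespace sign_magnitude_beta { struct digits { /* current storage */ }; }
        namespace sign_magnitude_v1   { struct digits { /* faster storage  */ }; }
    } // namespace backend

    // Front-end parameterised on the backend; new code gets the newest default.
    template<class Backend = backend::sign_magnitude_v1::digits>
    class integer_t
    {
        Backend data_;
        // operations written against the backend's interface ...
    };
    } // namespace xint

    // User code that depends on the old backend keeps compiling unchanged:
    typedef xint::integer_t<xint::backend::sign_magnitude_beta::digits> legacy_integer;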

On 03/04/2011 03:49 AM, Joel Falcou wrote:
The premises are OK. We have some large integer type that has proper value semantics and can be used in a variety of operations and functions. So far, so good. The main culprit is that the current library design is rather hardwired to work with a choice of implementation the author thought to be OK at some point, but which still has to be proven better than existing solutions.
The overall design lacks separation between the data and the algorithms, which makes the library difficult to extend
To the extent that Boost is a proving ground for things suitable for possible inclusion in a future C++ standard, it seems to me that a review of something like XInt (in particular due to its general utility) ought to focus on the quality of the interface rather than a specific implementation. If the interface really does prescribe the implementation with a great deal of specificity, then in my view it should not be adopted as a library.
: what if I want to take an arbitrary range of integers and look at them as a huge series of bits? What if I don't want the extra memory overhead, so I can mmap some files directly into memory (because reading a 10k-bit integer is a no-no)? Can I pass the integer type to a standard algorithm if I want to iterate over the bytes it contains, or copy them using boost::range copy?
Or stream the bits lazily, generating them as they are needed?
- use expression templates to handle efficient combination of operations and potentially detect optimisable patterns. Proto is a shoo-in for this.
Is that not prescribing an implementation? A bigint facility that strained the limits of my compiler and pushed compile times through the roof is not useful to me.
- the algorithm should be more generic and not be tied to the concrete type of large integer used.
What about just working out a bigint concept with a straightforward implementation that permitted, but did not require, expression templates? XInt could be adapted to serve as a reference implementation and a frontend with selectable back-ends. - Marsh

On 04/03/11 16:20, Marsh Ray wrote:
To the extent that Boost is a proving ground for things suitable for possible inclusion in a future C++ standard, it seems to me that a review of something like XInt (in particular due to its general utility) ought to focus on the quality of the interface rather than a specific implementation.
I agree
If the interface really does prescribe the implementation with a great deal of specificity, then in my view it should not be adopted as a library.
I disagree. People also see Boost as a repository of state-of-the-art ways of implementing things. A library like XInt comes with a priori expectations of performance which, currently, it does not deliver.
Is that not prescribing an implementation?
It is. Now, I don't see why I should not express an opinion on how to do something if said specification actually helps the overall quality of the library. It is as prescribing as asking to use the n·log(n)·log(n) algorithm instead of the trivial one for operator*.
A bigint facility that strained the limits of my compiler and pushed compile times through the roof is not useful to me.
Are you compiling with MSVC6 or g++-2.95.2? Sarcasm aside, compile time is a problem for compiler vendors.
What about just working out a bigint concept with a straightforward implementation that permitted, but did not require, expression templates?
Expression enabled code is not something you add on top of another design like this. It is a concrete component of the design from the start.

On 04/03/2011 16:51, Joel Falcou wrote:
Expression enabled code is not something you add on top of another design like this. It is a concrete component of the design from the start.
I'm not so sure of that, can't it be built as a separate layer? I have to agree with Marsh's comment here; expression templates can be a problem for standardization or widespread use. I'm fine with a bigint library that doesn't do expression templates, as long as I can use it for the evaluation of my expressions.

On 04/03/11 17:27, Mathias Gaunard wrote:
On 04/03/2011 16:51, Joel Falcou wrote:
Expression enabled code is not something you add on top of another design like this. It is a concrete component of the design from the start.
I'm not so sure of that, can't it be built as a separate layer?
You have to have your data structures and the terminals on top of them, yes. What I meant is that ET-enabled code may require some internals that non-ET code does not.
I have to agree with Marsh's comment here; expression templates can be a problem for standardization or widespread use.
valarray is like the epitome of expression templates used in the standard... so for me it doesn't hold.

----- "Joel Falcou" <joel.falcou@lri.fr> a écrit :
What about just working out a bigint concept with a straightforward implementation that permitted, but did not require, expression templates?
Expression enabled code is not something you add on top of another design like this. It is a concrete component of the design from the start.
If you bring expression-enabled arithmetic algorithms, what prevents me from feeding them builtin integer types or the current xint::integer_t? Ivan

On 03/04/2011 09:51 AM, Joel Falcou wrote:
On 04/03/11 16:20, Marsh Ray wrote:
If the interface really does prescribe the implementation with a great deal of specificity, then in my view it should not be adopted as a library.
I disagree. People also see Boost as a repository of state-of-the-art ways of implementing things. A library like XInt comes with a priori expectations of performance which, currently, it does not deliver.
Fair enough.
Is that not prescribing an implementation?
It is. Now, I don't see why I should not express an opinion on how to do something if said specification actually helps the overall quality of the library. It is as prescribing as asking to use the n·log(n)·log(n) algorithm instead of the trivial one for operator*.
Firstly, I believe in you expressing your opinion and value it. But it'd still be nice to have a standard bigint facility in C++ or Boost. But C++ seems like one of the last high-level languages without basic bigints at this point. We have regexes now for heaven's sake, integers were only conceived a few thousand years before! I'm concerned that perfect is the enemy of good enough here. Is it a requirement for Boost that every new library be state-of-the-art in its use of compile time polymorphism?
A bigint facility that strained the limits of my compiler and pushed compile times through the roof is not useful to me.
Are you compiling with MSVC6 or g++-2.95.2 ?
MSVC 8.0 GCC 4.4
Sarcasm aside, compile time is a problem for compiler vendors.
It's a problem for me, the poor developer, when I have to use the thing. Or not if I don't use it. But if no developers use it, the vendors don't make it a priority. Lots of numeric code is still written in C and Fortran. Whether that's because of the compiler or the mind, the abstraction penalty is real.
Expression enabled code is not something you add on top of another design like this. It is a concrete component of the design from the start.
OK. Maybe expression templates aren't a big issue, I think I'm using them happily with GMP's C++ interface. But I'd much rather be using something in boost:: or std::trX::. - Marsh

On 04/03/11 22:17, Marsh Ray wrote:
Firstly, I believe in you expressing your opinion and value it.
Thanks :)
But it'd still be nice to have a standard bigint facility in C++ or Boost. But C++ seems like one of the last high-level languages without basic bigints at this point. We have regexes now for heaven's sake, integers were only conceived a few thousand years before!
I'm concerned that perfect is the enemy of good enough here.
And I agree. Now, can we consider that the proposal fits into "good enough"?
Is it a requirement for Boost that every new library be state-of-the-art in its use of compile time polymorphism?
Maybe, maybe not. Now, I remind you that Phoenix was delayed in being accepted into Boost the first time, the consensus being it had to use Proto to be accepted.
MSVC 8.0 GCC 4.4
Well, at what point do you start feeling "it compiles for too long"? Template-heavy code is like other code: it has to be well written to be fast, and here, fast to compile.
It's a problem for me, the poor developer, when I have to use the thing. Or not if I don't use it. But if no developers use it, the vendors don't make it a priority. Lots of numeric code is still written in C and Fortran. Whether that's because of the compiler or the mind, the abstraction penalty is real.
It is real when said abstraction is used willy-nilly.
OK. Maybe expression templates aren't a big issue, I think I'm using them happily with GMP's C++ interface. But I'd much rather be using something in boost:: or std::trX::.
Well, if XInt actually provided range based interface and an extension mechanism for external big int representation, we could have a good start.

On 03/04/2011 03:22 PM, Joel Falcou wrote:
On 04/03/11 22:17, Marsh Ray wrote:
I'm concerned that perfect is the enemy of good enough here.
And I agree. Now, can we consider that the proposal fits into "good enough"?
I see two dimensions there: interface and implementation. Obviously interface is the place to expend a project's limited capacity to endure perfectionism. While implementations can improve beneath a stable interface, you seem to be saying that it's "version 1.0 or never" for support of expression templates because they bundle the logic required for efficiency into the interface itself. It's been "never" so far for any standard C++ bigint facility. Perhaps two separate libraries are worth considering: one with old-fashioned objects and overloaded operators enhanced with rvalue references delivered soon, and another with all the compile-time magic it can stand to become available at some unspecified point in the future.
Is it a requirement for Boost that every new library be state-of-the-art in its use of compile time polymorphism?
Maybe, maybe not. Now, I remind you that Phoenix was delayed in being accepted into Boost the first time, the consensus being it had to use Proto to be accepted.
I promise to look at this Proto again soon.
MSVC 8.0 GCC 4.4
Well, at what point do you start feeling "it compiles for too long"? Template-heavy code is like other code: it has to be well written to be fast, and here, fast to compile.
I admit I'm not terribly patient. I have some mature, stable projects that take several seconds per translation unit to compile. They crept up that way over time. But those projects are in maintenance mode and I don't have to add much creativity to them very often. I can handle it. But if I sit down to experiment, starting with a brand new hello world project, and then I #include a header file that puts a noticeable pause in my modify-compile-run cycle, then I am really reluctant to use it. If it adds time to hello world, how can I predict what it will cost if that project grows large? When the pauses creep over a few seconds, I start multitasking, and task switching then imposes a disproportionate penalty. Thus I tend to favor Lex/Yacc over Spirit and possibly would stick with GMP over a proto-based solution. I do not pretend that this is an objective criticism of a library's design, only that it's what works best for me. But I'm sure I'm not the only one who evaluates their libraries that way.
It's a problem for me, the poor developer, when I have to use the thing. Or not if I don't use it. But if no developers use it, the vendors don't make it a priority. Lots of numeric code is still written in C and Fortran. Whether that's because of the compiler or the mind, the abstraction penalty is real.
It is real when said abstraction is used willy-nilly.
At the end of the day, it's all Turing-equivalent. There remain C and Fortran programmers who haven't been converted by the actual performance gains theoretically enabled by template compile-time polymorphism. Future supercomputers are as likely to prefer some derivative of OpenCL or CUDA as they are C++. I think 'willy-nilly' makes a wonderful bottom-line standard to avoid, but a metaprogramming-heavy design still seems very much a matter of personal style and preference. In my opinion (which I'm willing to change as long as it doesn't add too much to my compile times :-).
Well, if XInt actually provided range based interface and an extension mechanism for external big int representation, we could have a good start.
There we go. What would it look like? Adapters on standard container<integral type> concepts? How long until I can use it at a beta-level of interface stability? - Marsh

On 04/03/11 23:35, Marsh Ray wrote:
While implementations can improve beneath a stable interface, you seem to be saying that it's "version 1.0 or never" for support of expression templates because they bundle the logic required for efficiency into the interface itself.
It requires some adaptation and planning to fully see where the laziness has to play a role.
It's been "never" so far for any standard C++ bigint facility.
Perhaps two separate libraries are worth considering: one with old-fashioned objects and overloaded operators enhanced with rvalue references delivered soon, and another with all the compile-time magic it can stand to become available at some unspecified point in the future.
Maybe.
I admit I'm not terribly patient.
I have some mature, stable projects that take several seconds per translation unit to compile. They crept up that way over time. But those projects are in maintenance mode and I don't have to add much creativity to them very often. I can handle it.
But if I sit down to experiment, starting with a brand new hello world project, and then I #include a header file that puts a noticeable pause in my modify-compile-run cycle, then I am really reluctant to use it. If it adds time to hello world, how can I predict what it will cost if that project grows large?
When the pauses creep over a few seconds, I start multitasking and task switching then imposes a disproportionate penalty.
Well, I can see that. But what if the compilation cost comes with some guaranteed performance?
Thus I tend to favor Lex/Yacc over Spirit and possibly would stick with GMP over a proto-based solution.
Even for performance improvements or better code maintenance?
At the end of the day, it's all Turing-equivalent. There remain C and Fortran programmers who haven't been converted by the actual performance gains theoretically enabled by template compile-time polymorphism. Future supercomputers are as likely to prefer some derivative of OpenCL or CUDA as they are C++.
Or not, really...
What would it look like? Adapters on standard container<integral type> concepts? How long until I can use it at a beta-level of interface stability?
Depends on the author, really. As I said, I am not an expert in this very specific area, but as I understand it, it is a matter of changing the granularity of the elements used to store the pieces of the bigger int and how they are stored in memory. If this part of the interface could be made generic enough, we could even have bigint implemented on top of a fusion vector, an mpl sequence of compile-time integers, or a SIMD vector.

Marsh Ray wrote:
On 03/04/2011 03:22 PM, Joel Falcou wrote:
On 04/03/11 22:17, Marsh Ray wrote:
I'm concerned that perfect is the enemy of good enough here.
And I agree. Now, can we consider that the proposal fits into "good enough"?
Perhaps two separate libraries are worth considering: one with old-fashioned objects and overloaded operators enhanced with rvalue references delivered soon, and another with all the compile-time magic it can stand to become available at some unspecified point in the future.
+1
Well, at what point do you start feeling "it compiles for too long"? Template-heavy code is like other code: it has to be well written to be fast, and here, fast to compile.
If I sit down to experiment, starting with a brand new hello world project, and then I #include a header file that puts a noticeable pause in my modify-compile-run cycle, then I am really reluctant to use it. If it adds time to hello world, how can I predict what it will cost if that project grows large?
When the pauses creep over a few seconds, I start multitasking and task switching then imposes a disproportionate penalty.
Thus I tend to favor Lex/Yacc over Spirit and possibly would stick with GMP over a proto-based solution.
This is a legitimate concern, but don't forget the benefits that accrue to single tool solutions. The extra effort to create and maintain a Lex/Yacc solution is a drag on productivity, too.
I think 'willy-nilly' makes a wonderful bottom line standard to avoid, but a metaprogramming-heavy design still seems very much a matter of personal style and preference.
Indeed and it is easy to overuse it. In part it comes from a sense of superiority as relatively few understand it well. That's not to say there aren't real benefits from TMP.
Well, if XInt actually provided range based interface and an extension mechanism for external big int representation, we could have a good start.
What would it look like? Adapters on standard container<integral type> concepts?
Even with such a design, there's need for an xint::integer type to make its use simple when the flexibility isn't needed.
How long until I can use it at a beta-level of interface stability?
Exactly. I think xint::integer can be accepted and later layered atop what Joel describes. Accepting XInt doesn't need to preclude the generic design. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

AMDG On 03/04/2011 01:22 PM, Joel Falcou wrote:
On 04/03/11 22:17, Marsh Ray wrote:
Is it a requirement for Boost that every new library be state-of-the-art in its use of compile time polymorphism?
Maybe, maybe not. Now, I remind you that Phoenix was delayed in being accepted into Boost the first time, the consensus being it had to use Proto to be accepted.
Phoenix is a very different case. a) We already had Lambda which provides the same basic functionality. b) The choice for phoenix was between using expression templates with proto or without proto. The choice here is between using expression templates with proto and not using expression templates at all. In Christ, Steven Watanabe

Steven Watanabe wrote:
AMDG
On 03/04/2011 01:22 PM, Joel Falcou wrote:
On 04/03/11 22:17, Marsh Ray wrote:
Is it a requirement for Boost that every new library be state-of-the-art in its use of compile time polymorphism?
Maybe, maybe not. Now, I remind you that Phoenix was delayed in being accepted into Boost the first time, the consensus being it had to use Proto to be accepted.
Phoenix is a very different case.
a) We already had Lambda which provides the same basic functionality. b) The choice for phoenix was between using expression templates with proto or without proto. The choice here is between using expression templates with proto and not using expression templates at all.
This is a serious question that I'm posing with all humility. What benefit does proto provide in this case? What do we "get" for using it, and what does it "cost"? Can you sketch a comparison between an expression template operator+ on some concrete big int type both with and without use of Proto and walk us through how the use of proto is helpful and worthwhile? I can already reason that compile time with Proto will be higher than without, though I would be convinced by benchmarks to the contrary. We don't need concept checking in this case, so what does proto buy us? Regards, Luke

On 09/03/11 19:14, Simonson, Lucanus J wrote:
This is a serious question that I'm posing with all humility. What benefit does proto provide in this case? What do we "get" for using it, and what does it "cost"? Can you sketch a comparison between an expression template operator+ on some concrete big int type both with and without use of Proto and walk us through how the use of proto is helpful and worthwhile? I can already reason that compile time with Proto will be higher than without, though I would be convinced by benchmarks to the contrary. We don't need concept checking in this case, so what does proto buy us?
The fact that you don't have to maintain clunky and possibly invasive ET core code. Writing the operators to be all ET-enabled is a real mess and, seriously, once you have done it a couple of times by hand and done it once with Proto, you don't want to go back.
Moreover, the concepts of grammar and transform make your ET intent clearer. You can have multiple transforms aimed at different passes in your AST evaluation and have them cleanly separated from the main ET core code. Transforms are also interchangeable, can be applied in a variety of contexts, and support a nice and intuitive recursive matching notation. As for applications, matching a pattern of successive operators is a couple of lines in Proto and is library-agnostic. Writing a pattern matcher for arbitrary hand-written ET code is not that trivial. As for compile times, it's always the same: if you write crap Proto, it will be crap. All my Proto-based code has an insignificant upfront compile overhead, and, well, it comes from the preprocessing around it and not from Proto itself.

Joel Falcou wrote:
On 09/03/11 19:14, Simonson, Lucanus J wrote:
This is a serious question that I'm posing with all humility. What benefit does proto provide in this case? What do we "get" for using it, and what does it "cost"? Can you sketch a comparison between an expression template operator+ on some concrete big int type both with and without use of Proto and walk us through how the use of proto is helpful and worthwhile? I can already reason that compile time with Proto will be higher than without, though I would be convinced by benchmarks to the contrary. We don't need concept checking in this case, so what does proto buy us?
The fact that you don't have to maintain clunky and possibly invasive ET core code. Writing the operators to be all ET-enabled is a real mess and, seriously, once you have done it a couple of times by hand and done it once with Proto, you don't want to go back.
Moreover, the concepts of grammar and transform make your ET intent clearer. You can have multiple transforms aimed at different passes in your AST evaluation and have them cleanly separated from the main ET core code. Transforms are also interchangeable, can be applied in a variety of contexts, and support a nice and intuitive recursive matching notation.
As for applications, matching a pattern of successive operators is a couple of lines in Proto and is library-agnostic. Writing a pattern matcher for arbitrary hand-written ET code is not that trivial. As for compile times, it's always the same: if you write crap Proto, it will be crap. All my Proto-based code has an insignificant upfront compile overhead, and, well, it comes from the preprocessing around it and not from Proto itself.
Granted, proto has a number of very nice features. The question is, would xint benefit from them? Does this library require transforms and/or multiple passes? I certainly hope not. The default operator precedence happens to be the one we want in this case (as far as I understand), so what is the point of defining a grammar for operator expressions to achieve what would happen by default without one? Please correct me if I am wrong. Regards, Luke

Granted, proto has a number of very nice features. The question is, would xint benefit from them? Does this library require transforms and/or multiple passes? I certainly hope not. The default operator precedence happens to be the one we want in this case (as far as I understand), so what is the point of defining a grammar for operator expressions to achieve what would happen by default without one?
Please correct me if I am wrong.
For me the interest is in function-composition optimization. Let's say you have the ln and gamma functions operating on your type (silly example, but take any two functions whose composition ends up needing a large number of bits for the intermediate result). Calling ln(gamma(x)) in a regular library will give you a quite large value (and require lots of bits), and then you'll shrink it through the log, having allocated new bits for no real gain. Now make it Proto-based and have a rule matching the tree (ln -- gamma -- x) and transforming it into gammaln(x), a better, less bit-hungry version. Fewer bits allocated, more precise result. This is one example, and I think a substantial number can be found. Now, do it properly so you can add special cases like that from the outside, and you have an extensible optimization point.
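Here is a minimal hand-rolled sketch of that rewrite (ln, gamma and the var type are hypothetical stand-ins, with double playing the role of a big number); Proto would express the same thing as a grammar plus a transform, but the mechanism is visible even without it: ln(gamma(x)) is captured lazily, and a more specialised overload evaluates it via lgamma(x), skipping the huge intermediate value.

    #include <cmath>
    #include <iostream>

    struct var { double v; };                         // stand-in for a big-number type

    template<class E> struct ln_expr    { E e; };     // node representing ln(e)
    template<class E> struct gamma_expr { E e; };     // node representing gamma(e)

    template<class E> ln_expr<E>    ln(E e)    { return ln_expr<E>{e}; }
    template<class E> gamma_expr<E> gamma(E e) { return gamma_expr<E>{e}; }

    // Naive evaluation: compose the functions exactly as written.
    double eval(var x) { return x.v; }
    template<class E> double eval(gamma_expr<E> e) { return std::tgamma(eval(e.e)); }
    template<class E> double eval(ln_expr<E> e)    { return std::log(eval(e.e)); }

    // Pattern-specific overload: ln(gamma(x)) is rewritten to lgamma(x).
    template<class E> double eval(ln_expr<gamma_expr<E> > e) { return std::lgamma(eval(e.e.e)); }

    int main()
    {
        var x = {500.0};
        // The naive path would compute tgamma(500.0), which overflows a double;
        // the rewritten path computes lgamma(500.0) directly and stays finite.
        std::cout << eval(ln(gamma(x))) << '\n';
    }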

Joel Falcou wrote:
Granted, proto has a number of very nice features. The question is, would xint benefit from them? Does this library require transforms and/or multiple passes? I certainly hope not. The default operator precedence happens to be the one we want in this case (as far as I understand), so what is the point of defining a grammar for operator expressions to achieve what would happen by default without one?
Please correct me if I am wrong.
For me the interest is in function-composition optimization.
Let's say you have the ln and gamma functions operating on your type (silly example, but take any two functions whose composition ends up needing a large number of bits for the intermediate result).
Calling ln(gamma(x)) in a regular library will give you a quite large value (and require lots of bits), and then you'll shrink it through the log, having allocated new bits for no real gain.
Now make it Proto-based and have a rule matching the tree (ln -- gamma -- x) and transforming it into gammaln(x), a better, less bit-hungry version. Fewer bits allocated, more precise result.
This is one example, and I think a substantial number can be found. Now, do it properly so you can add special cases like that from the outside, and you have an extensible optimization point.
I agree that something like that could be done, and there are plenty of other AST transformations we would make to improve even simple arithmetic expressions written by the user, but I'm not convinced we should. If the user is able to write the optimal expression himself then my view is that a library does not need to optimize the user's suboptimal expressions. If the user sees a problem with the expression he wrote (uses too much memory, runs too long) he can fix it himself. As a user I would not expect the library to optimize my expressions for me, nor would I trust it to do so even if it claimed in its documentation that it does unless I understood the details of what it can or cannot do, which isn't something I would want to invest time in. We need to set a reasonable expectation for what the library should do. These exact same types of optimizations could be applied to my Polygon library expression templates, but I judged it to be not worthwhile. The common case for use of the expression template is a single operation and even in complex expressions that don't get broken across multiple statements the opportunity for improving on what was written is the exception, not the rule, and easily done manually by the user. Implementing a compiler as a template metaprogram is exactly what I want to avoid in this case. Every user will pay the compilation cost imposed by the abstraction, while few will benefit and those most in need of benefit are the ones who will optimize their expression by hand and benefit least. Regards, Luke

I agree that something like that could be done, and there are plenty of other AST transformations we would make to improve even simple arithmetic expressions written by the user, but I'm not convinced we should. If the user is able to write the optimal expression himself then my view is that a library does not need to optimize the user's suboptimal expressions. If the user sees a problem with the expression he wrote (uses too much memory, runs too long) he can fix it himself. As a user I would not expect the library to optimize my expressions for me, nor would I trust it to do so even if it claimed in its documentation that it does unless I understood the details of what it can or cannot do, which isn't something I would want to invest time in.
Heh, compositional replacement is far from hard to expect.
We need to set a reasonable expectation for what the library should do. These exact same types of optimizations could be applied to my Polygon library expression templates, but I judged it to be not worthwhile. The common case for use of the expression template is a single operation and even in complex expressions that don't get broken across multiple statements the opportunity for improving on what was written is the exception, not the rule, and easily done manually by the user. Implementing a compiler as a template metaprogram is exactly what I want to avoid in this case. Every user will pay the compilation cost imposed by the abstraction, while few will benefit and those most in need of benefit are the ones who will optimize their expression by hand and benefit least.
I never said that; it doesn't do any meta-compilation, just rearranging of trees before evaluation. And again, your fears of compilation cost are vastly exaggerated... The future of library design is exactly the opposite: lifting details out of the user's hands and implementing them as active mechanisms running at compile time or runtime depending on the grain and scope of said operation. The more intentional and wrapped an interface can be, the better. We now have the tools to make the thingy implement itself automatically depending on a large scope of factors, so I really don't see the point of pushing this design forward. Are we Boost, the state-of-the-art C++ library, or Boost, the overly conservative library?

AMDG On 03/09/2011 12:17 PM, Joel Falcou wrote:
Are we Boost, the state-of-the-art C++ library, or Boost, the overly conservative library?
It really isn't necessary to use every fancy tool just because it's there. There is absolutely nothing wrong with plain old value types. In Christ, Steven Watanabe

Joel Falcou wrote:
I agree that something like that could be done, and there are plenty of other AST transformations we would make to improve even simple arithmetic expressions written by the user, but I'm not convinced we should. If the user is able to write the optimal expression himself then my view is that a library does not need to optimize the user's suboptimal expressions. If the user sees a problem with the expression he wrote (uses too much memory, runs too long) he can fix it himself. As a user I would not expect the library to optimize my expressions for me, nor would I trust it to do so even if it claimed in its documentation that it does unless I understood the details of what it can or cannot do, which isn't something I would want to invest time in.
Heh, compositional replacement is far from hard to expect.
We need to set a reasonable expectation for what the library should do. These exact same types of optimizations could be applied to my Polygon library expression templates, but I judged it to be not worthwhile. The common case for use of the expression template is a single operation and even in complex expressions that don't get broken across multiple statements the opportunity for improving on what was written is the exception, not the rule, and easily done manually by the user. Implementing a compiler as a template metaprogram is exactly what I want to avoid in this case. Every user will pay the compilation cost imposed by the abstraction, while few will benefit and those most in need of benefit are the ones who will optimize their expression by hand and benefit least.
I never said that; it doesn't do any meta-compilation, just rearranging of trees before evaluation. And again, your fears of compilation cost are vastly exaggerated...
The future of library design is exactly the opposite: lifting details out of the user's hands and implementing them as active mechanisms running at compile time or runtime depending on the grain and scope of said operation. The more intentional and wrapped an interface can be, the better. We now have the tools to make the thingy implement itself automatically depending on a large scope of factors, so I really don't see the point of pushing this design forward. Are we Boost, the state-of-the-art C++ library, or Boost, the overly conservative library?
I agree with you about the future of library development, but I'm not convinced that the future is now. If we had C++ concepts and (implicit) access to optimization algorithms implemented by the compiler for types that model concepts the compiler knows how to optimize *AND* the ability to write libraries that define new concepts and implement optimizations thereof to extend the compiler, then I would say the future is now. Right now it looks like the future is C++3x, ratified in 2041. So do we implement constant propagation and common subexpression elimination in a general way using proto to optimize expressions of multi-precision integers that are confined to a single statement? Constant propagation would require we implement infinite-precision integer algorithms at compile time. We probably don't want to do that, so let's just take common subexpression elimination. Oops, we can't really tell if the arguments of an expression are the same or not in our template metaprogram, only what their types are, so we can't even implement common subexpression elimination. I guess you're right, we can't implement a compiler using proto and would be wise not to try in the instances where it is technically possible but highly impractical. What we are left with is the ability to implement only a patchwork of half measures that prove that that sort of thing can be done, but that wouldn't really add much value to the library. As a library user I'm not interested in the technical reason why it can't implement basic optimization passes, only that it fails to do the obvious optimizations that I could easily do manually. What you need to do this right in C++ as we have it now is a JIT compiler implemented in a library, not proto. While we are at it, we can use that JIT compiler to SSE-accelerate the algorithms and then dispatch them to many cores or even the GPU. We may be Boost, but we should know our limitations. Regards, Luke

As I don't like words being put in my mouth, I'll leave the discussion here. More can be done than these contrived examples that will obviously fail, but I have other stuff to do than arguing on the internet. So, I restate my position: I still think an ET-enabled XInt has added value. Now, if writing ET generators makes you all fuzzy inside, then great; I'll keep being productive by not rewriting them every Friday afternoon.

On Friday, March 04, 2011 04:20:33 PM Marsh Ray wrote:
A bigint facility that strained the limits of my compiler and pushed compile times through the roof is not useful to me.
Can't resist: http://imgs.xkcd.com/comics/compiling.png
participants (13)
- Chad Nelson
- Christopher Jefferson
- Dave Abrahams
- Ivan Le Lann
- Joel Falcou
- Marsh Ray
- Mathias Gaunard
- pavel
- Simonson, Lucanus J
- Steven Watanabe
- Stewart, Robert
- Thomas Heller
- Vladimir Prus