
The storage of these types is just the actual numeric data, so I should be able to do binary I/O on them etc., which is good. I would perhaps be a bit more confident of this if it were just an array of int32s or int64s, rather than the current recursive organisation.
I understand your trepidation, but really there should be no reason why binary I/O would not work, provided that the bottom-level types are built-in integers (or equivalent) of the word size or greater. If smaller integer types are mixed in, the compiler could add padding. There is also the question of big versus little endian, which will affect the declaration order needed.
I think chasing binary I/O is a fool's errand - if you expect the result to be compatible with built-in integers, that is:
1) There's no guarantee that built-in types are two's complement.
2) Integers larger than a single register are synthesised by the compiler - in effect these are compiler-supplied "big ints" - so both the byte order and the limb order are completely arbitrary. For example, on a 16-bit processor (the simplest case), there are two choices for the byte order within a 16-bit word, and then another two choices for the word order within a compiler-synthesised 32-bit type.
Unless you're going to write and test code that handles all the possibilities, and figure out how to configure it so it does the right thing on every platform (including all the weird 8- and 16-bit embedded platforms out there), it's a non-starter. If you're sensible and don't shoot for that, then there's no difference between the Multiprecision fixed ints and 2's complement ones: both are PODs and can be treated as a platform- and compiler-specific "bag of bits".
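To make the "bag of bits" point concrete, here is a minimal sketch - fixed_uint256 and its four-limb array are assumptions for illustration, not the library's actual layout. On a single platform and compiler you can dump and reload the raw bytes, but the resulting file is only meaningful on that same platform:

    #include <array>
    #include <cstdint>
    #include <istream>
    #include <ostream>

    // Hypothetical stand-in for a packed 256-bit fixed int: four 64-bit limbs
    // and nothing else.  Limb order, and byte order within each limb, are
    // whatever the platform and compiler happen to give you.
    struct fixed_uint256
    {
        std::array<std::uint64_t, 4> limbs;
    };

    // Raw "bag of bits" I/O: round-trips correctly on the same platform and
    // compiler, but the file format is not portable - a big-endian target, or
    // one with a different limb order, will read back garbage.
    void save(const fixed_uint256& x, std::ostream& out)
    {
        out.write(reinterpret_cast<const char*>(&x), sizeof x);
    }

    fixed_uint256 load(std::istream& in)
    {
        fixed_uint256 x;
        in.read(reinterpret_cast<char*>(&x), sizeof x);
        return x;
    }

A portable file format would instead have to pick a byte and limb order and convert explicitly on the way in and out, which is exactly the "write and test all the possibilities" work described above.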
You're using 2's complement, which I believe is the right choice.
Agreed. It both makes the calculations simpler and makes large_int closer to the built-in types.
Not so fast: I started off using 2's complement arithmetic for the fixed integer types and was persuaded to change to sign-magnitude by some tests the Boost.Polygon guys came up with. Consider multiplication - for any sensible implementation for smallish ints the complexity is O(N^2). For the Multiprecision lib's fixed ints, N is the number of limbs that are actually used; for a packed 2's complement type, N is the total number of limbs in the representation. That means that:
* For the Multiprecision lib, the runtime of multiplication is independent of the size of the type - it depends only on the magnitude of the number being stored - so for example 2 * 3 takes (roughly) the same time whatever the size of the type used, whether it's a 128-, 256- or 2048-bit int.
* In contrast, for a packed 2's complement type the complexity of multiplication and division grows steeply with increasing bit size.
Now having said that, the sign-magnitude version is more complex, and more expensive to construct and copy - not by much - but it shows up in tests such as the Delaunay code that Phil posted, where a fair proportion of the time is spent constructing the large ints just for one quick calculation. It's an important use case though, and one I need to come back to.
BTW, I've mentioned this a couple of times but it seems to have slipped under the radar - the Multiprecision lib already has a packed 2's complement fixed int back end; it's just that I removed it for the review because I didn't think it added much.
<Shrug> I guess.
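For reference, the complexity argument above is about textbook schoolbook multiplication, which in limb terms looks roughly like this. This is a sketch, not the library's code; the unsigned __int128 partial product is a GCC/Clang extension:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Schoolbook multiplication over 64-bit limbs, O(N*M) in the number of
    // limbs passed in.  For a sign-magnitude type, N and M are the limbs in
    // use (so 2 * 3 is a one-limb-by-one-limb multiply regardless of the
    // declared width); for a packed 2's complement type they are the full
    // width of the representation.
    std::vector<std::uint64_t> multiply(const std::vector<std::uint64_t>& a,
                                        const std::vector<std::uint64_t>& b)
    {
        std::vector<std::uint64_t> result(a.size() + b.size(), 0);
        for (std::size_t i = 0; i < a.size(); ++i)
        {
            std::uint64_t carry = 0;
            for (std::size_t j = 0; j < b.size(); ++j)
            {
                // 64x64 -> 128-bit partial product, plus the limb already
                // accumulated at this position, plus the running carry.
                unsigned __int128 t = static_cast<unsigned __int128>(a[i]) * b[j]
                                      + result[i + j] + carry;
                result[i + j] = static_cast<std::uint64_t>(t);
                carry = static_cast<std::uint64_t>(t >> 64);
            }
            result[i + b.size()] += carry;
        }
        return result;
    }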
Consider a 256-bit int implemented using 64-bit built-in ints via intermediate 128-bit ints ... All of the second half of that is redundant once you know that the least significant 64 bits incremented to a non-zero value ... if the implementation were just an array of 4 (u)int64_ts, you would iterate with a for loop and break when non-zero.
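For concreteness, that flat-array increment would look something like the sketch below, assuming a least-significant-limb-first layout and a four-limb 256-bit value (neither of which is taken from the library):

    #include <array>
    #include <cstdint>

    // Increment a 256-bit value held as four 64-bit limbs, least significant
    // limb first (an assumed layout).  The loop stops as soon as a limb
    // increments to a non-zero value, i.e. as soon as there is no more carry.
    void increment(std::array<std::uint64_t, 4>& limbs)
    {
        for (auto& limb : limbs)
        {
            if (++limb != 0)
                break;          // no carry out of this limb, we're done
        }
    }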
This is another thing that would have to be tested. Unfortunately it's nigh-on impossible to know whether a version with more or less branching ('if' or 'for') would be faster without trying it - and even then it would need trying on multiple target architectures.
I fear Phil is asking for the impossible here - you can have something really simple which is fast in the trivial case (one step larger than long long) but rapidly slows down for larger cases, or you can do what the Multiprecision lib does and add all the special-case handling, which can give order-of-magnitude improvements for 1024 bits and up, but slows the trivial case down by ~50% due to the extra branching. Sometimes you just have to make a choice.
* Object sizes - you seem to be adding a pointer to all integers (presumably as a link between the front and back ends).
No, there are no such pointers anywhere. As mentioned above, there is a packed 2's complement fixed int backend in the sandbox; it just wasn't part of the review (and yes, its size is exactly N bits - no pointers or other stray members).
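That kind of guarantee can be stated directly in code. The sketch below checks it against a hypothetical packed_int256 stand-in, not the sandbox backend itself:

    #include <array>
    #include <climits>
    #include <cstdint>
    #include <type_traits>

    // Hypothetical packed 256-bit type: nothing but the limbs.
    struct packed_int256
    {
        std::array<std::uint64_t, 4> limbs;
    };

    // The properties claimed above: exactly N bits wide, and a plain bag of
    // bits with no pointers or other stray members.
    static_assert(sizeof(packed_int256) * CHAR_BIT == 256,
                  "size is exactly N bits");
    static_assert(std::is_trivially_copyable<packed_int256>::value,
                  "no pointers or other non-trivial members");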
* This also leads onto forward compatibility and interoperability. If we were
I don't believe so - remember that "larger than single register" types are synthesized by the compiler, so the limb order they use is completely dependent upon the whim of the compiler writer. Thinking you can be forwards compatible with types that haven't been written yet is a fool's errand, even if one would hope that compiler writers will all do something sensible ;-)
John.
PS: I'll investigate the assertion you spotted later, when I can get 64-bit Linux back up...