
On 3/27/11 00:21 Thomas Klimpel wrote:
> The goal of this message is to reflect on the two's complement representation of negative integers, from a relatively abstract algebraic perspective.
I suspect that practical arguments may resonate more deeply with this list than theoretical ones. From a theoretical point of view it's trivial to convert operations on a sign bit + N-1 magnitude bits representation to/from operations on the N-bit 2's complement representation, and in fact there is one value representable in 2's complement that is not representable in a sign/magnitude representation of the same length (sign/magnitude spends a second bit pattern on negative zero, while 2's complement uses it for the most negative value), which seems like a (very minor) theoretical win for 2's complement.

If the operands are the same size (even if that spans multiple machine words), I think 2's complement is clearly preferable, to take advantage of the hardware instruction set. With the exception of the overflow flag, addition, subtraction, and the low-order result of a multiplication yield the same bitstring whether the operands (and result) are taken to be 2's complement or unsigned, which means that you can perform arithmetic on low-order word-size chunks of a large integer without caring whether the highest-order chunk is positive or negative (see the first sketch below). Might as well take advantage of the instruction-level building blocks for 2's complement representation (with appropriate overflow, carry-in, etc. options) in this case.

If you might be adding or subtracting large integers of different lengths, the sign/magnitude representation is arguably simpler than sign-extending the smaller operand (and potentially propagating carry-outs up to the highest-order chunk of the result). This then lets you choose to represent negative numbers in the smallest possible number of words without worrying about an arithmetic performance tradeoff.

On the other hand, multiword sign/magnitude arithmetic requires two code paths for every operation, with an initial test-and-branch to determine whether all your ADDs should be replaced by SUBs (see the second sketch below). And given that 2's complement seems preferable for fixed-width arithmetic, I think I'd want a library supporting both fixed- and variable-width representations to adopt the same convention for both.
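To make the "same bitstring" point concrete, here is a minimal sketch (my own illustration, not from any particular bignum library) that adds two 64-bit values as two 32-bit limbs using only unsigned limb operations; the signs of the operands never enter into it:

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch: add a = -5 and b = 3 as two 32-bit limbs each. The limb
     * additions are the same unsigned instructions whether the overall
     * values are signed or not; only the interpretation of the top limb
     * differs. */
    int main(void) {
        int64_t a = -5, b = 3;

        /* Split into low/high 32-bit limbs, treated as raw unsigned bits. */
        uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)((uint64_t)a >> 32);
        uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)((uint64_t)b >> 32);

        /* Low limbs: plain unsigned add; carry out iff the sum wrapped. */
        uint32_t r_lo  = a_lo + b_lo;
        uint32_t carry = (r_lo < a_lo);

        /* High limbs: the same unsigned add plus the carry. No branch on
         * whether a or b is negative is ever needed. */
        uint32_t r_hi = a_hi + b_hi + carry;

        /* Reassemble and reinterpret as signed. (The uint64-to-int64
         * conversion is implementation-defined for out-of-range values,
         * but does the expected thing on 2's complement hardware.) */
        int64_t r = (int64_t)(((uint64_t)r_hi << 32) | r_lo);
        printf("%lld\n", (long long)r); /* prints -2 */
        return 0;
    }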
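By contrast, here is a hedged sketch of the two code paths sign/magnitude forces on every operation. The sm32 type and sm_add function are hypothetical; a single-limb magnitude stands in for an array of limbs, and negative-zero handling is omitted:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical sign/magnitude type: sign is +1 or -1. */
    typedef struct { int sign; uint32_t mag; } sm32;

    static sm32 sm_add(sm32 a, sm32 b) {
        sm32 r;
        if (a.sign == b.sign) {
            /* Same sign: magnitudes add, sign carries through. */
            r.sign = a.sign;
            r.mag  = a.mag + b.mag;  /* multiword: propagate carries */
        } else if (a.mag >= b.mag) {
            /* Opposite signs: the ADDs become SUBs, and we must first
             * compare magnitudes to know which operand to subtract from. */
            r.sign = a.sign;
            r.mag  = a.mag - b.mag;  /* multiword: propagate borrows */
        } else {
            r.sign = b.sign;
            r.mag  = b.mag - a.mag;
        }
        return r;
    }

    int main(void) {
        sm32 a = { -1, 5 }, b = { +1, 3 };               /* -5 + 3 */
        sm32 r = sm_add(a, b);
        printf("%c%u\n", r.sign < 0 ? '-' : '+', r.mag); /* prints -2 */
        return 0;
    }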
"What I consider especially strange from an algebraic point of view, is to automatically promote a signed integer type to its corresponding unsigned integer type, when >an arithmetic expression contains both signed and unsigned integer types. Shouldn't it be the other way round? Or wouldn't it be even better, if there would be no >automatic conversion at all? At least from an algebraic point of view, any natural number is also an integer, not vice versa like in C."
To be precise, this happens only when the rank of the unsigned type is greater than or equal to that of the signed type. In the case where both operands have non-negative values, this minimizes the risk of overflow during addition or multiplication. When performing an operation on an unsigned value and a negative signed value (or subtracting an unsigned value from a smaller signed positive value), it is incumbent upon the programmer to have some notion of what he or she is doing. I believe the rationale may be that signed values are the default, so if you went out of your way to declare something as unsigned then you're accepting some responsibility for using it appropriately. Note also that (at least in ANSI C) when performing arithmetic between an unsigned value and a signed value of higher rank that can represent all values of the unsigned type, the promotion is to the signed type; so the "promote to the 'larger' type" principle is consistently applied across the board, particularly if you consider unsigned types "larger" than their signed equivalents (which seems reasonable to me given how uncommon negative values are in most integer code).
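A small example of those conversion rules in action. The exact output depends on type widths; the comments assume a 32-bit int and a 64-bit long, as on typical LP64 systems (on ILP32 or LLP64 systems, where long is also 32 bits, the second comparison converts both operands to unsigned long and prints 0 instead):

    #include <stdio.h>

    int main(void) {
        int i = -1;
        unsigned int u = 1;
        long l = -1;

        /* Same rank, mixed signedness: the signed operand converts to
         * unsigned, so -1 becomes UINT_MAX and the comparison is false. */
        printf("%d\n", i < u); /* prints 0 */

        /* Signed type of higher rank that can hold every unsigned int
         * value: the unsigned operand converts to the signed type,
         * and -1 < 1 holds. */
        printf("%d\n", l < u); /* prints 1 on LP64 */
        return 0;
    }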