
On 3/27/11 00:21 Thomas Klimpel wrote:
> The goal of this message is to reflect on the two's complement representation of negative integers, from a relatively abstract algebraic perspective.
I suspect that practical arguments may resonate more deeply with this list than theoretical ones. From a theoretical point of view it's trivial to convert operations on a sign bit + N-1 magnitude bits representation to/from operations on the N-bit 2's complement representation, and in fact there is one value representable in 2's complement that is not representable in a sign/magnitude representation of the same length (sign/magnitude spends a second bit pattern on negative zero, while 2's complement uses it for the most negative value), which seems like a (very minor) theoretical win for 2's complement.

If the operands are the same size (even if that spans multiple machine words), I think 2's complement is clearly preferable, to take advantage of the hardware instruction set. With the exception of the overflow flag, addition, subtraction, and the low-order result of a multiplication yield the same bitstring whether the operands (and result) are taken to be 2's complement or unsigned, which means that you can perform arithmetic on low-order word-size chunks of a large integer without caring whether the highest-order chunk is positive or negative (see the first sketch below). Might as well take advantage of the instruction-level building blocks for 2's complement representation (with appropriate overflow, carry-in, etc. options) in this case.

If you might be adding or subtracting large integers of different lengths, the sign/magnitude representation is arguably simpler than sign-extending the smaller operand (and potentially propagating carry-outs up to the highest-order chunk of the result). This then lets you choose to represent negative numbers in the smallest possible number of words without worrying about an arithmetic performance tradeoff.

On the other hand, multiword sign/magnitude arithmetic requires two code paths for every operation, with an initial test-and-branch to determine whether all your ADDs should be replaced by SUBs (see the second sketch below). And given that 2's complement seems preferable for fixed-width arithmetic, I think I'd want a library supporting both fixed- and variable-width representations to adopt the same convention for both.
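To make the "same bitstring" point concrete, here is a minimal sketch (my own illustration, not from any particular bignum library) that adds two 64-bit values as two 32-bit limbs using only unsigned limb operations; the signs of the operands never enter into it:

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch: add a = -5 and b = 3 as two 32-bit limbs each. The limb
     * additions are the same unsigned instructions whether the overall
     * values are signed or not; only the interpretation of the top limb
     * differs. */
    int main(void) {
        int64_t a = -5, b = 3;

        /* Split into low/high 32-bit limbs, treated as raw unsigned bits. */
        uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)((uint64_t)a >> 32);
        uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)((uint64_t)b >> 32);

        /* Low limbs: plain unsigned add; carry out iff the sum wrapped. */
        uint32_t r_lo  = a_lo + b_lo;
        uint32_t carry = (r_lo < a_lo);

        /* High limbs: the same unsigned add plus the carry. No branch on
         * whether a or b is negative is ever needed. */
        uint32_t r_hi = a_hi + b_hi + carry;

        /* Reassemble and reinterpret as signed. (The uint64-to-int64
         * conversion is implementation-defined for out-of-range values,
         * but does the expected thing on 2's complement hardware.) */
        int64_t r = (int64_t)(((uint64_t)r_hi << 32) | r_lo);
        printf("%lld\n", (long long)r); /* prints -2 */
        return 0;
    }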
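By contrast, here is a hedged sketch of the two code paths sign/magnitude forces on every operation. The sm32 type and sm_add function are hypothetical; a single-limb magnitude stands in for an array of limbs, and negative-zero handling is omitted:

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical sign/magnitude type: sign is +1 or -1. */
    typedef struct { int sign; uint32_t mag; } sm32;

    static sm32 sm_add(sm32 a, sm32 b) {
        sm32 r;
        if (a.sign == b.sign) {
            /* Same sign: magnitudes add, sign carries through. */
            r.sign = a.sign;
            r.mag  = a.mag + b.mag;  /* multiword: propagate carries */
        } else if (a.mag >= b.mag) {
            /* Opposite signs: the ADDs become SUBs, and we must first
             * compare magnitudes to know which operand to subtract from. */
            r.sign = a.sign;
            r.mag  = a.mag - b.mag;  /* multiword: propagate borrows */
        } else {
            r.sign = b.sign;
            r.mag  = b.mag - a.mag;
        }
        return r;
    }

    int main(void) {
        sm32 a = { -1, 5 }, b = { +1, 3 };               /* -5 + 3 */
        sm32 r = sm_add(a, b);
        printf("%c%u\n", r.sign < 0 ? '-' : '+', r.mag); /* prints -2 */
        return 0;
    }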
"What I consider especially strange from an algebraic point of view, is to automatically promote a signed integer type to its corresponding unsigned integer type, when >an arithmetic expression contains both signed and unsigned integer types. Shouldn't it be the other way round? Or wouldn't it be even better, if there would be no >automatic conversion at all? At least from an algebraic point of view, any natural number is also an integer, not vice versa like in C."
To be precise, this happens only when the rank of the unsigned type is greater than or equal to that of the signed type. In the case where both operands have non-negative values, this minimizes the risk of overflow during addition or multiplication. When performing an operation on an unsigned value and a negative signed value (or subtracting an unsigned value from a smaller signed positive value), it is incumbent upon the programmer to have some notion of what he or she is doing. I believe the rationale may be that signed values are the default, so if you went out of your way to declare something as unsigned then you're accepting some responsibility for using it appropriately. Note also that (at least in ANSI C) when performing arithmetic between an unsigned value and a signed value of higher rank that can represent all values of the unsigned type, the promotion is to the signed type; so the "promote to the 'larger' type" principle is consistently applied across the board, particularly if you consider unsigned types "larger" than their signed equivalents (which seems reasonable to me given how uncommon negative values are in most integer code).
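A small example of those conversion rules in action. The exact output depends on type widths; the comments assume a 32-bit int and a 64-bit long, as on typical LP64 systems (on ILP32 or LLP64 systems, where long is also 32 bits, the second comparison converts both operands to unsigned long and prints 0 instead):

    #include <stdio.h>

    int main(void) {
        int i = -1;
        unsigned int u = 1;
        long l = -1;

        /* Same rank, mixed signedness: the signed operand converts to
         * unsigned, so -1 becomes UINT_MAX and the comparison is false. */
        printf("%d\n", i < u); /* prints 0 */

        /* Signed type of higher rank that can hold every unsigned int
         * value: the unsigned operand converts to the signed type,
         * and -1 < 1 holds. */
        printf("%d\n", l < u); /* prints 1 on LP64 */
        return 0;
    }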