Re: [boost] [endian] Refresh based on comments received

8 Jun 2006


      On Thu, 8 Jun 2006 13:29:02 -0400, Kim Barrett <kab@irobot.com> wrote:
...
At 12:51 PM -0400 6/8/06, Beman Dawes wrote:
...
Gennaro Prota wrote:
...
* there is no guarantee that an unsigned char has 8 bits...
The C and C++ standards specify char, signed char, and unsigned char all
have exactly 8 bits, AFAIK.
CHAR_BIT is defined to be *at least* 8. No guarantee that it is
*exactly* 8.
Right.
...
This is not just a historical artifact to support
strange ancient processors with odd addressing unit sizes either. There
are modern C/C++ implementations for modern DSP processors where, for
example, sizeof(char) == sizeof(int) == 1, and CHAR_BIT is 16 or 32
(or perhaps even 64, though I haven't actually run across that last
case myself).
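
A minimal sketch of how code that depends on 8-bit chars could state that assumption explicitly instead of relying on it silently. It assumes C++11 or later; the helper names octets_per_char and char_is_octet are hypothetical and only serve the illustration.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Guaranteed by the standard: a char has at least 8 bits.
static_assert(CHAR_BIT >= 8, "a conforming implementation has CHAR_BIT >= 8");

// sizeof counts chars, not octets, so this holds even when CHAR_BIT is 16 or 32.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

// Hypothetical helper: how many 8-bit octets are needed to hold one char.
constexpr std::size_t octets_per_char() {
    return (CHAR_BIT + 7) / 8;
}

// Hypothetical helper: true when unsigned char can be treated as one octet.
constexpr bool char_is_octet() {
    return CHAR_BIT == 8;
}
```

Code that really needs octets (a byte-order conversion routine, say) could then add static_assert(char_is_octet(), "...") so that a port to a DSP with wider chars fails at compile time rather than at run time.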
Historically there have been implementations for machines with a
36-bit word size, where CHAR_BIT == 9 was chosen. This way, four chars
were packed into one machine word, and a pointer to char actually
consisted of a pointer to a machine word plus an offset (0, 1, 2, 3).
Of course a pointer to int just consisted of a machine pointer. This
is basically the reason why the standard allows

  sizeof(char *) > sizeof(int *)
...
Of course, the vast majority of even purportedly portable code ignores
this fact, because it can be a real PITA to deal with, usually for little
or no benefit.
It actually depends on the context. In some cases it is difficult; in
others it's just a matter of not hardcoding a constant.
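
As an illustration of the "don't hardcode a constant" case, here is a small sketch that derives widths and masks from CHAR_BIT instead of writing 8 or 0x80. It assumes C++11 or later; object_bits and top_bit_of_uchar are hypothetical names, not part of any library.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Hypothetical helper: width in bits of the object representation of T.
template <typename T>
constexpr std::size_t object_bits() {
    return sizeof(T) * CHAR_BIT;   // rather than sizeof(T) * 8
}

// Hypothetical helper: mask selecting the most significant bit of an unsigned char.
constexpr unsigned char top_bit_of_uchar() {
    return static_cast<unsigned char>(1u << (CHAR_BIT - 1));   // rather than 0x80
}
```

On an ordinary 8-bit-char platform both helpers yield the familiar values; on a platform with wider chars they keep working without any source change.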
FWIW, dynamic_bitset<>::count() also works on platforms where
CHAR_BIT > 8, by selecting a different implementation at compile time.
Incidentally, it also takes into account the possibility of padding
bits in the representation of integer types; do you know of any
implementation that has these?
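
To make the idea concrete, here is a rough sketch of compile-time selection between two bit-counting strategies. It is not the actual dynamic_bitset code; the names has_padding_bits and count_bits are hypothetical, and C++11 or later is assumed. The byte-wise path reads the object representation, so it is only chosen when CHAR_BIT == 8 and the type has no padding bits; otherwise a generic path that only touches value bits is used.

```cpp
#include <climits>      // CHAR_BIT
#include <cstddef>      // std::size_t
#include <cstring>      // std::memcpy
#include <limits>       // std::numeric_limits
#include <type_traits>  // std::integral_constant, std::is_unsigned

// True if unsigned type T has fewer value bits than storage bits.
template <typename T>
constexpr bool has_padding_bits() {
    static_assert(std::is_unsigned<T>::value, "expects an unsigned type");
    return std::numeric_limits<T>::digits < static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Generic path: clear the lowest set bit until none remain.
// Works on values only, so CHAR_BIT and padding bits do not matter.
template <typename T>
std::size_t count_bits(const T& value, std::false_type /*bytewise_ok*/) {
    T v = value;
    std::size_t n = 0;
    while (v != 0) {
        v = static_cast<T>(v & (v - 1));
        ++n;
    }
    return n;
}

// Byte-wise path: look up each 8-bit byte of the object representation
// as two nibbles. Only valid when CHAR_BIT == 8 and T has no padding bits.
template <typename T>
std::size_t count_bits(const T& value, std::true_type /*bytewise_ok*/) {
    static const unsigned char nibble[16] =
        {0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4};
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    std::size_t n = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        n += nibble[bytes[i] & 0x0F] + nibble[bytes[i] >> 4];
    }
    return n;
}

// Dispatcher: the strategy is picked at compile time via tag dispatch.
template <typename T>
std::size_t count_bits(const T& value) {
    static_assert(std::is_unsigned<T>::value, "expects an unsigned type");
    typedef std::integral_constant<
        bool, CHAR_BIT == 8 && !has_padding_bits<T>()> bytewise_ok;
    return count_bits(value, bytewise_ok());
}
```

Because the tag is a type, only the selected overload is instantiated for a given T, so the byte-wise table lookup never runs on a platform where a byte could hold a value above 255.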
--Gennaro.