
On Thu, 8 Jun 2006 13:29:02 -0400, Kim Barrett <kab@irobot.com> wrote:
At 12:51 PM -0400 6/8/06, Beman Dawes wrote:
Gennaro Prota wrote:
* there is no guarantee that an unsigned char has 8 bits...
The C and C++ standards specify char, signed char, and unsigned char all have exactly 8 bits, AFAIK.
CHAR_BIT is defined to be *at least* 8. No guarantee that it is *exactly* 8.
Right.
This is not just a historical artifact to support strange ancient processors with odd addressing unit sizes either. There are modern C/C++ implementations for modern DSP processors where, for example, sizeof(char) == sizeof(int) == 1, and CHAR_BIT is 16 or 32 (or perhaps even 64, though I haven't actually run across that last case myself).
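To make the guarantees concrete, here is a minimal C++ sketch of what can and cannot be relied on. The helper name storage_bits is made up for this illustration; only CHAR_BIT and sizeof come from the language and <climits>.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Guaranteed by the standard: a byte has at least 8 bits.
static_assert(CHAR_BIT >= 8, "CHAR_BIT is at least 8 on a conforming implementation");

// Guaranteed by definition: sizeof measures in units of char.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

// Hypothetical helper: number of bits of storage occupied by a T,
// computed from CHAR_BIT instead of a hardcoded 8.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}
```

On a DSP of the kind described above, storage_bits<char>() and storage_bits<int>() could both come out as 16 or 32; on a typical desktop platform storage_bits<char>() is 8.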
Historically there have been implementations for machines with a 36-bit word size, where CHAR_BIT == 9 was chosen. This way, four chars were packed into one machine word, and a pointer to char actually consisted of a pointer to a machine word plus an offset (0, 1, 2, 3). Of course a pointer to int just consisted of a machine pointer. This is basically the reason why the standard allows sizeof(char *) > sizeof(int *).
Of course, the vast majority of even purportedly portable code ignores this fact, because it can be a real PITA to deal with, usually for little or no benefit.
It actually depends on the context. In some cases it is difficult; in others it's just a matter of not hardcoding a constant. FWIW, dynamic_bitset<>::count() also works on platforms where CHAR_BIT > 8, by selecting a different implementation at compile time. Incidentally, it also takes into account the possibility of padding bits in the representation of integer types; do you know of any implementation that has these?
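As a rough illustration of the "don't hardcode a constant" point (this is a sketch, not the actual dynamic_bitset code), the following counts set bits without assuming any width and reports padding bits by comparing storage bits with value bits. The names count_set_bits and padding_bits are hypothetical.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t
#include <limits>    // std::numeric_limits

// Hypothetical helper: counts set bits in an unsigned value.
// Each iteration clears the lowest set bit, so nothing here
// depends on CHAR_BIT being 8 or on the width of the type.
template <typename Unsigned>
std::size_t count_set_bits(Unsigned v) {
    static_assert(std::numeric_limits<Unsigned>::is_integer &&
                  !std::numeric_limits<Unsigned>::is_signed,
                  "an unsigned integer type is required");
    std::size_t n = 0;
    while (v != 0) {
        v = static_cast<Unsigned>(v & (v - 1));  // clear lowest set bit
        ++n;
    }
    return n;
}

// Hypothetical helper: storage bits minus value bits.
// numeric_limits<T>::digits is the number of value bits for an
// unsigned type, so a nonzero result indicates padding bits.
template <typename Unsigned>
constexpr std::size_t padding_bits() {
    return sizeof(Unsigned) * CHAR_BIT -
           static_cast<std::size_t>(std::numeric_limits<Unsigned>::digits);
}
```

unsigned char is not allowed to have padding bits, so padding_bits<unsigned char>() is 0 everywhere; for wider types the result is 0 on mainstream platforms, but the standard does not require that.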
--Gennaro.