
22 Oct 2004, 12:52 a.m.
Yes, I do realise... the original statement was "...everybody agreed characters outside 16 bits are very rare, UTF-32 seems to never be needed." UTF-16 can indeed represent every Unicode character (using surrogate pairs for code points above U+FFFF), but that is not what was written. It must have been obvious to the poster of the original statement.
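To make the UTF-16 point concrete, here is a small sketch in Python (my choice of language, not something from the thread). It takes one character outside the 16-bit range, U+1D11E (picked only as an example), and shows that UTF-16 encodes it as two 16-bit code units, while UTF-32 uses a single 32-bit unit.

```python
# U+1D11E MUSICAL SYMBOL G CLEF lies outside the 16-bit range (the BMP).
# The character is an arbitrary example chosen for illustration.
ch = "\U0001D11E"

# Encode as big-endian UTF-16 (no BOM) and split into 16-bit code units.
utf16 = ch.encode("utf-16-be")
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]

# Encode as big-endian UTF-32 (no BOM): always one 32-bit unit per code point.
utf32 = ch.encode("utf-32-be")

print([hex(u) for u in units])  # the surrogate pair
print(len(utf16) // 2, "UTF-16 code units;", len(utf32) // 4, "UTF-32 code unit")
```

So the character is representable in UTF-16, it just no longer fits in a single 16-bit unit.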
One point that hasn't been mentioned so far is that word sizes on most modern CPUs are 32 bits wide. From a performance POV, word alignment may be a suitable justification for the increased storage requirements of a 32-bit unit. Of course, from a performance POV you surely don't want to use twice (compared with UTF-16) or four times (compared with UTF-8 for mostly ASCII text) as much memory in the cache either. Performance is always a tradeoff.
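For a rough feel of the storage side of that tradeoff, the sketch below (Python again, with a hypothetical helper name `encoded_sizes` and a sample string of my own) counts the bytes needed for the same ASCII text in each encoding. It says nothing about alignment or cache behaviour, only about raw size.

```python
def encoded_sizes(text):
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

# 12 ASCII characters: 1 byte each in UTF-8, 2 in UTF-16, 4 in UTF-32.
sizes = encoded_sizes("Hello, world")
for enc, n in sizes.items():
    print(f"{enc}: {n} bytes")
```

For text that is mostly non-ASCII the UTF-8 figure grows, so the "four times" case is the worst case for UTF-32, not the general one.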
Cheers, Michael