
19 Jan 2011
7:20 p.m.
On Wed, Jan 19, 2011 at 18:33, Peter Dimov <pdimov@pdimov.com> wrote:
This was the prevailing thinking once. First this number of bits was 16 (an incorrect assumption that claimed Microsoft and Java as victims), then it became 21 (or 22?). Eventually, people realized that this will never happen even if we allocate 32 bits per character, so here we are.
This is one more advantage of UTF-8 over UTF-16 and UTF-32. UTF-8 bit patterns can be extended indefinitely, even to 256-bit code points. :-)
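To illustrate the point about extensibility: the original (pre-RFC 3629) UTF-8 scheme already used lead bytes with up to six leading ones, covering 31-bit code points in up to 6 bytes, before RFC 3629 capped the format at 4 bytes / U+10FFFF. A hypothetical sketch of that 31-bit encoder, written here for illustration (the function name is mine, not from any library):

```cpp
#include <cstdint>
#include <string>

// Encode a 31-bit "code point" using the original UTF-8 lead-byte
// pattern: a lead byte with n leading ones (for an n-byte sequence),
// followed by continuation bytes of the form 10xxxxxx. The same
// pattern could, in principle, keep extending to wider code points.
std::string encode_utf8_31bit(uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                  // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
        return out;
    }
    int n;                            // total bytes in the sequence
    if      (cp < 0x800)     n = 2;   // 110xxxxx + 1 continuation
    else if (cp < 0x10000)   n = 3;   // 1110xxxx + 2 continuations
    else if (cp < 0x200000)  n = 4;   // 11110xxx + 3 continuations
    else if (cp < 0x4000000) n = 5;   // 111110xx + 4 continuations
    else                     n = 6;   // 1111110x + 5 continuations
    // Lead byte: n leading ones, a zero, then the highest payload bits.
    out += static_cast<char>((0xFFu << (8 - n)) | (cp >> (6 * (n - 1))));
    // Continuation bytes: 10xxxxxx, six payload bits each.
    for (int i = n - 2; i >= 0; --i)
        out += static_cast<char>(0x80u | ((cp >> (6 * i)) & 0x3Fu));
    return out;
}
```

For example, U+20AC (the euro sign) comes out as the familiar three bytes E2 82 AC, while 0x7FFFFFFF fills all six bytes of the old scheme.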