Re: [boost] [rfc] Unicode GSoC project

15 May 2009


Scott McMurray wrote:
...
I really think UTF-8 should be the recommended one, since it forces
people to remember that it's no longer one unit, one "character".
Even in Beman Dawes's talk
(http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009/05/07/...)
where slide 11 mentions UTF-32 and remembers that UTF-16 can still
take 2 encoding units per codepoint, slide 13 says that UTF-16 is
"desired" where "random access critical".

I don't plan on supporting random access for UTF-16.
UTF-16 is still faster to decode than UTF-8, because UTF-8 decoding is
more complex.
UTF-16 has only two cases (a single code unit or a surrogate pair), which
makes it easy to optimize the branches as the likely and the unlikely case.
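
To illustrate the two-case point, here is a minimal sketch of a UTF-16 decoding step. It is not code from the proposed library: decode_utf16 and SKETCH_UNLIKELY are hypothetical names, the input is assumed to be well-formed UTF-16 (no validation), and the branch hint relies on the GCC/Clang __builtin_expect extension, falling back to a plain condition elsewhere.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical branch-hint macro; __builtin_expect is a GCC/Clang extension.
#if defined(__GNUC__)
#  define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define SKETCH_UNLIKELY(x) (x)
#endif

// Decodes one code point from well-formed UTF-16 starting at s[i] and
// advances i past the code units consumed. No validation is performed.
inline std::uint32_t decode_utf16(const char16_t* s, std::size_t& i)
{
    std::uint32_t unit = s[i++];
    // Unlikely case: a high surrogate (0xD800..0xDBFF) starts a two-unit pair.
    if (SKETCH_UNLIKELY(unit >= 0xD800 && unit <= 0xDBFF)) {
        std::uint32_t low = s[i++];
        return 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
    }
    // Likely case: a single code unit in the Basic Multilingual Plane.
    return unit;
}
```

A UTF-8 decoder would instead have to distinguish sequences of one to four code units, so there is no single "rare" branch to hint. Whether the hint pays off in practice depends on the compiler and on the text being processed.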

Mathias Gaunard