
25 Apr 2011, 6:55 p.m.
From: Ryou Ezoe <boostcpp@gmail.com>
Sorting by code point is not the best solution, but at least it's consistent if we use one encoding.
No, it is not. UCS text sorts differently depending on the representation.

UTF-8 and UTF-32 order is consistent, i.e. for each pair of code points a, b:
utf8(a) < utf8(b) iff utf32(a) < utf32(b)

However, this does not hold for UTF-16, where code points outside the BMP are encoded as surrogate pairs and therefore order differently. That is, it may be that codepoint(a) > codepoint(b) but UTF-16(a) sorts before UTF-16(b).

Artyom
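To make the point concrete, here is a small sketch in Python (not from the original thread; the helper name `utf16_units` and the two sample code points are my own choices). It sorts the same two characters by code point, by UTF-8 bytes, by UTF-32 code units, and by UTF-16 code units:

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers (hypothetical helper)."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

a = "\U00010000"  # outside the BMP, UTF-16: D800 DC00 (surrogate pair)
b = "\uFFFD"      # inside the BMP,  UTF-16: FFFD

# Python compares str values by code point.
by_code_point = sorted([a, b])
# Byte-wise comparison of the UTF-8 encoding.
by_utf8 = sorted([a, b], key=lambda s: s.encode("utf-8"))
# Big-endian UTF-32 bytes compare the same way as 32-bit code units.
by_utf32 = sorted([a, b], key=lambda s: s.encode("utf-32-be"))
# Comparison of 16-bit code units.
by_utf16 = sorted([a, b], key=utf16_units)

print(by_code_point == by_utf8 == by_utf32)  # UTF-8 and UTF-32 agree with code point order
print(by_utf16 == by_code_point)             # UTF-16 does not
```

Here codepoint(a) > codepoint(b), yet the leading surrogate 0xD800 is smaller than 0xFFFD, so a sorts first when comparing UTF-16 code units.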