
On 01/19/2011 11:54 AM, Chad Nelson wrote:
On Wed, 19 Jan 2011 09:58:13 -0500 Edward Diener <eldiener@tropicsoft.com> wrote:
I am a believer ;) and when people realize that UTF-8 is the way to go, the pesky problems will vanish. Believe me today with ANSI

I do not believe that UTF-8 is the way to go. In fact I know it is not, except perhaps for the very near future for some programmers (Linux advocates).
Inevitably a Unicode standard will be adopted where every character of every language will be represented by a single fixed-length number of bits. [...]

I'm no Unicode expert, but the reason this hasn't happened might be combinatorial explosion, in which case it might never happen. But I could well be wrong, and I hope I am: the design you outline is something I'd love to see.

It's already here and has been for a long time. That's just UCS encoded as UTF-32. UCS isn't a new thing. Work on the standard started in the late 80s, and the standard was first copyrighted in 1991. They've come a long way. All the common languages and many of the uncommon languages are supported. Many dead languages are already supported. Languages with support added in 5.1 and 5.2 include Cham, Kayah Li, Lepcha, Ol Chiki, Rejang, Saurashtra, Sundanese, Vai, Bamum, Javanese, Lisu, Meetei Mayek, Samaritan, Tai Tham, and Tai Viet.
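
To make the fixed-width point concrete, here is a rough Python sketch comparing how many bytes one code point takes in UTF-8 versus UTF-32. The helper name `encoded_sizes` and the sample code points are picked purely for illustration and are not from any library discussed in this thread.

```python
def encoded_sizes(code_point):
    """Return (utf8_bytes, utf32_bytes) for a single Unicode code point."""
    ch = chr(code_point)
    # The "-be" variant is used so that no byte-order mark is prepended.
    return len(ch.encode("utf-8")), len(ch.encode("utf-32-be"))


# Illustrative code points: ASCII, Latin-1, the Vai block, the Cuneiform block.
for cp in (0x0041, 0x00E9, 0xA500, 0x12000):
    utf8_len, utf32_len = encoded_sizes(cp)
    print(f"U+{cp:04X}: UTF-8 = {utf8_len} bytes, UTF-32 = {utf32_len} bytes")

# Caveat: "fixed width" applies to code points, not to what a reader sees as
# one character. "e" followed by COMBINING ACUTE ACCENT is two code points.
combined = "e\u0301"
print(len(combined), len(combined.encode("utf-32-be")))
```

UTF-32 comes out at 4 bytes for every code point, while UTF-8 ranges from 1 to 4. The last two lines show where the combinatorial worry may still apply: a base letter plus a combining mark is two code points, so even UTF-32 is not one unit per user-perceived character.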
Patrick