Re: [boost] Call for interest for native unicode character

Message: 2
Date: Sat, 23 Jul 2005 19:32:03 -0400
From: Ben Artin <macdev@artins.org>
Subject: Re: [boost] Call for interest for native unicode character and string support in boost
To: boost@lists.boost.org
Message-ID: <macdev-DD69FE.19320323072005@sea.gmane.org>

Dear Ben,

Thanks for the reference. I have already taken a look at the previous posts.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1040.pdf does mention the primitive types. I see that vector is the preferred container. However, I don't see any handling of graphemes, words, lines, and sentences, which are integral parts of the Unicode 4 specification.

Nor do I see any proposal for how to integrate the actual Unicode data, which is vital to being able to implement Unicode iterators, sorting, upper/lower case conversion, and the rest. I have not seen anything on providing the necessary support for character information, e.g. to convert logical to display order. This data, even when optimised, will still be several megabytes, but a Unicode implementation cannot fly without it.

Nor do I see how you can ensure that if you base Unicode strings on vector that you can ensure that the contents is a Unicode string [e.g. adding the first member of a surrogate only and not adding the second member].

Has the conversation moved on? Where is this currently being discussed?

Thanks.

Yours,
Graham Barnett
BEng, MCSD/MCAD .Net, MCSE/MCSA 2003, CompTIA Sec+

In article <086E419469537C439E250E428F0DF0930162A3@host.Sysdev.local>, "Graham" <Graham@system-development.co.uk> wrote:

> Nor do I see how you can ensure that if you base Unicode strings on vector that you can ensure that the contents is a Unicode string [e.g. adding the first member of a surrogate only and not adding the second member].

That depends on what you mean by "base on vector". To me, basing Unicode strings on vectors means that vectors are an implementation detail and that the client of the Unicode string abstraction is not allowed to make arbitrary changes to the vector. (In other words, the Unicode string abstraction maintains all the invariants pertaining to Unicode.)

Ben

-- 
I changed my name: <http://periodic-kingdom.org/People/NameChange.php>
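
The encapsulation described in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed or existing Boost interface: the class name `unicode_string` and its members `push_back`, `code_unit_count`, and `code_units` are hypothetical, and UTF-16 storage is assumed. The vector is private, and the only way to add text is by whole code point, so a client cannot append the first half of a surrogate pair without the second.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical sketch: a Unicode string that keeps its vector private
// so that the stored UTF-16 sequence is always well formed.
class unicode_string {
public:
    // Append one Unicode scalar value. Code points above U+FFFF are
    // written as a complete surrogate pair; lone surrogates and values
    // beyond U+10FFFF are rejected, leaving the string unchanged.
    void push_back(char32_t cp) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw std::invalid_argument("not a Unicode scalar value");
        }
        if (cp < 0x10000) {
            units_.push_back(static_cast<char16_t>(cp));
        } else {
            const char32_t v = cp - 0x10000;
            // High (lead) surrogate carries the top 10 bits,
            // low (trail) surrogate carries the bottom 10 bits.
            units_.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            units_.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        }
    }

    // Number of UTF-16 code units stored (not code points or graphemes).
    std::size_t code_unit_count() const { return units_.size(); }

    // Read-only view of the storage; clients cannot mutate it.
    const std::vector<char16_t>& code_units() const { return units_; }

private:
    std::vector<char16_t> units_;  // implementation detail
};
```

With this shape, appending U+1F600 stores two code units, while an attempt to append U+D800 on its own throws instead of corrupting the string. Graphemes, collation, and the character database raised earlier in the thread are outside the scope of this sketch.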
participants (2)
- Ben Artin
- Graham