[boost] Re: [Unicode strings] We're off

16 Mar 2005

Just wanted to add some things I forgot in the other mail.

I'd like to stress that none of the code you see in the current 
implementation should be considered production quality. There are a lot 
of things that are less than optimal, and a lot of things that are just 
plain wrong and might very well blow up. :) Much of it is simply thrown 
together to test different ideas.

One of the things I'm really not sure about is the character_set_traits 
concept that is in there now. The basic idea was to allow the library to 
be used with character sets that are not code point compatible with 
Unicode, by abstracting the character set into another traits concept 
and having the string class use that for its external interface. This 
seemed like a good idea at the time, but I'm getting more and more 
unsure about its usefulness. My biggest reservation is that it basically 
makes it impossible to incorporate Unicode-specific functionality in the 
string class's interface. (Functions for normalization and collation 
come to mind.)
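
To make the trade-off concrete, here is a minimal sketch of what a traits 
concept along these lines could look like. Every name in it (unicode_traits, 
single_byte_traits, basic_text, is_valid) is made up for illustration and is 
not taken from the current implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical traits describing Unicode: 32-bit code points.
struct unicode_traits {
    typedef std::uint32_t code_point_type;
    static bool is_valid(code_point_type cp) {
        // Code points run from 0 to 0x10FFFF; surrogates are excluded here.
        return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
    }
};

// Hypothetical traits describing some single-byte character set
// whose code points do not line up with Unicode.
struct single_byte_traits {
    typedef unsigned char code_point_type;
    static bool is_valid(code_point_type) { return true; }
};

// The string class only talks to the character set through the traits.
template <typename CharSetTraits>
class basic_text {
public:
    typedef typename CharSetTraits::code_point_type code_point_type;

    void push_back(code_point_type cp) {
        assert(CharSetTraits::is_valid(cp));
        data_.push_back(cp);
    }
    std::size_t size() const { return data_.size(); }
    code_point_type operator[](std::size_t i) const { return data_[i]; }

    // A normalize() or collate() member has no natural place here:
    // it would only be meaningful when CharSetTraits describes Unicode.

private:
    std::vector<code_point_type> data_;
};
```

With this shape, basic_text<unicode_traits> and basic_text<single_byte_traits> 
share one interface, which is exactly why Unicode-only operations are hard to 
fit into it.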

Another thing is the way the Unicode Character Database is implemented. 
As of now, we simply generate one massive 2MB source file with the 
database as one gigantic array inside it. This of course leads to 
equally gigantic executables, which may or may not be desirable.
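
For illustration only, a flat table of this kind could look roughly like the 
sketch below. The record layout, the names (char_properties, excerpt, lookup) 
and the tiny four-entry excerpt are assumptions; the actual generated file may 
store different fields in a different form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical category codes; a real table would cover all categories.
const std::uint8_t cat_Lu = 0;  // uppercase letter
const std::uint8_t cat_Po = 1;  // other punctuation
const std::uint8_t cat_Cn = 2;  // unassigned, used here as a fallback

// Hypothetical per-code-point record.
struct char_properties {
    std::uint8_t category;
    std::uint8_t combining_class;
};

// Stand-in for the generated array: only U+0040..U+0043 are listed.
// A generated table would carry one record per code point.
const std::uint32_t excerpt_first = 0x40;
const char_properties excerpt[] = {
    { cat_Po, 0 },  // U+0040 '@'
    { cat_Lu, 0 },  // U+0041 'A'
    { cat_Lu, 0 },  // U+0042 'B'
    { cat_Lu, 0 },  // U+0043 'C'
};
const std::size_t excerpt_size = sizeof(excerpt) / sizeof(excerpt[0]);

// Direct indexing by code point; anything outside the excerpt
// falls back to an "unassigned" record.
inline char_properties lookup(std::uint32_t cp) {
    if (cp >= excerpt_first && cp - excerpt_first < excerpt_size)
        return excerpt[cp - excerpt_first];
    const char_properties unknown = { cat_Cn, 0 };
    return unknown;
}

// Size of a flat table with one record for each of the 0x110000 code points.
const std::size_t flat_table_bytes = 0x110000 * sizeof(char_properties);
```

Lookup is a single array index, which is the appeal of the approach; the cost 
is that even a small record multiplied by 0x110000 code points lands in the 
megabyte range, all of which ends up in the executable.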

Anyway.. Just wanted to cover my behind before you start complaining. ;)

- Erik

Erik Wien