Re: [boost] Boost.Unicode (was Re: Boost.Locale)

15 Dec 2010

      ----- Original Message ----
...
From: Mathias Gaunard <mathias.gaunard@ens-lyon.org>
- I need to finish support for word, sentence and line boundaries
- The ABI needs to be more clearly defined to guarantee backward and
upward compatibility
- The convert and segment subsystem must be clearly separated into its
own library and namespace
- The system must be made SIMD-ready
- Simple case conversion should be added
- General case folding (and maybe collation) should be added
Nothing among these is particularly difficult.

A few notes and questions: you say that your library is locale-agnostic,
but I see a contradiction between that claim and what you need to implement:

1. AFAIK boundary analysis is locale-dependent.
2. Case conversion is locale-dependent. For example, if the locale is Turkish
   then upper("i")=="İ", while upper("i")=="I" for other languages.
3. Collation **is** locale-dependent, as text sorting differs greatly between
   languages, even when they use the same script (Latin, for example).
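
To make point 2 concrete, here is a minimal sketch of why an upper() function
needs to know the language. The names Lang, upper_cp and upper are hypothetical
and exist only for this illustration; they are not the API of Boost.Unicode or
Boost.Locale. The sketch handles ASCII letters plus the single Turkish rule
for "i", and passes everything else through unchanged.

```cpp
#include <cassert>
#include <string>

// Hypothetical language tag for this sketch (not a Boost type).
enum class Lang { Default, Turkish };

// Uppercase one code point. Assumption: only ASCII letters and the
// Turkish dotted "i" are handled; all other code points pass through.
char32_t upper_cp(char32_t c, Lang lang) {
    if (lang == Lang::Turkish && c == U'i') {
        return U'\u0130';  // LATIN CAPITAL LETTER I WITH DOT ABOVE
    }
    if (c >= U'a' && c <= U'z') {
        return static_cast<char32_t>(c - (U'a' - U'A'));
    }
    return c;
}

// Uppercase a whole UTF-32 string, code point by code point.
std::u32string upper(std::u32string s, Lang lang) {
    for (char32_t& c : s) {
        c = upper_cp(c, lang);
    }
    return s;
}

// Small self-check showing the locale-dependent result.
void check_examples() {
    assert(upper(U"i", Lang::Default) == U"I");
    assert(upper(U"i", Lang::Turkish) == U"\u0130");
}
```

The same input yields two different results depending on the language
argument, so a fully locale-agnostic interface cannot produce the Turkish
result without some tailoring hook of this kind.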

Artyom