
OvermindDL1 wrote:
On Sat, Feb 14, 2009 at 10:07 AM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
/* snip */ Yes, a Unicode character properties library is important to those who are writing text editors and similar applications. Perhaps Boost should have one. I have personally used the Unicode properties tables for doing "approximate matching" of e.g. accented characters with their base characters when searching. But I can do that equally well in UTF-8 as in UTF-32.
If you are all interested in other opinions, I would love for Boost to have a UTF-8 (and UTF-16/32) helper library.
There is a Google Summer of Code project for a Unicode library, which I'm working on. It handles Unicode text in any of the UTF-8, UTF-16 or UTF-32 encodings, bundles a small-ish Unicode character database, and supports grapheme boundaries, composition/decomposition and normalization, but not "approximate matching", collation or case folding (at least not for the time being).
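
As a rough illustration of how the "approximate matching" mentioned above could be layered on top of decomposition, here is a minimal sketch that works directly on UTF-8 bytes. It is not part of the library described here, and `strip_combining_marks` and `approx_equal` are hypothetical names. It assumes the input is valid UTF-8 that has already been canonically decomposed (NFD), and it only handles the Combining Diacritical Marks block (U+0300..U+036F), so a real implementation would need the character database to cover the other combining characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: drops every code point in U+0300..U+036F from a
// UTF-8 string. Assumes valid, already-decomposed (NFD) input.
// In UTF-8 that block is encoded as:
//   U+0300..U+033F -> 0xCC 0x80..0xBF
//   U+0340..U+036F -> 0xCD 0x80..0xAF
std::string strip_combining_marks(const std::string& utf8)
{
    std::string out;
    out.reserve(utf8.size());
    for (std::size_t i = 0; i < utf8.size(); ) {
        const unsigned char b0 = static_cast<unsigned char>(utf8[i]);
        if (i + 1 < utf8.size()) {
            const unsigned char b1 = static_cast<unsigned char>(utf8[i + 1]);
            const bool is_mark =
                (b0 == 0xCC && b1 >= 0x80 && b1 <= 0xBF) ||
                (b0 == 0xCD && b1 >= 0x80 && b1 <= 0xAF);
            if (is_mark) {
                i += 2;  // skip the two-byte combining mark
                continue;
            }
        }
        out += utf8[i];  // copy every other byte unchanged
        ++i;
    }
    return out;
}

// Hypothetical helper: two decomposed strings match "approximately"
// if they are identical once the combining marks are removed.
bool approx_equal(const std::string& a, const std::string& b)
{
    return strip_combining_marks(a) == strip_combining_marks(b);
}
```

Because 0xCC and 0xCD can only appear as lead bytes in valid UTF-8, the scan never needs to decode to UTF-32 first. A precomposed character such as U+00E9 is not touched by this sketch, which is why the decomposition step has to come first.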