
----- Original Message ----- From: "Edward Diener" <eddielee@tropicsoft.com>
A few points you probably already know:
1) Wide characters and Unicode characters are not necessarily the same thing for any given implementation.
2) There are quite a few Unicode encodings.
Yes I know. Thanks for the heads up though! ;)
3) The idea is to be able to plug a Unicode encoding into the same standard library templates and Boost templates which now support 'char' and 'wchar_t'. In other words, ideally you want to treat your Unicode encoding as just another character type, with extra smarts depending on the encoding. The extra smarts would be used in specializations.
Agreed. That is one of the main design goals for a potential library, in my opinion. I have recently created a little test library for simple Unicode strings that provides iterators that can be used with the different algorithms in boost and std. I would probably base some parts of a new library on that implementation. I will post a new message with more information about this later.
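To make the iterator idea concrete, here is a minimal sketch of a code point iterator over UTF-8 that works with the std algorithms. It is not the test library mentioned above. The names utf8_iterator, utf8_begin, utf8_end, count_code_points and contains_code_point are made up for illustration, and the sketch assumes C++11 (for char32_t) and well-formed UTF-8 input with no validation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical name: a read-only iterator that walks a UTF-8 byte sequence
// one code point at a time. Assumption: the input is well-formed UTF-8;
// malformed input is not detected and may read past the end.
class utf8_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type        = char32_t;
    using difference_type   = std::ptrdiff_t;
    using pointer           = const char32_t*;
    using reference         = char32_t;

    utf8_iterator() = default;
    explicit utf8_iterator(const char* p) : p_(p) {}

    // Decode the code point that starts at the current byte.
    char32_t operator*() const {
        assert(p_ != nullptr);
        const unsigned char lead = static_cast<unsigned char>(*p_);
        const int len = length(lead);
        // Strip the length-marker bits from the lead byte.
        char32_t cp = (len == 1) ? lead : (lead & (0x7Fu >> len));
        // Each continuation byte contributes its low six bits.
        for (int i = 1; i < len; ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(p_[i]) & 0x3Fu);
        return cp;
    }

    // Advance by one whole code point, not by one byte.
    utf8_iterator& operator++() {
        p_ += length(static_cast<unsigned char>(*p_));
        return *this;
    }
    utf8_iterator operator++(int) {
        utf8_iterator old = *this;
        ++*this;
        return old;
    }

    friend bool operator==(utf8_iterator a, utf8_iterator b) { return a.p_ == b.p_; }
    friend bool operator!=(utf8_iterator a, utf8_iterator b) { return a.p_ != b.p_; }

private:
    // Number of bytes in the sequence introduced by this lead byte.
    static int length(unsigned char lead) {
        if (lead < 0x80) return 1;
        if ((lead & 0xE0) == 0xC0) return 2;
        if ((lead & 0xF0) == 0xE0) return 3;
        return 4;  // assumed to be a four-byte lead
    }

    const char* p_ = nullptr;
};

inline utf8_iterator utf8_begin(const std::string& s) {
    return utf8_iterator(s.data());
}
inline utf8_iterator utf8_end(const std::string& s) {
    return utf8_iterator(s.data() + s.size());
}

// Standard algorithms then operate on code points rather than bytes.
inline std::ptrdiff_t count_code_points(const std::string& s) {
    return std::distance(utf8_begin(s), utf8_end(s));
}
inline bool contains_code_point(const std::string& s, char32_t cp) {
    return std::find(utf8_begin(s), utf8_end(s), cp) != utf8_end(s);
}
```

The point of the design is that the encoding-specific smarts live entirely in the iterator, so std::find, std::count and std::distance need no changes. A real library would also have to decide how to report malformed sequences.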
In the past in comp.std.c++ I attempted to promote the idea that all standard library functionality which dealt generally in characters and strings should be parameterized on the character type for the sake of orthogonality and the future. While most is, there is still some functionality which is not, i.e. exceptions, file names and locale message files, and which assumes that only narrow characters exist in its usage. I am still amazed that programmers from countries which would normally use wide characters as Unicode encodings, such as the Japanese, have not made more of an issue of this, but perhaps they are so used to their far more difficult DBCS roots that pursuing wide characters everywhere, much less a real Unicode encoding, is a minor issue for them.
I completely agree. There are a few areas of the standard that make a lot of assumptions about how characters and strings are represented, and many of these assumptions are not necessarily true when it comes to Unicode. How to match a potential library with the standard is therefore an important issue in the development, and one I hope to devote some time to resolving (or at least knowingly ignoring! ;) ) if I move forward with the project.
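As one small illustration of the narrow-character assumption in exceptions: std::exception::what() returns a const char*, so a wide message has to travel next to it rather than through it. The sketch below shows one possible workaround. The class wide_error and its member wide_what are hypothetical names, not part of the standard or of any existing library.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical class: carries a wide message alongside the narrow one that
// std::exception::what() is limited to. Note that copying the std::wstring
// member may itself throw, which a production design would need to address.
class wide_error : public std::runtime_error {
public:
    wide_error(const std::string& narrow, const std::wstring& wide)
        : std::runtime_error(narrow), wide_(wide) {}

    // Wide counterpart of what(); code that only knows std::exception
    // still sees just the narrow text.
    const std::wstring& wide_what() const noexcept { return wide_; }

private:
    std::wstring wide_;
};
```

Code that catches std::exception only gets the narrow message. To reach the wide text, the caller has to know about wide_error and catch or dynamic_cast to it, which is exactly the kind of mismatch with the standard discussed above.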