
Dale Peakall wrote:
So there will be variants accepting char and wchar_t data types, and all possible Unicode problems will be addressed by char_traits and locale.
I have to agree. Programs should work internally in terms of fixed-width character sets. When string data needs to be imported or exported, locales should be used to perform the transformation.
I would make program_options support an imbue() function that allows a locale to be specified (otherwise use the default locale) and template any functions that need to process strings on the character type.
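To make the proposal concrete, here is a minimal sketch of what an imbue()-style interface templated on the character type might look like. The class name `basic_option_reader` and the member `import_argument` are hypothetical and are not part of program_options. The sketch uses `std::ctype<charT>::widen`, which only maps one narrow character to one internal character; a multibyte encoding such as Shift-JIS would need a `std::codecvt` facet instead.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Hypothetical sketch only: this is not the program_options interface.
template <class charT>
class basic_option_reader {
public:
    // Set the locale used to import narrow input and return the previous
    // one, mirroring the behaviour of std::ios_base::imbue.
    std::locale imbue(const std::locale& loc) {
        std::locale old = loc_;
        loc_ = loc;
        return old;
    }

    // Convert one narrow command-line argument to the internal character
    // type, using the ctype facet of the imbued locale.
    std::basic_string<charT> import_argument(const char* arg) const {
        assert(arg != nullptr);
        const std::string narrow(arg);
        std::basic_string<charT> result(narrow.size(), charT());
        if (!narrow.empty()) {
            const std::ctype<charT>& facet =
                std::use_facet<std::ctype<charT> >(loc_);
            facet.widen(narrow.data(), narrow.data() + narrow.size(),
                        &result[0]);
        }
        return result;
    }

private:
    std::locale loc_;  // default-constructed: a copy of the global locale
};
```

A caller would create, say, a `basic_option_reader<wchar_t>`, optionally call `imbue()` with the locale describing the command line, and then pass each element of argv to `import_argument`.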
Why bother? As I've indicated, internal processing will work just fine with UTF-8, and no interface of the library will expose a UTF-8 encoded string. imbue() is a good thing for specifying what encoding the char** input has, but it's orthogonal to the rest of the library -- it's used only at the interface boundary.
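The idea of keeping UTF-8 purely as an internal detail can be sketched roughly as follows. Everything here is an assumption made for illustration: `option_value`, `append_utf8` and `decode_utf8` are hypothetical names, and `char32_t` (C++11) stands in for a wide character type that holds one Unicode code point per element. The decoder trusts its input because it only ever sees bytes produced by the encoder, and neither function rejects surrogate code points.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append the UTF-8 encoding of one Unicode code point to `out`.
void append_utf8(std::string& out, char32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Decode UTF-8 produced by append_utf8 (assumed to be well-formed).
std::u32string decode_utf8(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        // Number of continuation bytes implied by the lead byte.
        const std::size_t extra =
            lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
        char32_t cp = extra == 0 ? lead : (lead & (0x3F >> extra));
        for (std::size_t k = 1; k <= extra; ++k) {
            assert(i + k < in.size());
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        }
        out += cp;
        i += extra + 1;
    }
    return out;
}

// Hypothetical value holder: UTF-8 is stored privately and never exposed.
class option_value {
public:
    explicit option_value(const std::u32string& text) {
        for (char32_t cp : text) append_utf8(utf8_, cp);
    }
    // The public interface hands back code points, not UTF-8 bytes.
    std::u32string text() const { return decode_utf8(utf8_); }
    std::size_t stored_bytes() const { return utf8_.size(); }

private:
    std::string utf8_;
};
```

With this shape, the caller passes in and gets back wide strings, so the choice of internal encoding could change without touching any user code.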
This provides much more flexibility than just supporting UTF-8. UTF-8 is a really impractical encoding for almost any locale where the majority of text is not ASCII-like, and the user may well prefer to encode text in Shift-JIS or other encodings.
Again, in my case the user does not use UTF-8 strings, so why would he care how the strings are encoded internally? - Volodya