[boost] Re: Re: Any interest in adding unicode support to boost?

21 Oct 2004

      Erik Wien wrote:
...
...
It's good to have one string class for library interoperability reasons.
Otherwise library A would demand utf8_string, library B would demand
utf16_string, and library C would demand utf32_string. No matter which
one you choose, you'll pay a price. (This doesn't change even if you
spell utf8_string as string<utf8>.)
That is true. Though the strings of different encodings should be
assignable to each other, libraries taking references to encoded_strings
would need some conversion to be done.
We have a similar problem today with basic_string<char> and
basic_string<wchar_t>, and I think it could also be solved in a way that
is very similar to what is done in the <string> header.
Just to clarify: string and wstring in the standard have a huge problem:
you can't convert a string to a wstring directly, because there's just no
appropriate converting constructor.
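
To illustrate the missing converting constructor, here is a minimal sketch.
widen_ascii is a hypothetical helper, not a standard function, and it is
only correct for 7-bit ASCII input; anything beyond that would need a
locale-aware or encoding-aware conversion.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the standard library): widens each char
// individually. Only correct when the input is plain 7-bit ASCII.
std::wstring widen_ascii(const std::string& narrow)
{
    // std::wstring wide = narrow;   // does not compile: no converting constructor

    std::wstring wide;
    wide.reserve(narrow.size());
    for (unsigned char c : narrow) {
        assert(c < 0x80 && "widen_ascii handles 7-bit ASCII only");
        wide.push_back(static_cast<wchar_t>(c));
    }
    return wide;
}
```

Every caller has to spell out a conversion like this by hand, which is the
interoperability cost being discussed.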
...
If we typedef a unicode_string or something as encoded_string<utf16>, and
promote that as
THE string class, most users would use that as their primary string
representation, and simply be oblivious to the underlying encoding. (A
good thing.)
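
A rough sketch of what that typedef could look like. The thread only names
encoded_string, utf16 and unicode_string; the tag layout, the members and
the use of char16_t below are assumptions made for illustration, not a
proposed interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical encoding tags: each one only names its code unit type.
struct utf8  { typedef char     code_unit; };
struct utf16 { typedef char16_t code_unit; };
struct utf32 { typedef char32_t code_unit; };

// Hypothetical string class parameterized on an encoding tag.
template <typename Encoding>
class encoded_string {
public:
    typedef typename Encoding::code_unit code_unit;

    encoded_string() {}
    explicit encoded_string(const std::basic_string<code_unit>& units)
        : units_(units) {}

    // Number of code units stored, not the number of characters.
    std::size_t code_unit_count() const { return units_.size(); }

    code_unit at(std::size_t i) const
    {
        assert(i < units_.size());
        return units_[i];
    }

private:
    std::basic_string<code_unit> units_;
};

// The one name most users would see; the encoding stays an implementation detail.
typedef encoded_string<utf16> unicode_string;
```

Code written against unicode_string never mentions utf16, but
encoded_string<utf8> remains available to anyone who asks for it.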
That would still make it easy for a user to use some different encoding
without good reason.
...
Advanced users could (just like we do today with basic_string) choose to
support multiple encodings by templating their own functions on encoding
as well.
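
For comparison, this is how the same thing is done today with basic_string:
a function templated on the character type, so it accepts both string and
wstring. starts_with is a hypothetical example function; a version templated
on encoding would presumably follow the same pattern.

```cpp
#include <cassert>
#include <string>

// Hypothetical example: one definition serves std::string, std::wstring and
// any other basic_string specialization with default traits and allocator.
template <typename CharT>
bool starts_with(const std::basic_string<CharT>& text,
                 const std::basic_string<CharT>& prefix)
{
    return text.size() >= prefix.size()
        && text.compare(0, prefix.size(), prefix) == 0;
}
```

The definition has to live in a header and is instantiated separately for
each character type used, so the compile-time cost grows with the amount of
code written this way.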
Oh well. I just hope nobody will ever make an implementation of 

   XML parser + XML Schema + XPath + XQuery + SOAP + HTML renderer

which is fully templated on string type, unless the same person first
speeds up gcc by a factor of 10.

- Volodya

Vladimir Prus