[boost] Re: Any interest in adding unicode support to boost?

22 Oct 2004

      "Erik Wien" <wien@start.no> wrote in message
news:cl1tqh$qp$1@sea.gmane.org...
...
Hi. I am in the process of planning a library for handling Unicode strings
in C++, and would like to probe the interest in the Boost community for
something like that. I read through the Unicode discussion that was up back
in April, and from what I could gather there was some amount of interest,
but no one felt comfortable taking on the task as of yet.
[snip]
...
I really feel the C++ language needs some form of standardized Unicode
support, and developing such a library within the Boost community would be
a very good way to ensure it fits everybody's needs in the best possible
way. If you have any, and I do mean ANY, thoughts on this, please do not
hesitate to reply to this mail and let me know. I'm looking forward to your
responses.
FWIW, here are my thoughts.

There is no equivalence between a standard string (std::string or
std::wstring) and a sequence of characters conforming to a particular
encoding (an encoded-string).

However, an encoded-string can (potentially) be converted to a string, but
not the other way round, because a std::string does not record which
encoding its contents use.

For an encoding scheme to work, the encoding must be specified, and it must
be available at run time. The best way to do this for various encodings is
to use packets, with headers providing information about the contents, e.g.
type of encoding, number of characters, checksum etc. These packets
(including sequences of packets) could themselves be held and manipulated in
std::strings, which could then be used to perform operations where the
encoding is not important. This should give the best combination of
performance, in both speed and size.
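
To make the packet idea concrete, here is a rough sketch of what I have in
mind. Everything in it is an assumption made up for illustration: the names
(Encoding, Packet, pack, unpack), the 10-byte header layout, and the trivial
additive checksum. It is not a proposed interface, just one way a header
carrying the encoding, character count and checksum could sit in front of
the raw bytes inside an ordinary std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical encoding tags; a real library would need a fuller registry.
enum class Encoding : std::uint8_t { Utf8 = 1, Utf16LE = 2, Latin1 = 3 };

struct Packet {
    Encoding encoding;
    std::uint32_t char_count;  // number of characters, not bytes
    std::string payload;       // raw encoded bytes
};

// Assumed header layout:
// [encoding:1][char_count:4][byte_count:4][checksum:1], big-endian integers.
const std::size_t kHeaderSize = 10;

// Deliberately simple checksum: sum of all payload bytes modulo 256.
std::uint8_t checksum(const std::string& bytes) {
    std::uint8_t sum = 0;
    for (unsigned char c : bytes) sum = static_cast<std::uint8_t>(sum + c);
    return sum;
}

void put_u32(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

std::uint32_t get_u32(const std::string& in, std::size_t pos) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[pos + i]);
    return v;
}

// Build a UTF-8 packet; characters are counted as non-continuation bytes,
// which is only correct if the input is valid UTF-8.
Packet make_utf8_packet(const std::string& utf8) {
    std::uint32_t count = 0;
    for (unsigned char c : utf8)
        if ((c & 0xC0) != 0x80) ++count;
    return Packet{Encoding::Utf8, count, utf8};
}

// Serialize header + payload into a plain std::string.
std::string pack(const Packet& p) {
    assert(p.payload.size() <= 0xFFFFFFFFu);  // byte count must fit the header
    std::string out;
    out.push_back(static_cast<char>(p.encoding));
    put_u32(out, p.char_count);
    put_u32(out, static_cast<std::uint32_t>(p.payload.size()));
    out.push_back(static_cast<char>(checksum(p.payload)));
    out += p.payload;
    return out;
}

// Read one packet starting at pos. On success, fills `out`, advances pos
// past the packet and returns true. Returns false (pos unchanged) if the
// data is truncated or the checksum does not match.
bool unpack(const std::string& in, std::size_t& pos, Packet& out) {
    if (pos > in.size() || in.size() - pos < kHeaderSize) return false;
    const Encoding enc =
        static_cast<Encoding>(static_cast<unsigned char>(in[pos]));
    const std::uint32_t chars = get_u32(in, pos + 1);
    const std::uint32_t bytes = get_u32(in, pos + 5);
    const std::uint8_t sum = static_cast<std::uint8_t>(in[pos + 9]);
    const std::size_t body = pos + kHeaderSize;
    if (in.size() - body < bytes) return false;
    const std::string payload = in.substr(body, bytes);
    if (checksum(payload) != sum) return false;
    out = Packet{enc, chars, payload};
    pos = body + bytes;
    return true;
}
```

Because pack() returns a std::string, packets in different encodings can be
concatenated, copied and stored without looking inside them, and unpack()
can then be called repeatedly with the same pos to walk the sequence. The
encoding only matters at the point where the characters themselves are
interpreted.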

regards
Andy Little