Re: [boost] [rfc] I/O Library Design

19 Jun 2007


Mathias Gaunard <mathias.gaunard@etu.u-bordeaux1.fr> writes:

> Jeremy Maitin-Shepard wrote:
>> It occurs to me that perhaps it is not unreasonable after all to
>> restrict the library to supporting Unicode encodings for in-memory
>> character representation.

> I personally believe Unicode (not only the character set, but also its
> collations and algorithms) is the only viable way to represent
> characters, and thus should be what strings work with. (Get out, evil
> locales and other stuff!)
> Of course, various encodings can still be used for serialization.

I agree that I personally would always want to use a Unicode encoding
for handling text in my software.  The question, though, is whether the
new I/O library should actually force users to use a Unicode encoding
for internal text representation.  Even if other internal encodings are
supported, Boost might still only provide actual text formatting
facilities and other high-level text facilities for all Unicode
encodings (UTF-8, UTF-16, and UTF-32) or even only a single Unicode
encoding.
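
As a rough illustration of that split (Unicode in memory, other encodings
only at the serialization boundary), here is a minimal C++11 sketch.  The
names decode_latin1 and encode_utf8 are hypothetical and are not part of
Boost, ICU, or any interface proposed in this thread; ISO-8859-1 is chosen
as the external encoding only because its mapping to Unicode is trivial.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not an existing or proposed Boost interface):
// decode ISO-8859-1 bytes into UTF-32. Each Latin-1 byte value equals
// the Unicode code point it represents, so the mapping is one-to-one.
std::u32string decode_latin1(const std::string& bytes) {
    std::u32string text;
    text.reserve(bytes.size());
    for (unsigned char b : bytes) {
        text.push_back(static_cast<char32_t>(b));
    }
    return text;
}

// Hypothetical helper: encode UTF-32 text as UTF-8 for output.
// Throws std::invalid_argument on surrogates or values above U+10FFFF.
std::string encode_utf8(const std::u32string& text) {
    std::string bytes;
    // Append the low 8 bits of v as one byte of output.
    auto put = [&bytes](char32_t v) {
        bytes.push_back(static_cast<char>(static_cast<unsigned char>(v)));
    };
    for (char32_t cp : text) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw std::invalid_argument("not a Unicode scalar value");
        }
        if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
            put(cp);
        } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
            put(0xC0 | (cp >> 6));
            put(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            put(0xE0 | (cp >> 12));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            put(0xF0 | (cp >> 18));
            put(0x80 | ((cp >> 12) & 0x3F));
            put(0x80 | ((cp >> 6) & 0x3F));
            put(0x80 | (cp & 0x3F));
        }
    }
    return bytes;
}
```

With these, encode_utf8(decode_latin1(input)) transcodes Latin-1 input to
UTF-8 output, and everything in between only ever sees UTF-32.  Whether the
library should force that in-memory choice is exactly the open question.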

> Unfortunately, C++ is quite far from having good Unicode tools (not that
> other programming languages are really better -- Unicode is simply quite
> complicated, because human languages just are)

> ICU has most of the stuff, but not with the right interfaces.

A better I/O system might provide a very solid base on top of which
proper higher level text facilities can be provided, quite possibly by
incorporating pieces of ICU.

-- 
Jeremy Maitin-Shepard