Re: [boost] [string] proposal

27 Jan 2011

      On Thu, 27 Jan 2011 12:51, Nevin Liber <nevin@eviloverlord.com> wrote:
...
I'd like to see this broken up into three discussions:
1.  Immutable strings.
Immutable or not, I don't see a direct use for modifying individual 
code-units (e.g. char, wchar_t) in a string. Too many things can go 
wrong. Some kind of manipulation of code-points, yes, but not 
code-units.

Anyway, code-points are not the whole story either. Multiple 
code-points may be needed to represent a single grapheme, using 
combining characters. And sometimes a single code-point can represent 
several graphemes, as with ligatures.
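
To make the three levels concrete, here is a minimal sketch that assumes 
well-formed UTF-8 input and does no validation. The function name 
count_code_points and the two sample strings are illustrative only, not 
part of any proposed interface. Counting graphemes is deliberately left 
out, since it needs the Unicode segmentation rules rather than a byte 
scan.

```cpp
#include <cstddef>
#include <string>

// Count UTF-8 code points by skipping continuation bytes (10xxxxxx).
// Assumption: the input is well-formed UTF-8; nothing is validated here.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++n;  // lead byte or ASCII starts a code point
    }
    return n;
}

// "e" followed by U+0301 COMBINING ACUTE ACCENT:
// 3 code units, 2 code points, 1 grapheme.
const std::string e_acute_decomposed = "e\xCC\x81";

// U+FB01 LATIN SMALL LIGATURE FI:
// 3 code units, 1 code point, read as the two letters "f" and "i".
const std::string fi_ligature = "\xEF\xAC\x81";
```

Overwriting any single byte of either string in place would leave 
invalid UTF-8 behind, which is the kind of thing that can go wrong with 
code-unit access.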
...
2.  utf8 strings.
Although I personally prefer UTF-8 encoded strings, the internal 
encoding is more or less irrelevant for an implementation based on a 
rope or a similar non-contiguous data structure. I believe this is what 
Dean Michael Berris is suggesting. I think this is especially true if 
direct access to individual code-units is prevented.

For an implementation using a contiguous data structure and providing a 
constant-time c_str member function, I'd really want to see some option 
to set the internal encoding of strings. Performance-wise, it may be 
preferable to use UTF-16 internally when calling, for example, the Win32 
API, if an extra copy can be avoided.
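
One way such an option could look is sketched below, with the internal 
encoding selected by the code-unit type. Everything here is hypothetical: 
encoded_text, utf8_text, utf16_text and wide_api_length are made-up names 
for illustration, and wide_api_length is only a stand-in for an API that 
expects null-terminated UTF-16, not a real Win32 function. The sketch 
also assumes the caller supplies data already in the chosen encoding.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: the internal encoding follows the code-unit type
// (char for UTF-8, char16_t for UTF-16). Storage is contiguous.
template <typename CodeUnit>
class encoded_text {
public:
    // Assumption: s is null-terminated and already in the right encoding.
    explicit encoded_text(const CodeUnit* s) : storage_(s) {}

    // Constant time: hands out the stored buffer, no transcoding, no copy.
    const CodeUnit* c_str() const { return storage_.c_str(); }

    std::size_t code_unit_count() const { return storage_.size(); }

private:
    std::basic_string<CodeUnit> storage_;
};

typedef encoded_text<char>     utf8_text;
typedef encoded_text<char16_t> utf16_text;

// Stand-in for an API taking null-terminated UTF-16 (not a real Win32 call).
std::size_t wide_api_length(const char16_t* s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}
```

With utf16_text, the result of c_str() can be passed straight to the 
UTF-16 API. A utf8_text would first have to be transcoded into a 
temporary buffer, which is the extra copy in question.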
...
3.  Unrealistic pipe dream about replacing std::string.
Replacing std::string will never happen. Deprecating std::string in 
favor of std::text/std::unicode/std::xstring may happen in the long run.

Regards,
Anders Dalvander

-- 
WWFSMD?