Re: [boost] [general] What will string handling in C++ look like in the future [was Always treat ... ]

20 Jan 2011

      ...
...
OK, if the long term plan is:
1) design and implement boost::string using UTF-8, doing all the things
   like code-point iteration, character iteration, convenience stuff like
   starts-with, ends-with, replace, trim, etc., with as much backward
   compatibility with std::string as possible without hindering progress
2) try really hard to push it to the standard
then I'm on board with that.

Some of those could be problematic (I've run across references implying
that 0x20 isn't the universal word-separation character, so trim would
at least need some extra parameters), but for the most part, I'd agree
with it.
Trimming is also locale dependent.
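
To illustrate the "extra parameters" point, here is a minimal sketch of a
trim that takes the separator set from the caller instead of hard-coding
0x20. The name trim_utf8 and its signature are hypothetical and not part
of any proposed boost::string interface; the sketch assumes valid UTF-8
input and that each separator is the UTF-8 encoding of a single code point.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (illustration only): strips leading and trailing
// separators from a UTF-8 string. Each entry of `separators` is assumed to
// be the UTF-8 encoding of exactly one code point, and `s` is assumed to be
// valid UTF-8, so a byte-wise prefix/suffix match is a whole code point match.
std::string trim_utf8(std::string s, const std::vector<std::string>& separators)
{
    bool changed = true;
    while (changed) {                       // repeat until no separator matches
        changed = false;
        for (const std::string& sep : separators) {
            if (sep.empty()) continue;      // ignore degenerate entries
            // Strip one occurrence from the front, if present.
            if (s.size() >= sep.size() && s.compare(0, sep.size(), sep) == 0) {
                s.erase(0, sep.size());
                changed = true;
            }
            // Strip one occurrence from the back, if present.
            if (s.size() >= sep.size() &&
                s.compare(s.size() - sep.size(), sep.size(), sep) == 0) {
                s.erase(s.size() - sep.size());
                changed = true;
            }
        }
    }
    return s;
}
```

A caller could pass, for example, " ", "\xC2\xA0" (U+00A0 NO-BREAK SPACE)
and "\xE3\x80\x80" (U+3000 IDEOGRAPHIC SPACE) as separators. Which code
points belong in that set is exactly the part the string cannot decide on
its own.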

Unicode defines three kinds of text segments: grapheme clusters, words
and sentences.

   http://www.unicode.org/reports/tr29/

Line break boundaries are defined as well:

   http://unicode.org/reports/tr14/

Most of them are also locale dependent, and for some languages they
require the use of dictionaries.

So unless you want to carry locale information in the string,
I don't think it is good to put these into the string itself.
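
As a rough sketch of the alternative, the locale can be passed explicitly
to a free function, so the string itself stays locale-agnostic. The name
trim_with_locale is hypothetical. It uses only std::isspace from <locale>,
which classifies single chars, so it illustrates the interface shape and
is not a Unicode-aware implementation.

```cpp
#include <locale>
#include <string>

// Hypothetical free function (illustration only): the caller supplies the
// locale, so the string does not need to carry one. Classification is done
// per char via std::isspace from <locale>, which is not sufficient for
// multi-byte UTF-8 whitespace.
std::string trim_with_locale(const std::string& s, const std::locale& loc)
{
    std::size_t first = 0;
    std::size_t last = s.size();
    // Skip leading chars that the given locale classifies as whitespace.
    while (first < last && std::isspace(s[first], loc)) ++first;
    // Skip trailing chars that the given locale classifies as whitespace.
    while (last > first && std::isspace(s[last - 1], loc)) --last;
    return s.substr(first, last - first);
}
```

Usage would look like trim_with_locale(text, std::locale::classic()), with
the locale chosen at the call site.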

Artyom