
Having run into string-related problems myself in my iochain project, I'm thinking about character encodings again. (I don't like the term character set. It ought to mean something different from its common usage. UTF-8 and UTF-16 aren't character sets.)

Phil Endecott wrote:
For a UTF-8 string, my proposal offered
a mutable random-access byte iterator
What is the use case for this?
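To make the concern behind the question concrete, here is a minimal sketch. It is an illustration only: it uses plain std::string rather than the proposed string class, the helper names is_structurally_valid_utf8 and overwrite_byte are made up for this example, and the validity check is deliberately simplified (it does not reject overlong forms or surrogates). It shows that a single write through mutable byte access can leave a UTF-8 string ill-formed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: checks only the structure of UTF-8, i.e. that every
// lead byte is followed by the right number of continuation bytes.
// Simplified on purpose: overlong forms and surrogates are not rejected.
bool is_structurally_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        if (c < 0x80)                len = 1;  // ASCII
        else if ((c & 0xE0) == 0xC0) len = 2;  // 110xxxxx
        else if ((c & 0xF0) == 0xE0) len = 3;  // 1110xxxx
        else if ((c & 0xF8) == 0xF0) len = 4;  // 11110xxx
        else return false;                     // stray continuation or invalid lead byte
        if (i + len > s.size()) return false;  // sequence truncated
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not 10xxxxxx
        }
        i += len;
    }
    return true;
}

// Hypothetical helper: overwrites one byte, as a mutable random-access
// byte iterator would allow, and returns the modified copy.
std::string overwrite_byte(std::string s, std::size_t pos, char b) {
    s[pos] = b;
    return s;
}
```

For example, with the string "na\xC3\xAF" "ve" (the word "naïve", where the "ï" occupies two bytes), overwriting byte 0 with 'N' keeps the string well-formed, but overwriting byte 2 with 'i' leaves the continuation byte 0xAF orphaned, and the check fails.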
Concerning mutable vs. immutable strings: which is best in any particular case clearly depends on the size of the string, the operation being performed, and whether it has a variable-length encoding. The programmer should be allowed to choose which to use. (An interesting case is where the size or character set changes at run-time, and a run-time choice of algorithm is appropriate.)
Why on earth would you change the character set of a string at runtime?

Robert O'Callahan (roc of Mozilla) recently blogged a bit about strings. Mozilla is a project with a lot of experience with strings, so I give his opinion quite some weight.
http://weblogs.mozillazine.org/roc/archives/2008/01/string_theory.html

Sebastian Redl