
Sebastian Redl wrote:
> Phil Endecott wrote:
>> For a UTF-8 string, my proposal offered
>> a mutable random-access byte iterator
> What is the use case for this?
It's for when you want to treat the data as a sequence of bytes. For example, another thread at the moment is discussing base64 encoding. The input to a base64 encoder could be a byte stream iterator. There are also cases where you can exploit knowledge about the encoding to use a byte iterator in place of a character iterator. Specifically, in UTF-8 all bytes after the first of a multi-byte character are >=128. So in a parser, I might want to skip forward to the next '"', or '<' or whatever; since those are both <128, I can do this significantly more efficiently using the byte iterator.
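A minimal sketch of that kind of skip, assuming the byte iterator behaves like an ordinary iterator over char. The name skip_to_ascii is hypothetical and not part of the proposal; it only illustrates that a plain byte-wise search for an ASCII delimiter cannot land inside a multi-byte character.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper, not part of any proposed interface.
// Returns an iterator to the first byte equal to `delim` in [first, last),
// or `last` if there is none.
//
// This is safe on valid UTF-8 only when `delim` is ASCII (< 0x80): every
// byte of a multi-byte sequence is >= 0x80, so a byte that compares equal
// to `delim` is always a complete one-byte character.
template <typename ByteIter>
ByteIter skip_to_ascii(ByteIter first, ByteIter last, char delim)
{
    // Precondition: the delimiter must be a single-byte (ASCII) character.
    assert(static_cast<unsigned char>(delim) < 0x80);

    // No decoding is needed; each byte is compared directly.
    return std::find(first, last, delim);
}
```

For instance, in the byte sequence "h\xC3\xA9llo \xE2\x82\xAC <tag>" the search for '<' stops at byte offset 11 without decoding either of the two multi-byte characters before it.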
>> Concerning mutable vs. immutable strings: which is best in any particular case clearly depends on the size of the string, the operation being performed, and whether it has a variable-length encoding. The programmer should be allowed to choose which to use. (An interesting case is where the size or character set changes at run-time, and a run-time choice of algorithm is appropriate.)
> Why on earth would you change the character set of a string at runtime?
I should have written "where the size or character set _varies_ at run-time".

Phil.