
Rogier van Dalen wrote:
Non-checking iterator adaptors can be faster. That would be useful when you know that a string is safe, for example, in a UTF string type that has a validity invariant.
I suppose that type of string should probably use optimized iterators that exploit the fact that it is stored in contiguous, properly aligned memory anyway, so it will need special code.
I think this means that all iterator adaptors can be constructed from 3 iterators (begin, position, end), and the ones that don't check the input can also be constructed from 1 iterator. For a checking forward iterator, only two iterators are necessary (position, end). That is how I implemented it, at any rate.
Indeed, that makes 3 cases per encoding and I'm only handling the broadest case for now.
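To make the construction cases concrete, here is a rough sketch of two of them: a checking forward decoder built from (position, end) and a non-checking decoder built from a single iterator. All names (utf8_length, checked_forward_decoder, unchecked_decoder) are made up for illustration and are not the library's interface, and the validation is deliberately simplified. A checking bidirectional decoder would additionally need begin so that stepping backwards can be bounded.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: length of a UTF-8 sequence given its lead byte,
// or 0 if the byte cannot start a sequence.
inline int utf8_length(unsigned char lead) {
    if (lead < 0x80) return 1;
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 0;
}

inline unsigned char lead_mask(int n) {
    static const unsigned char mask[] = {0, 0x7F, 0x1F, 0x0F, 0x07};
    return mask[n];
}

// Checking forward decoder: needs (position, end) so it never reads past the input.
// Simplified: it does not reject overlong 3/4-byte forms or surrogates.
template <class It>
struct checked_forward_decoder {
    It pos, end;
    checked_forward_decoder(It pos_, It end_) : pos(pos_), end(end_) {}

    // Decodes one code point and advances; throws on malformed or truncated input.
    char32_t next() {
        if (pos == end) throw std::out_of_range("no input left");
        unsigned char lead = static_cast<unsigned char>(*pos);
        int n = utf8_length(lead);
        if (n == 0) throw std::invalid_argument("bad lead byte");
        char32_t cp = lead & lead_mask(n);
        ++pos;
        for (int i = 1; i < n; ++i, ++pos) {
            if (pos == end) throw std::invalid_argument("truncated sequence");
            unsigned char c = static_cast<unsigned char>(*pos);
            if ((c & 0xC0) != 0x80) throw std::invalid_argument("bad continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        return cp;
    }
};

// Non-checking decoder: the caller guarantees well-formed input,
// so a single iterator is enough in both directions.
template <class It>
struct unchecked_decoder {
    It pos;
    explicit unchecked_decoder(It pos_) : pos(pos_) {}

    char32_t next() {
        unsigned char lead = static_cast<unsigned char>(*pos);
        int n = utf8_length(lead);
        assert(n != 0);  // documents the precondition; compiled out with NDEBUG
        char32_t cp = lead & lead_mask(n);
        ++pos;
        for (int i = 1; i < n; ++i, ++pos)
            cp = (cp << 6) | (static_cast<unsigned char>(*pos) & 0x3F);
        return cp;
    }

    // Step back to the previous lead byte by skipping continuation bytes.
    void prev() {
        do { --pos; } while ((static_cast<unsigned char>(*pos) & 0xC0) == 0x80);
    }
};
```

The unchecked version drops every bounds and well-formedness test from the loop, which is where the speed-up for a string type with a validity invariant would come from.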
It makes sense to design for correctness. It's probably worth keeping in mind, though, whether conceivable extensions and optimisations are possible in your design.
I suppose you could attach traits to select more efficient iteration methods.
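As a rough illustration of what such a trait could look like, the sketch below uses tag dispatch to choose between a checked and an unchecked strategy. Everything here (has_validity_invariant, validated_utf8_string, decode_strategy, strategy_for) is hypothetical and only meant to show the shape of the idea, not an existing interface.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical trait: does this range type guarantee well-formed UTF-8?
// Defaults to "no", so unknown types get the safe, checking path.
template <class Range>
struct has_validity_invariant : std::false_type {};

// Hypothetical string type that is assumed to validate its contents on construction
// (the validation itself is omitted from this sketch).
struct validated_utf8_string {
    std::string bytes;
};

template <>
struct has_validity_invariant<validated_utf8_string> : std::true_type {};

enum class decode_strategy { checked, unchecked };

// Tag dispatch on the trait: overload resolution picks the strategy at compile time.
inline decode_strategy pick(std::true_type)  { return decode_strategy::unchecked; }
inline decode_strategy pick(std::false_type) { return decode_strategy::checked; }

template <class Range>
decode_strategy strategy_for(const Range&) {
    return pick(typename has_validity_invariant<Range>::type());
}

// Usage: a plain std::string is decoded with checks, the validated type without.
inline void strategy_demo() {
    assert(strategy_for(std::string("abc")) == decode_strategy::checked);
    assert(strategy_for(validated_utf8_string{"abc"}) == decode_strategy::unchecked);
}
```

In a real design the two overloads of pick would return or construct different iterator types rather than an enum value.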
I like the idea of the Pipe and related concepts. I am wondering, however, whether the UTF-8 decoding iterator can be fast enough given the current specification. I think Pipe (or another concept) might have to support decoding of exactly one output element. Correct me if I'm wrong.
I don't really understand what you mean. Calling Pipe::ltr or Pipe::rtl decodes only one "element": for UTF-8 decoding, a multibyte sequence is read and a code point is written; for UTF-8 encoding, a code point is read and a multibyte sequence is written.
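To illustrate the "one element per call" behaviour on the encoding side, here is a minimal sketch of a Pipe-like UTF-8 encoder. The signature (input range plus output iterator, returning the advanced iterators as a pair) is an assumption made for the example, as is the name u8_encoder_sketch; input is assumed to be a valid code point.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a Pipe: each call converts exactly one element.
struct u8_encoder_sketch {
    // Reads one code point at `begin` and writes its UTF-8 bytes to `out`.
    template <class In, class Out>
    static std::pair<In, Out> ltr(In begin, In end, Out out) {
        assert(begin != end);  // precondition: at least one code point available
        char32_t cp = *begin++;
        if (cp < 0x80) {
            *out++ = static_cast<char>(cp);
        } else if (cp < 0x800) {
            *out++ = static_cast<char>(0xC0 | (cp >> 6));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            *out++ = static_cast<char>(0xE0 | (cp >> 12));
            *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            *out++ = static_cast<char>(0xF0 | (cp >> 18));
            *out++ = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            *out++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *out++ = static_cast<char>(0x80 | (cp & 0x3F));
        }
        return std::make_pair(begin, out);
    }

    // Same conversion, but consumes the code point just before `end`
    // and returns the input iterator moved one element to the left.
    template <class In, class Out>
    static std::pair<In, Out> rtl(In begin, In end, Out out) {
        assert(begin != end);
        In last = end;
        --last;
        out = ltr(last, end, out).second;
        return std::make_pair(last, out);
    }
};
```

An iterator adaptor built on top of this would call ltr on increment and rtl on decrement, buffering the few output elements each call produces.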
The actual implementation of extensions and optimisations can be delayed until the need appears. I'd be happy to contribute checking policies.
Unfortunately, the mechanism to do so has yet to be defined ;).