
On Sun, Aug 14, 2011, Phil Endecot wrote:
As a general point, I believe it's a bad idea to hide a surprise like O(N^2) instead of O(N) complexity in a "rare" case. Doing so means that users will implement something that seems to work, and then get bitten later when it doesn't work in the field. (For example, the first time that a customer in Japan tries to process a 1 MB file and it takes a million times longer than expected.)
It would be better to not provide the inefficient case at all. Compare with how std::list doesn't provide random access, even though it could do so in O(N). Looking at your character set iterator, it seems to me that you could have a forward-only iterator and a bidirectional iterator for UTF, but only the former for these other encodings. Not storing the begin iterator when only forward iteration is needed also saves space.
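The split suggested above could be sketched roughly as follows. This is only an illustration, not code from Boost.Ustr: `boundary_iterator`, `utf8_traits`, `toy_dbcs_traits`, `cheap_backward`, `can_decrement` and `count_chars` are all hypothetical names, the double-byte encoding is a made-up toy, and decoding to code points is left out so that only the stepping logic remains. It assumes well-formed input.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical traits: UTF-8 is self-synchronizing, so stepping back is cheap.
struct utf8_traits {
    static const bool cheap_backward = true;
    static const char* next(const char* p) {
        unsigned char b = static_cast<unsigned char>(*p);
        return p + (b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4);
    }
    static const char* prev(const char* p) {
        // Skip continuation bytes (10xxxxxx) until a lead byte is found.
        do { --p; } while ((static_cast<unsigned char>(*p) & 0xC0) == 0x80);
        return p;
    }
};

// Hypothetical toy encoding: a byte >= 0x80 starts a two-byte character whose
// trail byte may be any value, so there is no cheap way to step backward.
struct toy_dbcs_traits {
    static const bool cheap_backward = false;
    static const char* next(const char* p) {
        return p + (static_cast<unsigned char>(*p) >= 0x80 ? 2 : 1);
    }
    // Deliberately no prev().
};

// Steps over character boundaries; decoding is omitted in this sketch.
template <typename Traits>
class boundary_iterator {
public:
    typedef typename std::conditional<Traits::cheap_backward,
                                      std::bidirectional_iterator_tag,
                                      std::forward_iterator_tag>::type
        iterator_category;

    explicit boundary_iterator(const char* pos) : pos_(pos) {}
    const char* base() const { return pos_; }

    boundary_iterator& operator++() { pos_ = Traits::next(pos_); return *this; }

    // Takes part in overload resolution only when the traits allow it (SFINAE).
    template <typename T = Traits>
    typename std::enable_if<T::cheap_backward, boundary_iterator&>::type
    operator--() { pos_ = T::prev(pos_); return *this; }

    bool operator==(const boundary_iterator& o) const { return pos_ == o.pos_; }
    bool operator!=(const boundary_iterator& o) const { return pos_ != o.pos_; }

private:
    const char* pos_;
};

// Detects at compile time whether --it is a valid expression.
template <typename It, typename = void>
struct can_decrement : std::false_type {};
template <typename It>
struct can_decrement<It, decltype(void(--std::declval<It&>()))> : std::true_type {};

// Generic code that needs forward traversal only works with either traits type.
template <typename Traits>
std::size_t count_chars(const std::string& s) {
    boundary_iterator<Traits> it(s.data()), end(s.data() + s.size());
    std::size_t n = 0;
    for (; it != end; ++it) ++n;
    return n;
}

// Small usage example: step back over a two-byte UTF-8 character.
inline void demo() {
    std::string s("a\xC3\xA9");
    boundary_iterator<utf8_traits> it(s.data() + s.size());
    --it;
    assert(it.base() == s.data() + 1);
}
```

With this arrangement, writing `--it` on a `boundary_iterator<toy_dbcs_traits>` fails to compile instead of silently running an O(N) rescan, while forward-only code such as `count_chars` is unaffected. Generic code that wants to walk backward would have to check `iterator_category` (or something like `can_decrement`) first.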
Hmm, it is possible that I could use SFINAE to disable the decrement operator in the code point iterator, but I think that would affect function APIs that accept a generic unicode_string_adapter template and expect the same behavior from every instantiation. It would also require developers to look up the documentation manually when using a string adapter with custom encoding traits.

Of course, having just one conditionally enabled method doesn't hurt that much, but I'd be wary of letting in too many conditional variations across different instantiations of unicode_string_adapter.

Anyway, I wonder whether there is any use case where a developer stores content as large as 1 MB in a single std::string object, and I doubt any operation on such a string would be efficient. Since Boost.Ustr is a string adapter, it can make the same reasonable assumptions that general-purpose string classes do, and I don't think any general-purpose string class is expected to scale to that size.

cheers,
Soares