
Erik Wien wrote:
Right now I have a single encoded_string class that has two template parameters, namely encoding and encoding_traits. encoding_traits is a class where all encoding-specific implementation is kept, and this class is used to set up the encoded_string class to correctly represent strings in the given encoding.
Yes, that's close to what I thought. Do not repeat the basic_string mistake of making encoding_traits a template parameter. A traits class is never used in this way; what you describe as encoding_traits is actually a policy. A traits class is independent of the components that use it. It is basically a mapping from a type to something; in your case, a mapping from the encoding parameter to the operations. So, with encoding_traits no longer a parameter, you essentially have string<utf8>.
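To make the distinction concrete, here is a rough sketch of what I mean. The tag types utf8 and utf16, the member encoded_length, and units_for are names I made up for illustration; they are not taken from your code. The point is only that encoded_string takes one parameter and looks up encoding_traits<Encoding> by itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical encoding tags; the names are illustrative only.
struct utf8 {};
struct utf16 {};

// Traits: a mapping from the encoding tag to its operations.
// The primary template is left undefined, so an encoding without
// a specialization fails to compile.
template <class Encoding> struct encoding_traits;

template <> struct encoding_traits<utf8> {
    typedef char code_unit;
    // Code units needed to encode cp (assumes cp <= 0x10FFFF).
    static std::size_t encoded_length(char32_t cp) {
        return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    }
};

template <> struct encoding_traits<utf16> {
    typedef char16_t code_unit;
    // Code units needed to encode cp (assumes cp <= 0x10FFFF).
    static std::size_t encoded_length(char32_t cp) {
        return cp < 0x10000 ? 1 : 2;
    }
};

// One template parameter: the string finds its traits on its own,
// so the user writes encoded_string<utf8> and nothing more.
template <class Encoding>
class encoded_string {
    typedef encoding_traits<Encoding> traits;
public:
    typedef typename traits::code_unit code_unit;

    // Code units required to store cp in this string's encoding.
    static std::size_t units_for(char32_t cp) {
        return traits::encoded_length(cp);
    }
private:
    std::basic_string<code_unit> units_;
};
```

With this arrangement, supporting a new encoding means adding a tag and one specialization; the string template and its users are not touched.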
The iterators used are bidirectional, not random access (impossible on UTF-8 and UTF-16), and as of now they are not constant. It IS possible to assign a code point to a UTF-8 encoded string through an iterator, even if the resulting code unit sequence would be longer than the one the iterator is pointing to. The underlying container is automatically resized to make room for the new sequence. (This is of course slow!)
This is another basic_string mistake that effectively rules out efficient reference counting. ;-) Just make the iterators constant. The functionality can be obtained with explicit erase/insert/replace members.
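Something along these lines, as a minimal sketch only. The class name utf8_string and the members replace, offset, length and bytes are my own invention, and the code assumes the stored bytes are always valid UTF-8. Reference counting itself is not shown; the point is that all mutation goes through a member function, which is the one place where a shared buffer would have to be unshared.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical UTF-8 string; assumes its bytes are always valid UTF-8.
class utf8_string {
public:
    explicit utf8_string(const std::string& bytes) : bytes_(bytes) {}

    // Read-only bidirectional iterator over code points.
    class const_iterator {
    public:
        const_iterator(const std::string* s, std::size_t pos) : s_(s), pos_(pos) {}

        // Decode the code point starting at the current offset.
        char32_t operator*() const {
            std::size_t n = length();
            unsigned char lead = byte(pos_);
            char32_t cp = n == 1 ? lead : lead & (0xFF >> (n + 1));
            for (std::size_t i = 1; i < n; ++i)
                cp = (cp << 6) | (byte(pos_ + i) & 0x3F);
            return cp;
        }
        const_iterator& operator++() { pos_ += length(); return *this; }
        // Step back over continuation bytes (10xxxxxx) to the lead byte.
        const_iterator& operator--() {
            do { --pos_; } while ((byte(pos_) & 0xC0) == 0x80);
            return *this;
        }
        bool operator==(const const_iterator& o) const { return pos_ == o.pos_; }
        bool operator!=(const const_iterator& o) const { return pos_ != o.pos_; }

        std::size_t offset() const { return pos_; }
        // Length in code units of the sequence at the current offset.
        std::size_t length() const {
            unsigned char b = byte(pos_);
            return b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;
        }
    private:
        unsigned char byte(std::size_t i) const {
            return static_cast<unsigned char>((*s_)[i]);
        }
        const std::string* s_;
        std::size_t pos_;
    };

    const_iterator begin() const { return const_iterator(&bytes_, 0); }
    const_iterator end() const { return const_iterator(&bytes_, bytes_.size()); }
    const std::string& bytes() const { return bytes_; }

    // Explicit mutation: replace the code point at `it` with cp, growing or
    // shrinking the buffer as needed. Iterators past `it` become stale.
    const_iterator replace(const_iterator it, char32_t cp) {
        bytes_.replace(it.offset(), it.length(), encode(cp));
        return const_iterator(&bytes_, it.offset());
    }

private:
    // Encode one code point as UTF-8 (assumes cp <= 0x10FFFF).
    static std::string encode(char32_t cp) {
        std::string out;
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        return out;
    }

    std::string bytes_;
};
```

The caller writes it = s.replace(it, cp) instead of *it = cp. The cost of the resize is then visible at the call site, and dereferencing an iterator never has to hand out a writable reference into the buffer.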