
On Sep 27, 2007 5:31 PM, Joseph Gauterin <joseph.gauterin@googlemail.com> wrote:
Variable-width-encoded strings should be fairly straightforward when they are immutable, but will probably get hairy when they can be modified.
True. I think the strings should be immutable. I think experience with Java and C# compared to C++ shows that an immutable string class is superior in most use cases.
I also agree that immutable strings are the way forward for VWEs.
I believe this issue deserves some more thought. In Java (I have no experience with C#), memory allocation is really fast because of the way the garbage collector is implemented. Creating a new string for each change is quite cheap: the only real cost is copying the data, and even then there are alternatives (StringBuffer/StringBuilder) that offer a modifiable version of a string.
The JVM does other things that help speed this up, but with some associated _cost_. Strings can be 'interned' by the JVM, which keeps just one copy of each distinct string in memory and shares references to it. This is a side effect of immutability: there is no problem sharing the same copy, as it cannot change.
Problems arise if you offer access to the raw data, which is possible in Java: suddenly immutable strings can be changed, and System.out.println( "Say Hi!" ) can print 'Goodbye' if some other code anywhere else changes the raw data. This cannot be avoided in Java, as final is a property of the reference, not of the associated data. On the other hand, I believe it can be implemented correctly in C++.
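As a rough sketch of what that could look like in C++: the class name immutable_string and all of its members are made up for illustration, and std::shared_ptr merely stands in for whatever sharing mechanism a real library would choose. The point is only that the buffer is reachable solely through const access, so copies can share it safely.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical immutable string: name and interface are illustrative only,
// not taken from any existing or proposed library.
class immutable_string {
public:
    explicit immutable_string(std::string bytes)
        : data_(std::make_shared<std::string>(std::move(bytes))) {}

    // Read-only access: callers only ever see a const reference.
    const std::string& bytes() const { return *data_; }
    std::size_t size_in_bytes() const { return data_->size(); }

    // "Modification" builds a new string and leaves *this untouched.
    immutable_string append(const immutable_string& other) const {
        return immutable_string(*data_ + *other.data_);
    }

    // True when both objects refer to the same underlying buffer.
    bool shares_storage_with(const immutable_string& other) const {
        return data_ == other.data_;
    }

private:
    // Pointer-to-const: the shared buffer cannot be written through it.
    std::shared_ptr<const std::string> data_;
};
```

Copying such an object only copies the pointer, which is the same sharing that interning relies on. A const_cast could still defeat it, so this is a convention enforced by the type system rather than an absolute guarantee.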
If we had mutable strings, consider how badly the following would perform: std::replace(utfString.begin(), utfString.end(), SingleByteChar, MultiByteChar);
Although this looks O(n) at first glance, it's actually O(n^2), as the container has to expand itself for every replacement. I don't think a library should make writing worst-case code that easy.
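To make the quadratic cost concrete, here is a minimal sketch of an in-place replacement on a UTF-8 byte buffer. It assumes the bytes live in a std::string, and replace_in_place is a hypothetical helper, not std::replace itself or any library's API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: replaces every occurrence of the single byte `from`
// with the multi-byte UTF-8 sequence `to`, editing the buffer in place.
// Each std::string::replace call has to shift the whole tail to make room,
// so n matches in a string of length n cost on the order of n^2 byte moves.
void replace_in_place(std::string& utf8, char from, const std::string& to) {
    // Bytes below 0x80 only ever appear as complete characters in UTF-8,
    // so a byte match is a character match.
    assert(static_cast<unsigned char>(from) < 0x80);

    std::size_t pos = 0;
    while ((pos = utf8.find(from, pos)) != std::string::npos) {
        utf8.replace(pos, 1, to);  // linear in the length of the tail
        pos += to.size();          // continue after the inserted sequence
    }
}
```

The loop body looks constant-time, which is exactly why the cost is easy to overlook.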
I don't know whether this problem has a solution, but the library could provide an alternative replace that runs in linear time by constructing a new string, copying and replacing values in the same iteration. Could std::replace() be disabled somehow? (SFINAE?)
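A minimal sketch of such a linear-time replace, again assuming UTF-8 bytes held in a std::string. The function name replace_copy_utf8 is hypothetical, and the extra counting pass is just one way to size the result up front.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical free function: returns a copy of `utf8` in which every
// occurrence of the ASCII byte `from` is replaced by the UTF-8 sequence
// `to`. The input string is never modified.
std::string replace_copy_utf8(const std::string& utf8, char from,
                              const std::string& to) {
    assert(static_cast<unsigned char>(from) < 0x80);  // ASCII byte only

    // First pass: count matches so the result is allocated exactly once.
    std::size_t matches = 0;
    for (char c : utf8) {
        if (c == from) ++matches;
    }

    std::string result;
    result.reserve(utf8.size() - matches + matches * to.size());

    // Second pass: copy and replace in the same iteration.
    for (char c : utf8) {
        if (c == from) {
            result += to;
        } else {
            result += c;
        }
    }
    return result;
}
```

Both passes touch each input byte once, so the total work is linear in the input size plus the output size. As for disabling std::replace(): it assigns through the iterator it is given, so a string type that exposes only const iterators would already reject the call at compile time, probably without needing SFINAE.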