
Phil Endecott wrote:
No, because:
The immutability of the string adapter is actually achieved by holding a smart pointer to the const version of the raw string.
If you were just wrapping an existing string class, you wouldn't do that; you'd just wrap the existing string class. By adding this extra bit, you're making a string that is immutable, copy-on-write and reference counted, whether or not the underlying string is.
I won't argue with you over the definition of a new string class; as long as you understand the design goal, that's fine. To me, the unicode_string_adapter class is more like a glorified smart pointer. You can think of the class as having the following signature, with identical functionality: template <typename StringT> class unicode_string_adapter : public std::shared_ptr<const StringT>; I'm just using composition over inheritance because it organizes the code better. Also, the class does not do exactly the same copy-on-write that std::string used to do. It *always* copies when the edit() method is called, regardless of whether the reference count is one or many. So there is no nasty overhead from making sure the reference count is exactly one during mutation.
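To make the "glorified smart pointer" comparison concrete, here is a minimal sketch of the idea, not the actual Boost.Ustr code. The name adapter_sketch is made up for illustration, and edit() here simply returns a mutable copy of the raw string, which is a simplifying assumption about what the real edit() hands back.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for unicode_string_adapter (illustration only).
template <typename StringT>
class adapter_sketch {
public:
    explicit adapter_sketch(StringT raw)
        : ptr_(std::make_shared<StringT>(std::move(raw))) {}

    // Read-only access to the shared raw string.
    const StringT& to_string() const { return *ptr_; }

    // Always returns a fresh mutable copy, whatever the reference count is.
    // No uniqueness check is needed, unlike classic copy-on-write.
    StringT edit() const { return *ptr_; }

    long use_count() const { return ptr_.use_count(); }

private:
    // Composition instead of inheriting from std::shared_ptr<const StringT>.
    std::shared_ptr<const StringT> ptr_;
};

inline void demo_edit_always_copies() {
    adapter_sketch<std::string> a("hello");
    adapter_sketch<std::string> b = a;  // copies the pointer, not the string
    std::string draft = a.edit();       // independent copy of the contents
    draft += " world";

    assert(a.to_string() == "hello");   // original is untouched
    assert(b.to_string() == "hello");
    assert(draft == "hello world");
    assert(a.use_count() == 2);         // a and b still share one buffer
}
```

Copying the adapter only bumps the reference count, while every edit pays for one string copy up front. That trades the occasional unnecessary copy for not having to track uniqueness at mutation time.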
Whatever. The point is that you have this operator* and operator-> overload whose purpose is non-obvious to someone looking at code that uses it. What is your rationale for doing that, rather than providing e.g. an impl() or base() or similar accessor? Can you give examples of any precedents for this usage? What names or syntax do other wrapper/adaptor/facade implementations use?
I'd say the purpose of operator *() is pretty obvious: to retain backward compatibility with the original raw string class. One of the biggest obstacles to creating a new string class is that it breaks compatibility with legacy library APIs that accept std::string as a function parameter. My goal is to make it as easy as possible for users of Boost.Ustr to get back their original raw string whenever they need it, so that migration is less painful. Ultimately, a developer should be able to use `unicode_string_adapter<std::string>` with only his existing knowledge of std::string. The developer does not need to learn Boost.Ustr at all if he does not care about the encoding and content of the string. All he has to do to migrate his code to Boost.Ustr is replace every str.string_method() with str->string_method(), and every existing_function(str) with existing_function(*str). The syntax therefore makes migration possible with minimal changes. There is already a member function that does the actual work, which is str.to_string(), so removing operator *() would just be a matter of deleting three lines of code anyway. But if you look at unicode_string_adapter itself as a smart pointer to the raw string, then operator *() makes more sense.
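As a rough sketch of what that migration looks like in practice, assume a class shaped like the one below. migrating_adapter and legacy_length are hypothetical names for illustration only, and the real Boost.Ustr implementation may differ in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical legacy API that only knows about std::string.
inline std::size_t legacy_length(const std::string& s) { return s.size(); }

// Hypothetical stand-in for unicode_string_adapter (illustration only).
template <typename StringT>
class migrating_adapter {
public:
    explicit migrating_adapter(StringT raw)
        : ptr_(std::make_shared<StringT>(std::move(raw))) {}

    // The named accessor does the actual work.
    const StringT& to_string() const { return *ptr_; }

    // Thin smart-pointer-style forwarding on top of it.
    const StringT& operator*() const { return to_string(); }
    const StringT* operator->() const { return ptr_.get(); }

private:
    std::shared_ptr<const StringT> ptr_;
};

inline void migration_demo() {
    migrating_adapter<std::string> str("abc");

    // Before: str.size()          After: str->size()
    assert(str->size() == 3);

    // Before: legacy_length(str)  After: legacy_length(*str)
    assert(legacy_length(*str) == 3);

    // operator* and to_string() refer to the same underlying object.
    assert(&*str == &str.to_string());
}
```

Both operators only give const access, so code migrated this way can read the raw string but cannot mutate it behind the adapter's back.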
Well I don't really care who does it, but I think we should have these UTF encoding and decoding functions somewhere in Boost that is not an implementation detail of some other library.
I agree with you that Boost needs a complete Unicode toolset. But since that is outside my project's scope, I'll leave it to others to answer this question.
OK, it's not for me, that's a shame. Maybe if you're lucky someone who DOES want this functionality will now post a reply to your request for comments...
Yup. Basically, any raw string processing algorithm that cannot work well enough with the standard std::string implementation should not be expected to work with Boost.Ustr either. Actually, I don't think there is any general-purpose string class that can do the job you want either.