
On Wed, Jan 26, 2011 at 10:37 AM, Yakov Galka <ybungalobill@gmail.com> wrote:
Excuse my ignorance, but can someone explain to me why people are so keen on immutable strings? Aren't they basically the same as 'shared_ptr<const std::string>'?
I'm fairly neutral on the immutability issue. I would not oppose it if someone showed why it is a superior design, provided it does not break everything horribly from the backward-compatibility perspective.
I follow these discussions, and I must admit that I already use std::string in my projects with UTF-8 encoding assumed by default. What matters to me is the lack of a "standard" way to manipulate those strings, i.e.:
1) Convert them to and from other APIs' encodings: SetWindowTextW(to_utf16(my_string));
2) Iterate through the code points, characters, words, etc., like this: for(char32_t cp : codepoints(my_string)) ...;
+1
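To make the two operations above concrete, here is a minimal sketch of what such helpers could look like. The names codepoints() and to_utf16() are taken from the example above, but the signatures and behaviour are assumptions for illustration, not an existing Boost or standard API. The decoder is deliberately simplified: it replaces malformed sequences with U+FFFD but does not reject overlong forms or encoded surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decode a UTF-8 std::string into code points.
// Malformed or truncated sequences are replaced by U+FFFD.
std::vector<char32_t> codepoints(const std::string& s) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        char32_t cp = 0;
        std::size_t len = 0;
        if (b < 0x80)                { cp = b;        len = 1; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; len = 2; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; len = 3; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; len = 4; }
        else { out.push_back(0xFFFD); ++i; continue; }  // stray byte

        if (i + len > s.size()) { out.push_back(0xFFFD); break; }  // truncated

        bool ok = true;
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; }
        if (cp > 0x10FFFF) cp = 0xFFFD;  // outside the Unicode range
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Hypothetical helper: re-encode a UTF-8 string as UTF-16.
std::u16string to_utf16(const std::string& s) {
    std::u16string out;
    for (char32_t cp : codepoints(s)) {
        assert(cp <= 0x10FFFF);  // guaranteed by codepoints()
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With helpers like these, the loop from point 2 works as written. Passing the result to a Windows API such as SetWindowTextW would additionally need a wchar_t view of the buffer, which this sketch leaves out.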
The original proposal (in the other thread) was to use the type of the string to ensure at compile time that the above code is valid. I understand that it is needed in the current world where not everybody uses utf8. It's fine for me. But why
On Fri, Jan 21, 2011 at 13:25, Matus Chochlik <chochlik@gmail.com> wrote:
create a class called boost::string that will have all the properties that a string handling class in 2011+ A.D. should have, basically what std::string should have been.
The original proposal was to keep the existing string but to switch to UTF-8 as the default encoding. This is still my long-term goal. The whole discussion changed my opinion on how to get there. I personally would not have any problem with doing the instant switch ... but many other people would, and with good reasons.
? What are those properties? Isn't std::string what it should have been? Do you mean that you want to put in there every possible algorithm you can imagine?
What I was talking about is basically adding some more convenience member functions, many of which are currently implemented by the string_algo Boost library, to the string's interface and, more importantly, extending the string's interface with 'Unicode functionality', i.e. the ability to traverse the string not just as a sequence of bytes but as a sequence of Unicode code points and, if possible, even "logical characters".
IMO std::string is just a container of bytes with two useful convenience methods (c_str() and substr()) and a UTF-8 encoding that should have been assumed by default but unfortunately isn't. Everything else should be generic algorithms that work with sequences of characters in some encoding. So, maybe it's better to focus on designing something like boost::iterator_range with an encoding associated with it and algorithms that work with these ranges?
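As a rough illustration of the "range with an encoding associated with it" idea, the sketch below tags an iterator pair with an encoding type and lets algorithms overload on that tag. Everything here (encoded_range, utf8_tag, latin1_tag, as_encoded, count_codepoints) is a hypothetical name invented for this example; it is not part of Boost.Range or any proposal in this thread, and the UTF-8 overload assumes well-formed input.

```cpp
#include <cassert>   // assert, handy for quick checks of the sketch
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical encoding tags.
struct utf8_tag {};
struct latin1_tag {};

// Hypothetical range type: a pair of iterators plus a compile-time encoding.
template <typename Encoding, typename Iterator>
class encoded_range {
public:
    encoded_range(Iterator first, Iterator last) : first_(first), last_(last) {}
    Iterator begin() const { return first_; }
    Iterator end() const { return last_; }
private:
    Iterator first_;
    Iterator last_;
};

// View an existing std::string as bytes in the given encoding.
// The string must outlive the returned range.
template <typename Encoding>
encoded_range<Encoding, std::string::const_iterator>
as_encoded(const std::string& s) {
    return encoded_range<Encoding, std::string::const_iterator>(s.begin(), s.end());
}

// UTF-8: count every byte that is not a continuation byte (10xxxxxx).
template <typename It>
std::size_t count_codepoints(const encoded_range<utf8_tag, It>& r) {
    std::size_t n = 0;
    for (It it = r.begin(); it != r.end(); ++it)
        if ((static_cast<unsigned char>(*it) & 0xC0) != 0x80) ++n;
    return n;
}

// Latin-1: one byte per code point.
template <typename It>
std::size_t count_codepoints(const encoded_range<latin1_tag, It>& r) {
    return static_cast<std::size_t>(std::distance(r.begin(), r.end()));
}
```

The same three bytes "a\xC3\xA9" count as two code points when viewed through as_encoded<utf8_tag> and as three through as_encoded<latin1_tag>, so the encoding mismatch is visible in the type rather than hidden in a convention.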
If that is to succeed, it has to be backward-compatible with the existing APIs, however borked they seem to us (me included). There are lots of string implementations that are *cool* but unusable by anything except algorithms specifically designed for them.

Matus