
On Wed, Jan 26, 2011 at 11:54, Matus Chochlik <chochlik@gmail.com> wrote:
On Wed, Jan 26, 2011 at 10:37 AM, Yakov Galka <ybungalobill@gmail.com> wrote:
Excuse my ignorance, but can someone explain to me why people are so keen on immutable strings? Aren't they basically the same as 'shared_ptr<const std::string>'?
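To make the comparison concrete, here is a minimal sketch of what "immutable string as shared_ptr<const std::string>" could look like. The names immutable_string, make_immutable and append are hypothetical and used only for illustration; they are not taken from any proposal in this thread.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical alias for illustration; not a standard or Boost type.
using immutable_string = std::shared_ptr<const std::string>;

// Build the value once; later copies only bump a reference count.
inline immutable_string make_immutable(std::string s) {
    return std::make_shared<const std::string>(std::move(s));
}

// "Modifying" means building a new value; the original is left untouched.
inline immutable_string append(const immutable_string& s, const std::string& tail) {
    assert(s && "append() expects a non-null string");
    return make_immutable(*s + tail);
}
```

With this alias, copying shares the buffer, and a call such as s->resize(91) does not compile because the pointee is const.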
I'm fairly neutral on the immutability issue, I do not oppose it if someone shows why it is a superior design, provided it does not break everything horribly (from the backward compatibility perspective).
Me too, but it definitely will break existing code: string.resize(91); [...]
What are those properties? Isn't std::string already what it should have been?
Do you mean that you want to put in there any possible algorithm you can imagine?
What I was talking about is basically adding some more convenience member functions, many of which are currently implemented by the string_algo Boost library, to the string's interface and, more importantly, extending the string's interface with 'Unicode-functionality', i.e. the ability to traverse the string not just as a sequence of bytes but as a sequence of Unicode code-points and, if possible, even "logical characters".
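The difference between bytes and code points can be shown with a small sketch. It assumes the std::string holds well-formed UTF-8, which std::string itself does not guarantee; count_code_points is a hypothetical helper, not an existing library function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts Unicode code points in a string assumed to hold well-formed UTF-8.
// Continuation bytes look like 10xxxxxx; every other byte starts a code point.
inline std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) ++n;
    }
    // A code point never occupies less than one byte.
    assert(n <= utf8.size());
    return n;
}
```

Note that code points are still not "logical characters": an 'e' followed by a combining acute accent (U+0301) is one character to the reader but two code points.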
IMO std::string is just a container of bytes with two useful convenience methods (c_str() and substr()) and a UTF-8 encoding that should have been assumed by default but unfortunately isn't. Everything else should be generic algorithms that work with sequences of characters in some encoding. So, maybe it's better to focus on designing something like boost::iterator_range with an encoding associated with it and algorithms that work with these ranges?
If that is to succeed it has to be (backward) compatible with the existing APIs, however borked they seem to us (me included). There are lots of string implementations that are *cool* but unusable by anything except algorithms specifically designed for them.
I can't exactly understand what has to be backward compatible with what... Can you please provide a few code snippets that mustn't break so I could think about that?
My point is that 'Unicode-functionality' should be separate from the string implementation. This code: for (char32_t cp : codepoints(my_string)); should work with any type of my_string whose encoding is known. I'm not against adding convenience functions to the string. It makes the code more readable when you chain operations. However, it violates this: http://www.drdobbs.com/184401197
-- Yakov
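As a rough illustration of the codepoints(my_string) idea, here is a minimal sketch of a non-member adaptor. It assumes the encoding is UTF-8 and the input is well-formed (no validation is done). The names codepoint_iterator, codepoint_range and to_utf32 are hypothetical, and this codepoints() is only one possible shape for the function mentioned above.

```cpp
#include <cassert>
#include <iterator>
#include <string>

// Hypothetical sketch: lazily decodes UTF-8 from any iterator pair over bytes.
template <class It>
class codepoint_iterator {
public:
    codepoint_iterator(It pos, It end) : pos_(pos), end_(end) {}

    char32_t operator*() const {
        assert(pos_ != end_);
        It p = pos_;
        unsigned char lead = static_cast<unsigned char>(*p);
        int extra = trailing(lead);
        // Keep only the payload bits of the lead byte.
        char32_t cp = extra == 0 ? lead : (lead & (0x3F >> extra));
        for (int i = 0; i < extra && ++p != end_; ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(*p) & 0x3F);
        return cp;
    }
    codepoint_iterator& operator++() {
        int len = 1 + trailing(static_cast<unsigned char>(*pos_));
        while (len-- > 0 && pos_ != end_) ++pos_;
        return *this;
    }
    bool operator!=(const codepoint_iterator& o) const { return pos_ != o.pos_; }

private:
    // Number of continuation bytes announced by a lead byte.
    static int trailing(unsigned char lead) {
        if (lead >= 0xF0) return 3;
        if (lead >= 0xE0) return 2;
        if (lead >= 0xC0) return 1;
        return 0;
    }
    It pos_, end_;
};

template <class It>
struct codepoint_range {
    codepoint_iterator<It> first, last;
    codepoint_iterator<It> begin() const { return first; }
    codepoint_iterator<It> end() const { return last; }
};

// Works with any type that std::begin/std::end accept; the string must outlive the range.
template <class String>
auto codepoints(const String& s) -> codepoint_range<decltype(std::begin(s))> {
    return { {std::begin(s), std::end(s)}, {std::end(s), std::end(s)} };
}

// Convenience: collect the decoded code points into a UTF-32 string.
template <class String>
std::u32string to_utf32(const String& s) {
    std::u32string out;
    for (char32_t cp : codepoints(s)) out.push_back(cp);
    return out;
}
```

Because codepoints() is a free function template, the string type needs no new members; a real design would also have to carry the encoding as part of the type or as a tag rather than hard-coding UTF-8.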