
On Fri, Jan 28, 2011 at 2:04 AM, David Bergman <David.Bergman@bergmangupta.com> wrote:
On Jan 27, 2011, at 10:04 AM, Dean Michael Berris wrote:
On Thu, Jan 27, 2011 at 10:55 PM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
[snip]
That's short, but not descriptive. The "i" prefix is more suggestive of "interface" than "immutable" to me. Why not just go whole hog and call it "immutable_string" as Artyom suggested?
The only objection really is that it's too long. :D Fewer characters is better.
/me gets a thesaurus and looks up string :D
Ok, but why this focus on immutability? Is that not a concern quite orthogonal to the encoding problems discussed here (as well...)?
There are two reasons for the focus on immutability. First, it deals with the underlying storage: the storage has to be "fool-proof" to avoid the problems of a mutable data structure. Only when you're certain that the string will not change at any point after it is constructed can you throw a lot of the potential optimizations at the algorithms and lazy transformations that make certain operations much more efficient.

Second, encoding is largely a matter of interpretation rather than of actual transformation. What I mean by this is that an encoding is supposed to be a logical transformation rather than an actual physical transformation of the data (although it is almost always manifested as such) -- and it doesn't have to be an immediately applied algorithm either. Without the immutability guarantee from the underlying data type, you can't make the "clever" re-arrangements in the algorithm implementations that assume immutability -- things like skipping defensive copies and caches because the data never changes, making copying of data largely unnecessary, and so on.
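To make that concrete, here is a rough sketch of the kind of thing an immutable backing store allows (this is not Boost code and the class name and members are made up): copies just share the buffer, and derived values such as a hash can be cached once and never invalidated, precisely because the bytes are guaranteed never to change.

    #include <cstddef>
    #include <functional>
    #include <memory>
    #include <string>

    // Rough sketch only -- not actual Boost code.  Single-threaded for
    // brevity; the cached hash would need synchronisation in real code.
    class immutable_string {
        std::shared_ptr<const std::string> data_;  // shared, never mutated
        mutable std::size_t hash_ = 0;             // lazily cached
        mutable bool hash_valid_ = false;
    public:
        explicit immutable_string(std::string s)
            : data_(std::make_shared<const std::string>(std::move(s))) {}

        // Copying shares the buffer: no allocation, no character copies.
        immutable_string(immutable_string const&) = default;

        std::size_t size() const { return data_->size(); }
        char operator[](std::size_t i) const { return (*data_)[i]; }

        // Safe to cache because the underlying bytes can never change.
        std::size_t hash() const {
            if (!hash_valid_) {
                hash_ = std::hash<std::string>()(*data_);
                hash_valid_ = true;
            }
            return hash_;
        }
    };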
I would prefer to have this discussion be about the encoding aspect(s) rather than immutability, unless the latter somehow intrinsically enables much better handling of the various encodings (preferably at the interface level), and I seriously doubt that.
Sure, and there are already algorithms that implement encodings in terms of ranges; they have been there all along. What's being talked about here is whether a string should have the encoding as an intrinsic property -- and I maintain that the answer to that question is no (at least from my POV).
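For illustration only (the tag and view names below are invented, not taken from any library): the encoding can live in a thin view over the unchanging bytes rather than in the string itself, so "re-encoding" is just re-interpretation and nothing is physically converted until an algorithm actually walks the view.

    #include <string>

    // Illustrative sketch only; the encoding tags and view type are made up.
    struct utf8 {};
    struct latin1 {};

    template <class Encoding>
    class encoded_view {
        const std::string* bytes_;  // refers to immutable data, copies nothing
    public:
        explicit encoded_view(const std::string& bytes) : bytes_(&bytes) {}

        const std::string& raw() const { return *bytes_; }
        // Decoding iterators that produce code points according to
        // Encoding would go here; the underlying bytes are never touched.
    };

    // The same bytes can be viewed under different encodings.
    template <class Encoding>
    encoded_view<Encoding> view_as(const std::string& bytes) {
        return encoded_view<Encoding>(bytes);
    }

So view_as<utf8>(s) and view_as<latin1>(s) are just two interpretations of the same storage, which is exactly the sense in which encoding is a logical rather than a physical transformation.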
So, if we keep this discussion to that of a mutable sequence of characters, interpreted according to some encoding(s), I would be less grumpy.
So what's wrong with using ICU and the work that others have already done on encoding? Am I the only person seeing the problem with strings being mutable? (I honestly really want to know.)

--
Dean Michael Berris
about.me/deanberris