Re: [boost] [string] proposal

27 Jan 2011


On 01/26/2011 11:34 PM, Dean Michael Berris wrote:
...
On Thu, Jan 27, 2011 at 3:19 PM, Patrick Horgan <phorgan1@gmail.com> wrote:
...
On 01/26/2011 07:54 PM, Dean Michael Berris wrote:
...
... elision by patrick ...
Yes, but really I think the view<encoding> is the encoding-aware
string type mostly because if you convert it to an std::string for
example or into a buffer and look at it like a `char const *` or even
`wchar_t const *` then you basically get what you'd need for the C or
OS APIs.
I just prefer calling a spade a spade and not say `string` when I
really mean a `view<encoding>` -- because largely I think everyone
would agree that the string data structure really doesn't have an
intrinsic property that relates to an 'encoding'.
But what some are talking about is a utf8_string.  I know it's not what
you're talking about, but saying that everyone would agree would be a bit
disingenuous and would discount much of the preceding discussion.
So you're saying, utf8_string is not view<utf8_encoding> as far as
I've already described it?
Exactly.  Others have expressed repeatedly that they want a string with 
intrinsic encoding.
...
I really wish this discussion would split into two, because the discussion
about the benefits of an immutable string and the discussion of a
UTF-encoded string are two completely different discussions, and you keep
butting heads, each saying, no, but that's not what I'm talking about.
Really, if you read the recent discussions, you will see that we're
really talking about the same thing: a data structure that knows its
encoding somehow. That somehow has already been determined (and agreed
upon) to be suitably modeled by a view<...> that takes a string, for a
suitable definition of string. Note that the string *has
no encoding that is intrinsic to it*.
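
To make the "no intrinsic encoding" point concrete, here is a minimal
C++ sketch. It is only an illustration of the idea as described in this
thread, not the proposed Boost interface: the names immutable_string,
latin1_encoding, utf8_encoding and the length() member are all
hypothetical. The same immutable bytes are interpreted two ways, and the
encoding lives entirely in the view.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an immutable string: shared, read-only bytes.
using immutable_string = std::shared_ptr<const std::string>;

// Hypothetical encoding policies. The string itself carries no encoding;
// each policy decides how to interpret the same bytes.
struct latin1_encoding {
    // In Latin-1 every byte is exactly one character.
    static std::size_t length(const std::string& bytes) {
        return bytes.size();
    }
};

struct utf8_encoding {
    // Count code points by skipping UTF-8 continuation bytes (10xxxxxx).
    // No validation is attempted in this sketch.
    static std::size_t length(const std::string& bytes) {
        std::size_t n = 0;
        for (unsigned char c : bytes) {
            if ((c & 0xC0) != 0x80) ++n;
        }
        return n;
    }
};

// The view pairs the encoding-free string with an interpretation.
template <class Encoding>
class view {
public:
    explicit view(immutable_string s) : data_(std::move(s)) {}

    // Length in characters, as defined by the Encoding policy.
    std::size_t length() const { return Encoding::length(*data_); }

    // Raw bytes, e.g. for handing to C or OS APIs.
    const std::string& bytes() const { return *data_; }

private:
    immutable_string data_;
};
```

With the five bytes "caf\xC3\xA9" held in one immutable_string,
view<utf8_encoding> reports a length of 4 and view<latin1_encoding>
reports 5, while both views refer to the same underlying buffer.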
Yes.  I understand clearly that you have been talking about that.  
Others talked about a string with intrinsic encoding.
...
That's right.  There were several threads, but everyone's jumped onto this
one which I believe was started by Mr. Berris to talk about the benefits of
an immutable string.  Please, please, separate these threads again.
So Mr. Berris is saying right now, if you didn't see the point: your
"utf8_string" is really just a typedef to view<utf8_encoding>. The
only *reasonably efficient* way of achieving this view design is if
you had immutable strings. The thread has already hashed out *why*
mutable strings are a bad thing (performance and design-wise) for
encoding-aware algorithms. I don't see why we need to go back to that
*again*.
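
One possible reading of the efficiency argument, sketched in C++ under
stated assumptions: if the underlying buffer can never change, copying a
view only has to copy a pointer, and every copy can safely hand out the
same bytes. The class below is hypothetical (view, utf8_encoding,
shares_buffer_with and the shared_ptr representation are illustrative
choices, not the design agreed on the list).

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical encoding tag; it carries no behavior in this sketch.
struct utf8_encoding {};

template <class Encoding>
class view {
public:
    // Take ownership of the bytes once and freeze them.
    explicit view(std::string s)
        : data_(std::make_shared<const std::string>(std::move(s))) {}

    // The compiler-generated copy shares the buffer instead of
    // duplicating it. That is safe only because the buffer is const:
    // no copy can observe a change made through another copy.

    // Access for C or OS APIs that expect a char const *.
    const char* c_str() const { return data_->c_str(); }

    // Explicit conversion to a mutable std::string makes the one deep copy.
    std::string str() const { return *data_; }

    // True if both views refer to the same underlying buffer.
    bool shares_buffer_with(const view& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<const std::string> data_;
};

// The typedef described above.
typedef view<utf8_encoding> utf8_string;
```

Copying a utf8_string built from "hello" yields a second view for which
shares_buffer_with() returns true and c_str() returns the same pointer.
With a mutable buffer, each copy would need its own storage or some
copy-on-write bookkeeping to stay consistent.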
Me neither.  Of course the other discussion about strings with intrinsic 
encoding should be in another thread.
At any rate, feel free to convince me that immutable strings
wouldn't be a good thing for
encoding/transcoding/string-or-text-centric algorithms. ;)
Why on earth would I do that?  They would be wonderful for many 
applications and, as you said and I agreed days ago, why would you 
want to pay an extra price for a mutable string when you don't need 
one?  Of course, when you do need one you'd just use it.
I'd just like to see your thread as you began it, a discussion about the 
benefits of an immutable string.  I particularly didn't like that it got 
hijacked to focus on how appropriate it would be for a string that 
represented a particular encoding.  Of course that's something to think 
about but there's a lot more to the benefits of an immutable string than 
that, and you started off doing a good job of discussing it before you 
got distracted.  I just want to see the discussions split again so that 
all aspects of immutability vs. mutability can be discussed in this 
thread.  It seems now that you are only interested in 
discussing encodings and views.  I wanted the discussion of immutability.

Patrick