
On Jan 23, 2011, at 9:34 PM, Dean Michael Berris wrote:
On Sat, Jan 22, 2011 at 10:43 AM, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
On Sat, 22 Jan 2011 01:56:36 +0800 Dean Michael Berris <mikhailberis@gmail.com> wrote:
I think strings are distinct from the encoding they're interpreted in. Let's fix the problem of a string data structure first, then tack on encoding/decoding as something that depends on the string abstraction.
That gets back to the problem that I was originally trying to solve with the UTF types: that a string needs a way to carry around its encoding. A UTF-8 type could be built on such a thing very easily.
Hmm... I OTOH don't think the encoding should be part of the string. The encoding is really external to the string, more like a function that is applied to the string.
It's a property of the string. It may change, but some encoding (even if it's just "none") should be associated with a particular string throughout its existence. Otherwise you might as well use the existing std::string.
I think I disagree with this. A string is by definition a sequence of something -- a string of integers, a string of events, a string of characters. Encoding is not an intrinsic property of a string.
Ok... it feels like you are changing the rules as we play, instead of admitting "defeat" ;-) Or were you indeed talking about *generic sequences* this whole time? If so, why the focus on encoding strategies for characters?

/David
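
For readers following the thread, the two positions argued above can be sketched in C++. This is only an illustration, not code from any proposal in this discussion: the names `count_code_points_utf8`, `count_code_points_latin1`, `encoding`, and `tagged_string` are hypothetical, and the UTF-8 counter assumes its input is already well-formed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Position A ("encoding is external"): the string is only a sequence of
// bytes, and the caller picks an interpretation per operation.
std::size_t count_code_points_utf8(const std::string& bytes) {
    std::size_t n = 0;
    for (unsigned char c : bytes) {
        // Continuation bytes look like 10xxxxxx; every other byte starts
        // a new code point. Assumes well-formed UTF-8.
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

std::size_t count_code_points_latin1(const std::string& bytes) {
    return bytes.size();  // one byte per character
}

// Position B ("encoding is a property"): the string carries a tag for
// its whole lifetime. The tag may change, but one is always present.
enum class encoding { none, latin1, utf8 };

class tagged_string {
public:
    tagged_string(std::string bytes, encoding enc)
        : bytes_(std::move(bytes)), enc_(enc) {}

    const std::string& bytes() const { return bytes_; }
    encoding enc() const { return enc_; }

    // Change the interpretation without touching the bytes.
    void reinterpret_as(encoding enc) { enc_ = enc; }

    // Operations consult the tag instead of asking the caller.
    std::size_t code_points() const {
        return enc_ == encoding::utf8 ? count_code_points_utf8(bytes_)
                                      : bytes_.size();
    }

private:
    std::string bytes_;
    encoding enc_;
};

void example_usage() {
    const std::string raw = "\xC3\xA9";  // U+00E9 encoded as two UTF-8 bytes

    // Position A: same bytes, two answers, depending on the function applied.
    assert(count_code_points_utf8(raw) == 1);
    assert(count_code_points_latin1(raw) == 2);

    // Position B: the answer follows the tag stored with the string.
    tagged_string s(raw, encoding::utf8);
    assert(s.code_points() == 1);
    s.reinterpret_as(encoding::latin1);
    assert(s.code_points() == 2);
}
```

In both versions the underlying bytes are identical; the disagreement is only about whether the choice of interpretation lives at the call site or travels with the object.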