
Phil Endecott wrote:
> Things I'd appreciate feedback on:
> - What should the cs_string look like? Basically everywhere that std::string uses an integer position I have the choice of a character position, a unit position, or an iterator - or not providing that function.

I think emulating std::string doesn't work. It has a naive design based on the assumption of fixed-width encodings. I think that a tagged string is the best place to really start over with a string design and produce a string that is lean, rather than bloated. I think the string type should offer minimal manipulation facilities - either completely read-only or append as the only manipulation function. A string buffer type could be written as a mutable alternative, as is the design in Java and C#. However, I'm not sure how much of that interface is needed, either. I'd love to have some empirical data on string usage.
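
To make the "lean, append-only" idea concrete, here is a minimal sketch of what I mean. The names (`tagged_string`, `utf8_tag`, `ascii_tag`, `unit_count`, `character_count`) are made up for illustration and are not taken from Phil's library, and the character count assumes well-formed UTF-8 with no validation:

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical encoding tags; illustrative names only.
struct utf8_tag {};
struct ascii_tag {};

// A string tagged with its encoding. There is no integer-position API at all:
// the only mutation is append, and reading goes through the code units.
template <typename Encoding>
class tagged_string {
public:
    tagged_string() = default;
    explicit tagged_string(std::string_view units) : units_(units) {}

    // The only mutating operation. Strings with a different tag do not
    // compile here, so encodings cannot be mixed by accident.
    tagged_string& append(const tagged_string& other) {
        units_ += other.units_;
        return *this;
    }

    // Length in code units (bytes for UTF-8): cheap and unambiguous.
    std::size_t unit_count() const { return units_.size(); }

    // Read-only view of the underlying code units.
    std::string_view units() const { return units_; }

private:
    std::string units_;
};

// Length in code points for UTF-8: count every byte that is not a
// continuation byte (10xxxxxx). Assumes well-formed input; O(n).
inline std::size_t character_count(const tagged_string<utf8_tag>& s) {
    std::size_t n = 0;
    for (unsigned char c : s.units()) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

With this sketch, the UTF-8 string "naïve" has a unit count of 6 but a character count of 5, which is exactly why a single integer position in the std::string style is ambiguous for variable-width encodings.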

> - What character sets are people interested in using (a) at the "edges" of their programs,

As many as possible. Theoretically, a program might have to deal with any and all encodings out there. Realistically, there's probably a dozen or two that are relevant. You'd need empirical data.

> and (b) in the "core"?

ASCII, UTF-8 and UTF-16.

Sebastian