
On Mon, Feb 25, 2008 at 8:09 AM, Sebastian Redl <sebastian.redl@getdesigned.at> wrote:
Phil Endecott wrote:
Things I'd appreciate feedback on:
- What should the cs_string look like? Basically everywhere that std::string uses an integer position I have the choice of a character position, a unit position, or an iterator - or not providing that function.
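To make the "character position" versus "unit position" distinction concrete, here is a minimal sketch for UTF-8 held in a plain std::string. The helper names utf8_char_count and utf8_unit_offset are hypothetical (they are not part of cs_string or any library), and the code assumes the input is already valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of characters (code points) in a UTF-8 string.
// Every byte that is not a continuation byte (10xxxxxx) starts a character.
// Assumes valid UTF-8; no validation is performed.
inline std::size_t utf8_char_count(const std::string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Hypothetical helper: unit (byte) offset of the character at char_pos.
// Returns s.size() if char_pos is past the end. This is a linear scan,
// which is why integer character positions are costly for variable-width
// encodings.
inline std::size_t utf8_unit_offset(const std::string& s, std::size_t char_pos) {
    std::size_t i = 0;
    while (i < s.size() && char_pos > 0) {
        ++i;  // step over the lead byte
        while (i < s.size() &&
               (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) {
            ++i;  // step over continuation bytes
        }
        --char_pos;
    }
    return i;
}

// Usage: "naive" with U+00EF in place of the 'i'; that character takes two units.
inline void example_positions() {
    const std::string s = "na\xC3\xAFve";
    assert(s.size() == 6);                // unit count
    assert(utf8_char_count(s) == 5);      // character count
    assert(utf8_unit_offset(s, 3) == 4);  // 'v' is character 3 but unit 4
}
```

The two position kinds disagree as soon as one multi-unit character appears, and converting between them is O(n). An iterator sidesteps the conversion, which is one argument for preferring it in the interface.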
I think emulating std::string doesn't work. It has a naive design based on the assumption of fixed-width encodings. I think that a tagged string is the best place to really start over with a string design and produce a string that is lean, rather than bloated.
I agree.
I think the string type should offer minimal manipulation facilities - either completely read-only or append as the only manipulation function.
I would like to have at least a modifiable string, but modifiable only through iterators (insert and erase). That should suffice for all my algorithm needs.
A string buffer type could be written as a mutable alternative, as is the design in Java and C#. However, I'm not sure how much of that interface is needed, either.
A modifiable iterator interface (with insert and erase) is, IMO, as concise and extensible as possible.
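As a rough illustration of what an iterator-only mutation interface could look like, here is a minimal sketch. Everything in it is an assumption made for the example: tagged_string, utf8_tag, and the choice of std::vector as storage are hypothetical and are not the proposed cs_string. The iterators here walk code units; a character iterator would be an adaptor layered on top.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct utf8_tag {};  // hypothetical encoding tag, used only as a type marker

// Hypothetical tagged string: no integer positions, no find/replace/substr.
// The only mutators are iterator-based insert and erase.
template <typename Tag, typename Unit = char>
class tagged_string {
public:
    typedef typename std::vector<Unit>::iterator iterator;
    typedef typename std::vector<Unit>::const_iterator const_iterator;

    tagged_string() {}

    template <typename It>
    tagged_string(It first, It last) : units_(first, last) {}

    iterator begin() { return units_.begin(); }
    iterator end() { return units_.end(); }
    const_iterator begin() const { return units_.begin(); }
    const_iterator end() const { return units_.end(); }
    bool empty() const { return units_.empty(); }

    // Inserts [first, last) before pos; returns an iterator to the first
    // inserted unit (or to pos's new location if the range is empty).
    template <typename It>
    iterator insert(iterator pos, It first, It last) {
        const std::size_t off = static_cast<std::size_t>(pos - units_.begin());
        units_.insert(pos, first, last);
        return units_.begin() + off;
    }

    // Erases [first, last); returns an iterator to the unit after the gap.
    iterator erase(iterator first, iterator last) {
        return units_.erase(first, last);
    }

private:
    std::vector<Unit> units_;
};

// Usage: truncate, then append, using nothing but iterators.
inline void example_tagged_string() {
    const std::string src = "hello world";
    tagged_string<utf8_tag> s(src.begin(), src.end());
    s.erase(s.begin() + 5, s.end());
    const std::string bang = "!";
    s.insert(s.end(), bang.begin(), bang.end());
    assert(std::string(s.begin(), s.end()) == "hello!");
}
```

Generic algorithms can be written against begin/end plus these two members, and the Tag parameter keeps strings in different encodings from being mixed by accident at compile time.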
I'd love to have some empirical data on string usage.
I do some string manipulation on email, and it is usually better to do all of it in the code page in which the message was received, instead of converting back and forth.
- What character sets are people interested in using (a) at the "edges" of their programs,
As many as possible. Theoretically, a program might have to deal with any and all encodings out there. Realistically, there are probably a dozen or two that are relevant. You'd need empirical data.
Unfortunately, I need all the character sets supported by MIME.
and (b) in the "core"?
ASCII, UTF-8 and UTF-16.
ISO-8859-1?
Sebastian
-- Felipe Magno de Almeida