Re: [boost] UTF-8 conversion etc.

25 Feb 2008


Phil Endecott wrote:
...
> Things I'd appreciate feedback on:
> - What should the cs_string look like?  Basically everywhere that
> std::string uses an integer position I have the choice of a character
> position, a unit position, or an iterator - or not providing that function.
I think emulating std::string doesn't work. It has a naive design based
on the assumption of fixed-width encodings. I think that a tagged string
is the best place to really start over with a string design and produce
a string that is lean, rather than bloated.

I think the string type should offer minimal manipulation facilities -
either completely read-only or append as the only manipulation function.
A string buffer type could be written as a mutable alternative, as is
the design in Java and C#. However, I'm not sure how much of that
interface is needed, either.

I'd love to have some empirical data on string usage.
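
To make the idea concrete, here is a rough sketch of what such a lean, tagged, append-only string might look like. All names (utf8_tag, tagged_string, char_count) are hypothetical and are not taken from Phil's code or from any Boost library; the character count assumes the stored UTF-8 is well-formed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical encoding tag; a real design would have one per encoding.
struct utf8_tag {};

// Minimal tagged string: read access plus append, nothing else.
template <typename Encoding>
class tagged_string {
public:
    tagged_string() = default;
    explicit tagged_string(std::string units) : units_(std::move(units)) {}

    // The only mutating operation: append a string in the same encoding.
    tagged_string& append(const tagged_string& other) {
        units_ += other.units_;
        return *this;
    }

    // Number of code units (bytes for UTF-8), not characters.
    std::size_t unit_count() const { return units_.size(); }

    // Read-only access to the underlying code units.
    const std::string& units() const { return units_; }

private:
    std::string units_;
};

// Counts code points in well-formed UTF-8 by skipping
// continuation bytes, which have the bit pattern 10xxxxxx.
inline std::size_t char_count(const tagged_string<utf8_tag>& s) {
    std::size_t n = 0;
    for (unsigned char u : s.units()) {
        if ((u & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

With a string such as "na\xC3\xAFve", unit_count() and char_count() disagree (6 units versus 5 characters), which is exactly why a bare integer position is ambiguous. Because the encoding is a template parameter, appending a string with a different tag fails to compile instead of silently mixing encodings.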
...
> - What character sets are people interested in using (a) at the "edges"
> of their programs,

As many as possible. Theoretically, a program might have to deal with
any and all encodings out there. Realistically, there are probably a dozen
or two that are relevant. You'd need empirical data.

> and (b) in the "core"?

ASCII, UTF-8 and UTF-16.

Sebastian

Sebastian Redl