Re: [boost] Strings tagged with their character set

27 Sep 2007

      "Phil Endecott" <spam_from_boost_dev@chezphil.org> writes:

[snip]
...
I'm wondering about offering distinct "unit" (e.g. byte) and 
"character" types in the charset_traits class, and providing separate 
unit_iterator and character_iterator types and operations.  Or maybe 
the character_iterators are best provided by some sort of "adapter"
layer?

I think providing the code point iterators in an adapter layer is better.
The reason is that iterating over code points is just one of several
higher-level-than-byte iterations that might be useful.  In particular,
it seems that for many string manipulation tasks, even iterating over
code points is not sufficient to handle international text; rather, it
may be necessary to iterate over grapheme clusters.
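
To make the adapter idea concrete, here is a minimal sketch of a code
point iterator layered over an arbitrary unit (byte) iterator.  It is
only an illustration, not the proposed library interface: the names
utf8_code_point_iterator, sequence_length and code_points are
hypothetical, the encoding is assumed to be UTF-8, and the input is
assumed to be well-formed (nothing is validated beyond one assert).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical adapter: wraps a unit (byte) iterator and yields code points.
// Assumption: the underlying sequence is well-formed UTF-8.
template <typename UnitIterator>
class utf8_code_point_iterator {
public:
    explicit utf8_code_point_iterator(UnitIterator pos) : pos_(pos) {}

    // Decode the code point that starts at the current unit position.
    char32_t operator*() const {
        UnitIterator p = pos_;
        const unsigned char lead = static_cast<unsigned char>(*p);
        const std::size_t len = sequence_length(lead);
        // Keep only the payload bits of the lead unit.
        char32_t cp = (len == 1) ? lead : (lead & (0x7Fu >> len));
        // Each continuation unit contributes its low six bits.
        for (std::size_t i = 1; i < len; ++i) {
            ++p;
            cp = (cp << 6) | (static_cast<unsigned char>(*p) & 0x3Fu);
        }
        return cp;
    }

    // Advance by one whole code point, i.e. by one to four units.
    utf8_code_point_iterator& operator++() {
        const std::size_t len =
            sequence_length(static_cast<unsigned char>(*pos_));
        for (std::size_t i = 0; i < len; ++i) ++pos_;
        return *this;
    }

    // Access to the underlying unit iterator.
    UnitIterator base() const { return pos_; }

    friend bool operator==(const utf8_code_point_iterator& a,
                           const utf8_code_point_iterator& b) {
        return a.pos_ == b.pos_;
    }
    friend bool operator!=(const utf8_code_point_iterator& a,
                           const utf8_code_point_iterator& b) {
        return !(a == b);
    }

private:
    // Number of units in the sequence introduced by this lead unit.
    static std::size_t sequence_length(unsigned char lead) {
        // A continuation unit (10xxxxxx) cannot start a sequence.
        assert((lead & 0xC0u) != 0x80u);
        if (lead < 0x80u) return 1;
        if ((lead >> 5) == 0x06u) return 2;
        if ((lead >> 4) == 0x0Eu) return 3;
        return 4;  // assumed well-formed, so this is 11110xxx
    }

    UnitIterator pos_;
};

// Convenience helper: collect the code points of a UTF-8 std::string.
inline std::vector<char32_t> code_points(const std::string& s) {
    using iter = utf8_code_point_iterator<std::string::const_iterator>;
    std::vector<char32_t> out;
    for (iter it(s.begin()), end(s.end()); it != end; ++it)
        out.push_back(*it);
    return out;
}
```

The string itself still exposes only units; the adapter is applied on
demand.  A grapheme cluster iterator could presumably be layered on top
of the code point adapter in the same way, which is the kind of
flexibility an adapter layer is meant to allow.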

[snip]

-- 
Jeremy Maitin-Shepard