[boost] Re: Boost Unicode support ideas

13 Apr 2004


In article <87isg3skr6.fsf@jbms.ath.cx>, Jeremy Maitin-Shepard <jbms@attbi.com> wrote:
...
> Right, it will certainly be necessary to provide a grapheme_cluster_iterator
> (with value_type = the Unicode string type).  ICU should help with this.
You are conflating abstract characters (which exist in the absence of a
graphical representation) and graphemes (whose existence depends on the
graphical representation), but I believe we are talking about the same thing.
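
To make the idea of iterating by clusters rather than by code points concrete, here is a minimal C++ sketch. It is only an illustration: the names extends_cluster and split_clusters are hypothetical, and the boundary rule is a deliberate simplification that treats only U+0300..U+036F (Combining Diacritical Marks) as extending the previous cluster. Real segmentation, such as what ICU provides, follows the much richer rules of Unicode Standard Annex #29.

```cpp
#include <string>
#include <vector>

// Simplifying assumption: only code points in U+0300..U+036F
// (Combining Diacritical Marks) attach to the preceding cluster.
bool extends_cluster(char32_t cp) {
    return cp >= 0x0300 && cp <= 0x036F;
}

// Split a sequence of code points into clusters. Each cluster is itself
// a string, mirroring the "value_type = the Unicode string type" idea.
std::vector<std::u32string> split_clusters(const std::u32string& text) {
    std::vector<std::u32string> clusters;
    for (char32_t cp : text) {
        if (clusters.empty() || !extends_cluster(cp)) {
            clusters.push_back(std::u32string(1, cp));  // start a new cluster
        } else {
            clusters.back().push_back(cp);              // extend the current one
        }
    }
    return clusters;
}
```

Given the three code points "e", U+0301 and "a", this sketch yields two clusters: the first holds two code points and the second holds one.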
...
> Nonetheless, it is useful to represent a single code point, for several
> reasons:
I agree; as I mentioned elsewhere, I believe that the Unicode string abstraction 
needs to support at least iteration by abstract characters, encoded characters, 
and encoding units.
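
As a rough illustration of the difference between the encoding-unit level and the encoded-character (code point) level, the following C++ sketch decodes UTF-8 code units into code points. The function name decode_utf8 is hypothetical, and the sketch assumes well-formed input; a real implementation would have to validate the input and report errors.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Decode UTF-8 code units (bytes) into code points.
// Assumption: the input is well-formed UTF-8; no validation is done.
std::vector<char32_t> decode_utf8(const std::string& units) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < units.size()) {
        unsigned char lead = static_cast<unsigned char>(units[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte determines how many code units form one code point.
        if (lead < 0x80)      { cp = lead;        len = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1F; len = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0F; len = 3; }
        else                  { cp = lead & 0x07; len = 4; }
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len && i + k < units.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(units[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

For the byte string "e\xCC\x81", iterating by encoding units visits three bytes, while iterating by code points visits two values, U+0065 and U+0301. Iterating by abstract characters would sit one level above this and group those two code points together.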
...
> - For the purpose of string construction, the Unicode specification
>    explicitly states that any sequence of code points is well formed, and so
>    this provides the smallest unit by which guaranteed-well-formed strings
>    can be formed.
Can you refer me to a specific point in the spec where this is stated?
...
> - It would be useful to provide functions for querying the Unicode
>    properties of individual code points, and this code_point type would be
>    the only suitable parameter type.
Absolutely.
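
A minimal sketch of what such an interface might look like, in C++. The struct and function names here are hypothetical and are not taken from any existing Boost or ICU interface. I have limited the sketch to properties that follow directly from the numeric value of the code point; properties such as general category would need lookup tables built from the Unicode Character Database.

```cpp
#include <cstdint>

// Hypothetical code_point type: a thin wrapper around the scalar value.
struct code_point {
    std::uint32_t value;
};

// The Unicode code space runs from U+0000 to U+10FFFF.
bool is_valid(code_point cp) {
    return cp.value <= 0x10FFFF;
}

// Surrogate code points occupy U+D800..U+DFFF.
bool is_surrogate(code_point cp) {
    return cp.value >= 0xD800 && cp.value <= 0xDFFF;
}

// Noncharacters: U+FDD0..U+FDEF, plus the last two code points of each plane.
bool is_noncharacter(code_point cp) {
    if (!is_valid(cp)) return false;
    return (cp.value >= 0xFDD0 && cp.value <= 0xFDEF) ||
           (cp.value & 0xFFFE) == 0xFFFE;
}

// Plane number: 0 is the Basic Multilingual Plane.
unsigned plane(code_point cp) {
    return cp.value >> 16;
}
```

Taking code_point as the parameter type keeps these queries from being called by accident with an encoding unit, such as a single UTF-16 surrogate half stored in a 16-bit integer.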
...
> I do agree, however, that for almost any output formatting, the
> locale-specific or user-specified fill text/symbols should be specified as
> strings, rather than as individual characters.
Yes.
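
For illustration, a padding helper that takes its fill as a string might look like the C++ sketch below. The name pad_left and its signature are hypothetical. The caller passes the displayed width of the text explicitly (text_columns), because measuring it properly would require grapheme segmentation, and each repetition of the fill string is assumed to occupy one column.

```cpp
#include <cstddef>
#include <string>

// Pad `text` on the left up to `width` columns using `fill`.
// Assumptions: `text_columns` is the displayed width of `text`, supplied
// by the caller, and one copy of `fill` occupies exactly one column.
std::string pad_left(const std::string& text, std::size_t text_columns,
                     std::size_t width, const std::string& fill) {
    std::string out;
    for (std::size_t i = text_columns; i < width; ++i) {
        out += fill;  // fill may span several code units or code points
    }
    out += text;
    return out;
}
```

With the UTF-8 encoding of U+00B7 (the two bytes C2 B7) as the fill, padding "42" to five columns prepends three copies of the fill, which a fill parameter of type char could not express.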

meeroh

-- 
If this message helped you, consider buying an item
from my wish list: <http://web.meeroh.org/wishlist>

Miro Jurisic