
On Sat, 12 Feb 2011 11:00:31 -0800 Jeremy Maitin-Shepard <jeremy@jeremyms.com> wrote:
The size in code-points *is* the size of the string, according to the view of the string that the class exposes.
Ok, but what would I actually want to use that for?
What do you use string.length() for? :-) Answering that efficiently is one of several reasons the UTF string classes keep track of the count.
std::string::length gives the amount of memory required to represent the string as encoded, and is useful if you intend to pass it to something else as a (char array, length) pair. Given that the number of code points is directly related to neither the memory required nor the number of logical characters/glyphs/size it will take up to display, it seems unlikely to be useful in many cases.
But for those few cases where it *would* be useful, I see no reason not to provide it. It costs essentially nothing, since the count is originally produced by the same function that validates the encoded data when it's put into a UTF type, and it is used for other things as well. And people are used to being able to retrieve the size of a string; eliminating that function would discomfort some developers.
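A rough sketch of why the count comes essentially for free, assuming UTF-8 input. This is not the actual code from the UTF string classes; the name count_code_points and its signature are made up for illustration. The point is only that a validator already has to visit every code point, so incrementing a counter along the way adds no extra pass:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of any proposed library): validates a
// UTF-8 byte sequence and counts its code points in the same pass.
// Returns false on malformed input; 'count' is only meaningful on success.
bool count_code_points(const std::string& s, std::size_t& count) {
    // Smallest code point allowed for each sequence length, used to
    // reject overlong encodings (index = sequence length in bytes).
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    count = 0;
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp;
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else return false;              // stray continuation or invalid lead byte

        if (i + len > n) return false;  // sequence truncated at end of input

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong forms, UTF-16 surrogates, and values past U+10FFFF.
        if (cp < min_cp[len] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false;
        }

        ++count;   // one more code point, at no extra cost
        i += len;
    }
    return true;
}
```

For the UTF-8 bytes "na\xC3\xAFve", std::string::length() reports 6 (the encoded size), while the sketch above reports 5 code points. Both numbers fall out of the same data, so exposing either one costs nothing extra.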
In cases where there is a limit on the maximum length of a string, I believe that is almost certainly going to be in terms of the encoded length in a particular encoding (e.g. UTF-8 or UTF-16), rather than in code points.
Well, that's easily available too, via T.coded().length().
--
Chad Nelson
Oak Circle Software, Inc.