
"Rogier van Dalen" <rogiervd@gmail.com> wrote in message news:e094f9eb0410200629617a4e01@mail.gmail.com...
On Wed, 20 Oct 2004 15:51:21 +0300, Peter Dimov <pdimov@mmltd.net> wrote:
Vladimir Prus wrote:
Peter Dimov wrote:
But if you need a particular normalized form for other purposes (to store it into a database, perhaps), you have no way to obtain it from operator==.
Yes. But it's possible to have a standalone "normalization" function, and still use a default normalized representation for the string class.
Thereby assuming that all users need to pay for normalization (twice) on every comparison?
Or maybe you are arguing that the string should always be kept in a particular normalized form?
That seems to be the only way of keeping comparison, search, et cetera, implementable in terms of char_traits<> functions, and so the only way of getting performance similar to std::basic_string<>'s.
Note that normalisation of any kind requires access to the Unicode Character Database, which may take some time, especially if the relevant parts happen not to be in the processor cache.
Comparing any Unicode data in different or unknown normalisation forms will therefore by definition be slow.
True. So what we basically need to determine is which is more critical: fast comparison of strings (strings always kept in a given normalization form), or fast general string handling (normalization form determined only when needed)?
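
To make the two options concrete, here is a rough sketch of both. Everything in it is hypothetical and not taken from any proposed library: toy_normalize, normalized_string and raw_string are made-up names, and toy_normalize only composes 'e' followed by U+0301 into U+00E9, where a real normalizer would consult the Unicode Character Database. It assumes C++11 for char32_t and std::u32string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical code points used by the toy normalizer below.
constexpr char32_t combining_acute = 0x0301; // COMBINING ACUTE ACCENT
constexpr char32_t e_acute = 0x00E9;         // LATIN SMALL LETTER E WITH ACUTE

// Toy stand-in for a real normalizer: it only composes 'e' + U+0301
// into U+00E9. A real one needs the Unicode Character Database.
inline std::u32string toy_normalize(const std::u32string& in) {
    std::u32string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == U'e' && i + 1 < in.size() && in[i + 1] == combining_acute) {
            out.push_back(e_acute);
            ++i; // skip the combining mark that was just consumed
        } else {
            out.push_back(in[i]);
        }
    }
    return out;
}

// Option 1: always keep the string in one normalization form.
// Normalization is paid once, on construction; comparison is a plain
// code-unit comparison done by std::basic_string itself.
class normalized_string {
public:
    explicit normalized_string(const std::u32string& s)
        : data_(toy_normalize(s)) {}

    const std::u32string& data() const { return data_; }

    friend bool operator==(const normalized_string& a,
                           const normalized_string& b) {
        return a.data_ == b.data_;
    }

private:
    std::u32string data_;
};

// Option 2: store the code points as given and normalize when needed.
// Construction is cheap, but every comparison normalizes both operands.
class raw_string {
public:
    explicit raw_string(std::u32string s) : data_(std::move(s)) {}

    const std::u32string& data() const { return data_; }

    friend bool operator==(const raw_string& a, const raw_string& b) {
        return toy_normalize(a.data_) == toy_normalize(b.data_);
    }

private:
    std::u32string data_;
};
```

With a decomposed and a precomposed spelling of the same word, both classes report equality. The difference is where the cost lands: normalized_string has already dropped the original code point sequence by the time it is compared, while raw_string keeps it and pays for two normalizations per operator== call.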