
On Apr 4, 2005 12:51 PM, Erik Wien <wien@start.no> wrote:
Sorry about the late reply. I have been away for Easter and, to top it all off, have been sick for a while. Anyway, I'm back...
Great; we'll go on with the discussion. I'm glad you agree with me on most points. :-)
Yep.. You are of course right. I should start thinking before I talk. :)
I don't know... thinking out loud often works in abstract discussions like this.
Having strings locked to a normalization form would be the most logical way to go. What I don't really see, though, is why you would need a separate class (different from the code point string class, that is) for this functionality. If we gave the code point string classes (both the static and dynamic ones) a normalization policy, and provided a policy that doesn't actually do anything in addition to ones that normalize to each of the normalization forms, everyone could have their way. If you don't care about normalization, use the do_nothing one. If you do care (or simply have no clue what normalization is, like most users), use NFD or NFC or something.
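To make the policy idea concrete, here is a minimal sketch of how it might look. Everything in it is an assumption made for illustration: the names code_point_string, do_nothing and toy_nfc are hypothetical, and toy_nfc composes only one pair ('e' followed by U+0301) rather than implementing real NFC.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using code_point = std::uint32_t;

// Policy that leaves the code point sequence exactly as the user built it.
struct do_nothing {
    static void normalize(std::vector<code_point>&) {}
};

// Toy stand-in for an NFC policy (hypothetical): it only knows that
// 'e' (U+0065) followed by U+0301 composes to U+00E9.
struct toy_nfc {
    static void normalize(std::vector<code_point>& s) {
        std::vector<code_point> out;
        for (code_point cp : s) {
            if (cp == 0x0301 && !out.empty() && out.back() == 0x0065)
                out.back() = 0x00E9;   // compose the pair in place
            else
                out.push_back(cp);
        }
        s.swap(out);
    }
};

// One string class; the normalization form is chosen by the policy argument.
template <class NormalizationPolicy = do_nothing>
class code_point_string {
public:
    void push_back(code_point cp) {
        data_.push_back(cp);
        NormalizationPolicy::normalize(data_);  // re-establish the invariant
    }
    code_point back() const { return data_.back(); }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<code_point> data_;
};
```

With do_nothing, pushing U+0065 and then U+0301 leaves two code points in the string; with toy_nfc the same two calls leave the single code point U+00E9.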
I'm not sure about this. The simplicity point is a good one. Assuming you do want built-in grapheme cluster support, however, I see two problems with this approach:

1. You'd still need two kinds of iterators: iterators over code points, and iterators over grapheme clusters. This makes things conceptually muddy for users, I think. The string class will need code point versions and grapheme cluster versions of many methods (e.g., insert, erase, find*). You may end up actually implementing two strings in one string class.

2. Elements are not straightforwardly inserted into the sequence. E.g., appending 0x317 (a combining character) to a string s will not necessarily make s.back() return 0x317.

In short, a code point string that automatically normalises is not a Sequence, though it may superficially look like one.

I have a feeling this would be more difficult to understand for users than two separate string classes would. But maybe that's because I already understand my own viewpoint?

Regards,
Rogier
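P.S. A small sketch of the second problem, under stated assumptions: normalizing_append and toy_combining_class are hypothetical names, and the lookup covers only the two marks used here (U+0317 has canonical combining class 220, U+0301 has 230). It mimics only the canonical reordering step that NFD performs, not full normalization.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using code_point = std::uint32_t;

// Heavily reduced combining-class table (illustration only).
inline int toy_combining_class(code_point cp) {
    if (cp == 0x0317) return 220;  // COMBINING ACUTE ACCENT BELOW
    if (cp == 0x0301) return 230;  // COMBINING ACUTE ACCENT
    return 0;                      // everything else is treated as a starter
}

// Append cp, then restore canonical order by moving the new mark left
// past any preceding marks that have a higher combining class.
inline void normalizing_append(std::vector<code_point>& s, code_point cp) {
    s.push_back(cp);
    std::size_t i = s.size() - 1;
    while (i > 0 && toy_combining_class(s[i]) != 0 &&
           toy_combining_class(s[i - 1]) > toy_combining_class(s[i])) {
        std::swap(s[i - 1], s[i]);
        --i;
    }
}
```

Starting from the sequence U+0065 U+0301 and appending 0x317 yields U+0065 U+0317 U+0301, so the last element is 0x301 rather than the code point that was just appended.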