
On 30/01/2011 08:46, Artyom wrote:
If my strings are valid and normalized, I can compare them with a simple binary-level comparison; likewise for substring search, where I may also need to add a boundary check if I want fine-grained search.
No, you can't.
For example, when you search for the word שלום you want to find שָלוֹם as well (with diacritics), and normalization does not make the two equal.
Unless I misunderstand, they're as equal as e is equal to é or a is equal to à.
Search and collation require a much more complicated, multi-level comparison.
Right, I'm talking about exact comparison, not collation. Exact comparison is what you use in most text processing and parsing. You can perform collation folding with the right level if you want those two strings to compare equal.
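To make "exact comparison" concrete, here is a minimal sketch. It is in Python rather than C++ only because the C++ standard library has no Unicode normalization; the helper name `exact_equal` is made up for illustration. It normalizes both sides to NFC and then compares code points, nothing more.

```python
import unicodedata

def exact_equal(a: str, b: str) -> bool:
    """Hypothetical helper: normalize both sides to NFC, then compare code points."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

precomposed = "\u00e0"   # à as a single code point
decomposed = "a\u0300"   # a followed by a combining grave accent

shalom = "\u05e9\u05dc\u05d5\u05dd"                      # שלום, no points
shalom_pointed = "\u05e9\u05b8\u05dc\u05d5\u05b9\u05dd"  # שָלוֹם, with qamats and holam

print(exact_equal(precomposed, decomposed))  # True: canonically equivalent
print(exact_equal("e", "\u00e9"))            # False: e and é are different strings
print(exact_equal(shalom, shalom_pointed))   # False: the points are real content
```

Normalization only removes the difference between equivalent encodings of the same text. Whether the accent or the points matter is a separate question that exact comparison does not answer.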
The problem is that I may want 00e0 (à), 0061 0300 (a + `), and 0061 (a) to be equal for string search as well.
You may, but that should not be the default behaviour of operator== and operator<.
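A rough sketch of what such an opt-in could look like, again in Python because it ships `unicodedata`. The names `fold_marks`, `loose_equal` and `loose_contains` are invented for this example, and dropping nonspacing marks is only a crude stand-in for a real primary-level collation comparison, not an implementation of the Unicode Collation Algorithm.

```python
import unicodedata

def fold_marks(s: str) -> str:
    """Hypothetical folding: decompose to NFD, then drop nonspacing marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

def loose_equal(a: str, b: str) -> bool:
    """Explicit opt-in: compare after folding, ignoring accents and points."""
    return fold_marks(a) == fold_marks(b)

def loose_contains(haystack: str, needle: str) -> bool:
    """Explicit opt-in: substring search on the folded forms."""
    return fold_marks(needle) in fold_marks(haystack)

a_grave = "\u00e0"          # 00e0
a_plus_grave = "a\u0300"    # 0061 0300
plain_a = "a"               # 0061

print(a_grave == plain_a)                    # False: default comparison stays exact
print(loose_equal(a_grave, a_plus_grave))    # True
print(loose_equal(a_grave, plain_a))         # True

text = "\u05e9\u05b8\u05dc\u05d5\u05b9\u05dd"            # שָלוֹם
print(loose_contains(text, "\u05e9\u05dc\u05d5\u05dd"))  # True: finds שלום
```

The caller has to ask for the folded comparison by name; plain `==` keeps its exact meaning, which is the split argued for above.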