
I plan to submit my Summer of Code proposal about Unicode this week.

I plan to provide:
- iterator adaptors to iterate over sequences of code units, code points and graphemes, and eventually more, from a sequence in UTF-8, UTF-16, UCS-2 or UTF-32/UCS-4
- miscellaneous utilities, such as categorization of code points
- normalization functions
- comparisons, but not collation
- substring search algorithms
- and finally, a Unicode string type

I am well aware that defining yet another string type is quite controversial, but I believe it would be useful. A dedicated type would be able to maintain certain invariants, such as keeping its contents in a particular normalization form. I also believe it is possible to come up with a string design that integrates easily with any existing string type, such as the ones from the standard library or Qt.
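
To give a rough idea of what the iterator adaptors from the list above could look like, here is a minimal sketch of one that walks a UTF-8 code unit sequence and yields code points. All names (decode_utf8, utf8_code_point_iterator, code_points) are hypothetical and only for illustration; they are not an existing or proposed interface. The error policy is also an assumption: malformed input produces U+FFFD and consumes a single code unit, which is only one of several reasonable choices.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decodes one code point from the UTF-8 code units in
// [it, end) and advances `it` past the units it consumed.
// Assumed policy: malformed input yields U+FFFD and consumes one code unit.
template <typename It>
char32_t decode_utf8(It& it, It end)
{
    assert(it != end);  // precondition: at least one code unit is available
    const unsigned char lead = static_cast<unsigned char>(*it);
    ++it;

    std::size_t trailing = 0;  // number of continuation units expected
    char32_t cp = 0;           // code point being assembled
    char32_t lowest = 0;       // smallest value valid for this length
    if (lead < 0x80)                { return lead; }
    else if ((lead & 0xE0) == 0xC0) { trailing = 1; cp = lead & 0x1F; lowest = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { trailing = 2; cp = lead & 0x0F; lowest = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { trailing = 3; cp = lead & 0x07; lowest = 0x10000; }
    else                            { return 0xFFFD; }  // stray continuation or invalid lead

    It probe = it;  // only commit the advance once the whole sequence is valid
    for (std::size_t i = 0; i < trailing; ++i, ++probe) {
        if (probe == end) return 0xFFFD;  // truncated sequence
        const unsigned char unit = static_cast<unsigned char>(*probe);
        if ((unit & 0xC0) != 0x80) return 0xFFFD;  // not a continuation unit
        cp = (cp << 6) | (unit & 0x3F);
    }
    // Reject overlong encodings, values above U+10FFFF, and surrogates.
    if (cp < lowest || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0xFFFD;
    it = probe;
    return cp;
}

// Hypothetical adaptor: wraps a forward iterator over UTF-8 code units and
// presents the sequence as code points.
template <typename It>
class utf8_code_point_iterator {
public:
    utf8_code_point_iterator(It pos, It end) : pos_(pos), next_(pos), end_(end) { read(); }

    char32_t operator*() const { return value_; }
    utf8_code_point_iterator& operator++() { pos_ = next_; read(); return *this; }
    bool operator==(const utf8_code_point_iterator& o) const { return pos_ == o.pos_; }
    bool operator!=(const utf8_code_point_iterator& o) const { return !(*this == o); }
    It base() const { return pos_; }  // position in the underlying code units

private:
    // Decodes the code point starting at pos_, unless we are at the end.
    void read() { if (pos_ != end_) { next_ = pos_; value_ = decode_utf8(next_, end_); } }

    It pos_, next_, end_;
    char32_t value_ = 0;
};

// Convenience wrapper for illustration: collects all code points of a
// UTF-8 encoded std::string.
std::vector<char32_t> code_points(const std::string& s)
{
    using iter = utf8_code_point_iterator<std::string::const_iterator>;
    std::vector<char32_t> out;
    for (iter it(s.begin(), s.end()), last(s.end(), s.end()); it != last; ++it)
        out.push_back(*it);
    return out;
}
```

Because the adaptor only needs a pair of forward iterators, the same code works on std::string, a QByteArray, or a plain char array, which is the kind of integration with existing string types mentioned above. Adaptors for UTF-16 or for graphemes would follow the same pattern with a different decoding step.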