
Dear Nemanja,

On 12/5/06, Nemanja Trifunovic <nemanja_trifunovic@yahoo.com> wrote:
This is the second call for the informal review of the UTF8 library. It is based on version 1.02 of UTF8-CPP: http://utfcpp.sourceforge.net/ and you can find it at
I like the functions you provide, and the "unchecked" namespace. Unlike Hervé, I do think exceptions are the way to go. I do miss a couple of things, though.

In a recent discussion on this list there seemed to be a preference for using iterators, which can be composed, for example to perform UTF-8->UTF-16 conversion, or conversions to other codepages. Iterators can be much more flexible than these free functions.

Is there any particular reason why you do not include similar functions for UTF-16?

One of the most important uses for UTF must be IO. Shouldn't a utf_codecvt be part of the library?

Hervé is right: reading UTF-8 can be optimised a lot using lookup tables. I've got an implementation lying around that I'd be happy to share. It took 30% less time than the straightforward implementation and it did all the necessary checks.

A final point: your functions try to keep strings valid UTF-8. Why not provide a string type that maintains this invariant?

Conclusion: in my opinion a lot of things are missing from the library at the moment.

Regards,
Rogier