Re: [boost] UTF8 library - second call for informal review

6 Dec 2006

Dear Nemanja,

On 12/5/06, Nemanja Trifunovic <nemanja_trifunovic@yahoo.com> wrote:
...
This is the second call for the informal review of the UTF8 library. It is based on version 1.02 of UTF8-CPP: http://utfcpp.sourceforge.net/ and you can find it at

I like the functions you provide, and the "unchecked" namespace.
Unlike Hervé, I do think exceptions are the way to go. I seem to be
missing a couple of things, though.

In a recent discussion on this list there seemed to be a preference
for using iterators, which can be composed, for example to perform
UTF-8->UTF-16 conversion, or conversions to other codepages. Iterators
can be much more flexible than these free functions.

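
To illustrate the composition idea, here is a minimal sketch. It is not
taken from UTF8-CPP or from the earlier list discussion; the names
utf8_decode_iterator and to_utf16 are hypothetical, and the decoder is
deliberately simplified (it throws on bad lead bytes, bad continuation
bytes and truncated input, but does not reject overlong forms or
surrogates).

```cpp
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical adaptor: wraps an iterator over UTF-8 bytes and yields one
// code point per dereference. Simplified: overlong forms and surrogates
// are NOT rejected here.
template <typename ByteIt>
class utf8_decode_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = char32_t;
    using difference_type = std::ptrdiff_t;
    using pointer = const char32_t*;
    using reference = char32_t;

    utf8_decode_iterator(ByteIt pos, ByteIt end) : pos_(pos), end_(end) {}

    // Decode from a copy of the position so that dereferencing does not advance.
    char32_t operator*() const { ByteIt p = pos_; return decode(p, end_); }
    utf8_decode_iterator& operator++() { decode(pos_, end_); return *this; }
    utf8_decode_iterator operator++(int) { auto old = *this; ++*this; return old; }

    bool operator==(const utf8_decode_iterator& o) const { return pos_ == o.pos_; }
    bool operator!=(const utf8_decode_iterator& o) const { return pos_ != o.pos_; }

private:
    // Reads one sequence starting at p and leaves p just past it.
    static char32_t decode(ByteIt& p, ByteIt end) {
        if (p == end) throw std::out_of_range("decode past end of input");
        unsigned char lead = static_cast<unsigned char>(*p++);
        if (lead < 0x80) return static_cast<char32_t>(lead);
        int extra = 0;
        char32_t cp = 0;
        if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        for (int i = 0; i < extra; ++i) {
            if (p == end) throw std::invalid_argument("truncated UTF-8 sequence");
            unsigned char c = static_cast<unsigned char>(*p++);
            if ((c & 0xC0) != 0x80) throw std::invalid_argument("invalid continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        return cp;
    }

    ByteIt pos_;
    ByteIt end_;
};

// Hypothetical consumer: encodes any range of code points as UTF-16.
// It knows nothing about UTF-8, so it composes with any decoding iterator.
template <typename CodePointIt>
std::u16string to_utf16(CodePointIt first, CodePointIt last) {
    std::u16string out;
    for (; first != last; ++first) {
        char32_t cp = *first;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The UTF-16 step only sees code points, so the same decoding iterator
could in principle feed a different encoder (another codepage, say)
without either side changing.
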
Is there any particular reason why you do not include similar
functions for UTF-16?

One of the most important uses for UTF must be IO. Shouldn't a
utf_codecvt be part of the library?

Hervé is right: reading UTF-8 can be optimised a lot using tables with
data. I've got an implementation lying around that I'd be happy to
share. It took 30% less time than the straightforward implementation
and it did all the necessary checks.
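
For readers unfamiliar with the table approach, the following is a
minimal sketch of one common form of it: a 256-entry table indexed by
the lead byte. It is not the implementation mentioned above (which was
not posted), it makes no claim about the speed-up, and the names
make_length_table and sequence_length are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Build a table mapping every possible lead byte to the total length of
// the UTF-8 sequence it starts. 0 marks a byte that can never start a
// valid sequence.
inline std::array<std::uint8_t, 256> make_length_table() {
    std::array<std::uint8_t, 256> t{};                // all entries start at 0
    for (int b = 0x00; b <= 0x7F; ++b) t[b] = 1;      // ASCII
    // 0x80..0xBF are continuation bytes; 0xC0 and 0xC1 could only start
    // overlong encodings. All of them stay 0.
    for (int b = 0xC2; b <= 0xDF; ++b) t[b] = 2;
    for (int b = 0xE0; b <= 0xEF; ++b) t[b] = 3;
    for (int b = 0xF0; b <= 0xF4; ++b) t[b] = 4;
    // 0xF5..0xFF would encode values above U+10FFFF and stay 0.
    return t;
}

// One table lookup replaces the chain of mask-and-compare branches on
// the lead byte.
inline std::size_t sequence_length(unsigned char lead) {
    static const std::array<std::uint8_t, 256> table = make_length_table();
    return table[lead];
}
```

A complete decoder would still have to check the continuation bytes and
the remaining overlong and surrogate cases; the table only removes the
branching on the first byte.
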

The final thing is, your functions try to maintain strings of
valid UTF-8. Why not provide a string type that maintains this
invariant?
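
As a rough illustration of such a type, here is a minimal sketch. The
names utf8_string and is_valid_utf8 are hypothetical and not part of
the library under review; the interface is kept to the bare minimum
needed to show the invariant being enforced at every entry point.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Returns true if s is well-formed UTF-8: no overlong forms, no
// surrogates, nothing above U+10FFFF, no truncated sequences.
inline bool is_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    const std::size_t n = s.size();
    while (i < n) {
        unsigned char lead = static_cast<unsigned char>(s[i]);
        if (lead < 0x80) { ++i; continue; }
        std::size_t len = 0;
        unsigned long cp = 0, min = 0;
        if ((lead & 0xE0) == 0xC0)      { len = 2; cp = lead & 0x1F; min = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; min = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; min = 0x10000; }
        else return false;                        // stray continuation or invalid lead
        if (n - i < len) return false;            // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        i += len;
    }
    return true;
}

// Hypothetical string type: the only ways to put bytes in go through
// check(), so bytes() is valid UTF-8 at all times.
class utf8_string {
public:
    utf8_string() = default;
    explicit utf8_string(std::string bytes) : bytes_(check(std::move(bytes))) {}

    // Concatenating two valid UTF-8 strings gives valid UTF-8, so it is
    // enough to validate the new part. check() throws before bytes_ is
    // modified.
    void append(const std::string& more) { bytes_ += check(more); }

    const std::string& bytes() const { return bytes_; }

private:
    static std::string check(std::string s) {
        if (!is_valid_utf8(s)) throw std::invalid_argument("not valid UTF-8");
        return s;
    }
    std::string bytes_;
};
```

With a type like this, functions that take a utf8_string could use the
unchecked code paths safely, because validation has already happened at
construction.
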

Conclusion: in my opinion a lot of things are missing from the library
at the moment.

Regards,
Rogier

Rogier van Dalen