
On Sat, 29 Jan 2011 23:20:07 -0800 Patrick Horgan <phorgan1@gmail.com> wrote:
[...]
I still like the idea of a utf-8_string that enforces correct encoding, i.e. it won't let you make a change to the string that would make the above external function is_valid_utf8 return false. [...]
Don't worry, I'm still working on that. :-) I've already modified the classes so that they're all guaranteed to preserve valid encoding at all times, and added as many std::string functions to each of them as is feasible (utf32_t handles all of them, the others only a subset). I've nearly finished integrating the policy-based stuff, so you can tell it what action to take on errors in the input data.

The only major thing left is designing a way to properly convert data that *isn't* in UTF format, and from studying Artyom's Boost.Locale, I have a few ideas on how to do that. It should be ready for public critique by the end of next week, at the latest. It won't have a true character iterator at that point, but as I envision it, that can be added on separately.

I'm not sure about the proxy class design that I've had to use for utf32_t's operator[] and at() functions, in order to check any modifications for validity as they're made, but I'm sure someone here will set me straight if it's not right.

--
Chad Nelson
Oak Circle Software, Inc.
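
For readers unfamiliar with the proxy idea mentioned for operator[] and at(), here is a minimal sketch of how such a design could look. It is not the actual utf32_t code, which is not shown in this message. The names checked_utf32, char_ref and is_valid_code_point are hypothetical, and throwing std::invalid_argument is just one assumed error policy. The sketch treats a code point as valid when it is at most 0x10FFFF and not a surrogate.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if cp is a Unicode scalar value
// (at most 0x10FFFF and outside the surrogate range 0xD800..0xDFFF).
inline bool is_valid_code_point(char32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Hypothetical stand-in for a validating UTF-32 string class.
class checked_utf32 {
public:
    // Proxy returned by the non-const operator[] and at(). Reads convert
    // to char32_t; writes are validated before they reach the storage.
    class char_ref {
    public:
        char_ref(std::u32string& s, std::size_t i) : s_(s), i_(i) {}
        char_ref(const char_ref&) = default;

        operator char32_t() const { return s_[i_]; }

        char_ref& operator=(char32_t cp) {
            if (!is_valid_code_point(cp))
                throw std::invalid_argument("invalid code point");
            s_[i_] = cp;
            return *this;
        }

        // Assigning from another proxy copies the value, not the reference.
        char_ref& operator=(const char_ref& other) {
            return *this = static_cast<char32_t>(other);
        }

    private:
        std::u32string& s_;
        std::size_t i_;
    };

    // Validate the whole input once, so the invariant holds from the start.
    explicit checked_utf32(const std::u32string& s) {
        for (char32_t cp : s)
            if (!is_valid_code_point(cp))
                throw std::invalid_argument("invalid code point");
        data_ = s;
    }

    char_ref operator[](std::size_t i) { return char_ref(data_, i); }
    char32_t operator[](std::size_t i) const { return data_[i]; }

    // Bounds-checked access, mirroring std::string::at().
    char_ref at(std::size_t i) {
        if (i >= data_.size()) throw std::out_of_range("index out of range");
        return char_ref(data_, i);
    }

    std::size_t size() const { return data_.size(); }

private:
    std::u32string data_;
};
```

With this sketch, s[0] = char32_t(0xD800) throws and leaves the string unchanged, while s[0] = U'x' goes through. The cost of the approach is that s[i] no longer yields a real char32_t&, so code that takes the address of an element or binds a reference to it will not compile.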