
Tilman Kuepper
I'm jumping in, because I am interested in Unicode conversion facets...
Is there a reason why both program_options and serialization contain very similar utf8_codecvt_facet.cpp files?
I had a look at the serialization library's converter in utf8_codecvt_facet.cpp and noticed that utf8_codecvt_facet_wchar_t::do_in() doesn't check for non-shortest UTF-8 sequences. There might also be some issues on platforms with 16-bit wchar_t (possible overflow when a decoded code point does not fit in one wchar_t).
I suggest using (parts of) the UTF library in the Boost files area to solve those problems. This could also be another step towards an officially supported Unicode library... ;-)
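To make the two concerns above concrete, here is a minimal sketch of what a non-shortest-form check and a code-unit range check could look like. It is not the code from utf8_codecvt_facet.cpp or from the UTF library in the files area; the names decode_status, decode_utf8 and fits_in_bits are made up for this illustration, and the sketch assumes the input is available as a contiguous range of unsigned char.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result codes for this sketch (not part of any Boost API).
enum class decode_status { ok, truncated, invalid };

// Decode one UTF-8 sequence from [p, end).
// On decode_status::ok, `cp` holds the code point and `len` the bytes consumed.
inline decode_status decode_utf8(const unsigned char* p, const unsigned char* end,
                                 std::uint32_t& cp, std::size_t& len)
{
    if (p == end) return decode_status::truncated;

    const unsigned char lead = *p;
    std::uint32_t min_cp = 0;  // smallest code point allowed for this length

    if (lead < 0x80) { cp = lead; len = 1; return decode_status::ok; }
    else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; min_cp = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; min_cp = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; min_cp = 0x10000; }
    else return decode_status::invalid;  // stray continuation byte or 0xF8..0xFF

    if (static_cast<std::size_t>(end - p) < len) return decode_status::truncated;

    for (std::size_t i = 1; i < len; ++i) {
        if ((p[i] & 0xC0) != 0x80) return decode_status::invalid;
        cp = (cp << 6) | (p[i] & 0x3Fu);
    }

    // Non-shortest form: the value would have fit in a shorter sequence.
    if (cp < min_cp) return decode_status::invalid;
    // Outside the Unicode range, or a UTF-16 surrogate value.
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return decode_status::invalid;

    return decode_status::ok;
}

// True if `cp` fits in a single code unit that is `bits` bits wide.
// With bits == 16 this models a platform with a 16-bit wchar_t.
inline bool fits_in_bits(std::uint32_t cp, unsigned bits)
{
    return bits >= 32 || cp < (std::uint32_t(1) << bits);
}
```

With this sketch, the two-byte sequence 0xC0 0x80 (an overlong encoding of U+0000) is rejected as invalid, and a converter targeting a 16-bit wchar_t could call fits_in_bits(cp, 16) before storing the result, then either report an error or emit a surrogate pair instead of silently truncating.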
We really, really need UTF-8 facets to be part of Boost. I've been pleased with Ron Garcia's, which have been included in both the serialization and program options libraries. The only problems we've had came from the variation in which namespaces different libraries have put things. There were a couple of similar issues regarding function signatures as well. Ron's library also had a nice documentation page. All I had to do was write a test. Even that was a hassle, and it's still not totally satisfactory: it gives bogus warnings.

I've thought about Dave's suggestion about each of us incorporating source code from a common spot. I was resolved to go along, as I'm pleased to offload a sticky but peripheral (to me) piece of code. Now I'm rethinking this and am not convinced that it's a good idea. I would like to:

a) make an "official" spot for it (header/lib/ etc.)
b) move Ron's library into that spot. It's quite serviceable.
c) entertain "applications" for maintainers. Presumably it would be improved and/or replaced with time.
d) subject the next version to a review.

This library has been looked at by various people, and although several have offered ideas on how it can be improved, no one has found the implementation lacking in what it does. And we need it now.

Ron Garcia - Are you out there? Anything to add?

Robert Ramey