Re: [boost] Re: Multiple files utf8_codecvt_facet.cpp

20 Jul 2004

Hi Tilman,
...
I'm jumping in, because I am interested in Unicode conversion facets...
...
is there a reason why both program_options and serialization contain
very similar files utf8_codecvt_facet.cpp?
I had a look at the serialization library's converter in
utf8_codecvt_facet.cpp
and noticed that utf8_codecvt_facet_wchar_t::do_in() doesn't check for
non-shortest UTF-8 sequences.
Hmmm... I think it's just an omission, and it would be easy to add.
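For illustration, the missing check could look roughly like the sketch below. This is not the code in either copy of utf8_codecvt_facet.cpp. The helper names min_code_point_for_length and is_non_shortest_form are hypothetical, and the sketch assumes the decoder already knows the decoded code point and how many octets it consumed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Boost): the smallest code point that
// legitimately needs an encoding of `length` octets (1 to 4).
inline std::uint32_t min_code_point_for_length(std::size_t length)
{
    switch (length) {
        case 1:  return 0x0;
        case 2:  return 0x80;
        case 3:  return 0x800;
        case 4:  return 0x10000;
        // Unsupported length: every real code point compares below this,
        // so such sequences are always reported as invalid.
        default: return 0xFFFFFFFFu;
    }
}

// Returns true when `code_point` was decoded from more octets than the
// shortest form requires, e.g. U+002F decoded from the two octets C0 AF.
inline bool is_non_shortest_form(std::uint32_t code_point, std::size_t length)
{
    return code_point < min_code_point_for_length(length);
}
```

A do_in() implementation would presumably call something like this after assembling each code point and return std::codecvt_base::error when it yields true.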
...
There might also be some issues on platforms with 16-bit wchar_t
(possible overflow).
I suggest using (parts of) the UTF library in the Boost files area to solve
those problems. This could also be another step towards an officially
supported Unicode library... ;-)
http://groups.yahoo.com/group/boost/files/utf/
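To make the 16-bit wchar_t concern concrete: a decoded code point above U+FFFF does not fit into a single 16-bit wide character, so storing it directly would silently truncate it. A minimal sketch of a range check follows. The name fits_in_wide_char is hypothetical and is not taken from either library, and the example uses char16_t and char32_t (C++11) as stand-ins for 16-bit and 32-bit wchar_t.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Boost): true when `code_point` can be
// stored in a single WideChar without losing bits.
template <typename WideChar>
bool fits_in_wide_char(std::uint32_t code_point)
{
    return code_point <=
        static_cast<std::uint32_t>(std::numeric_limits<WideChar>::max());
}
```

With a check like fits_in_wide_char<wchar_t>(cp), a converter could report an error (or emit a surrogate pair, if it chooses to support UTF-16) instead of overflowing.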
While I think that library is OK (and the last time its author, Alberto
Barbati, posted on this, he clearly knew much more about Unicode than I do),
I don't think it's a good idea to take that library and add it to details
right now. Simply put, it would take another week until the regression tests
turn green again. I also don't think there's any particular difference
between the different UTF-8 implementations...

- Volodya

Vladimir Prus