
I used Ron Garcia's utf8 codecvt facet, which I downloaded from the Yahoo files section. I just wrote some tests and made some tweaks as test results for various platforms came in. It includes a manual page in standard boost format.

In the same section (http://groups.yahoo.com/group/boost/files/unicode/) there is another file (utf8_transform_iterator) which wraps the same functionality in a standard iterator. Given my experience with the codecvt facet (it just worked, except for tweaks required for older libraries/compilers), I would strongly recommend that this other package be examined as well. In fact, I'm sure that it would be easy to transfer the portability tweaks from the current package to the iterator-based one.

Finally, now that I'm more familiar with codecvt facets, iterators, iterator adaptors, stream buffers, etc., I would very much like to see something like the following developed.

composable codecvt facets
=========================

a) a codecvt facet which takes an iterator as a type parameter
b) a few basic iterator adaptors
c) the ability to compose iterator adaptors to be used as a codecvt facet

This would leverage iterator adaptors and their descendants (i.e. dataflow iterators, ranges, views, fusion, or?) to permit one to compose codecvt facets for compression, encryption, and code conversion.

Just an idea.

Robert Ramey

"John Maddock" <john@johnmaddock.co.uk> wrote in message news:04a701c4d551$6b584230$45340252@fuji...
I may also need UTF-8 conversion as an implementation detail for boost::filesystem::wpath on Linux and/or POSIX. So I'd really appreciate it if you move ahead with your plan above.
I'm also starting to do UTF-interconversion inside regex, but with iterators
(http://cvs.sourceforge.net/viewcvs.py/boost/boost/boost/regex/pending/Attic/unicode_iterator.hpp),
if that's any easier for you.
John.
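
For readers unfamiliar with the iterator-based approach to UTF conversion discussed in this thread, here is a minimal sketch of what such an adaptor might look like. It is not taken from utf8_transform_iterator or from unicode_iterator.hpp; the names utf8_decode_iterator and utf8_to_utf32 are hypothetical, and the sketch assumes C++11 and well-formed input apart from the few checks shown (it does not reject overlong encodings or surrogates).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical input-iterator adaptor: reads UTF-8 bytes from Base and
// yields one UTF-32 code point per dereference.
template <class Base>
class utf8_decode_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = char32_t;
    using difference_type = std::ptrdiff_t;
    using pointer = const char32_t*;
    using reference = char32_t;

    utf8_decode_iterator(Base pos, Base end) : pos_(pos), end_(end) {}

    // Decode from a copy of the position so dereferencing does not advance.
    char32_t operator*() const {
        Base it = pos_;
        return decode(it, end_);
    }

    // Advance past one complete UTF-8 sequence.
    utf8_decode_iterator& operator++() {
        decode(pos_, end_);
        return *this;
    }
    utf8_decode_iterator operator++(int) {
        utf8_decode_iterator tmp = *this;
        ++*this;
        return tmp;
    }

    friend bool operator==(const utf8_decode_iterator& a,
                           const utf8_decode_iterator& b) {
        return a.pos_ == b.pos_;
    }
    friend bool operator!=(const utf8_decode_iterator& a,
                           const utf8_decode_iterator& b) {
        return !(a == b);
    }

private:
    static char32_t decode(Base& it, Base end) {
        assert(it != end);  // precondition: not the end iterator
        unsigned char lead = static_cast<unsigned char>(*it++);
        if (lead < 0x80) return lead;  // single-byte (ASCII) sequence

        int extra = 0;    // number of continuation bytes expected
        char32_t cp = 0;  // code point accumulated so far
        if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        for (; extra > 0; --extra) {
            if (it == end)
                throw std::runtime_error("truncated UTF-8 sequence");
            unsigned char c = static_cast<unsigned char>(*it++);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);  // append the low six bits
        }
        return cp;
    }

    Base pos_;
    Base end_;
};

// Convenience wrapper (also hypothetical): decode a whole UTF-8 string.
inline std::u32string utf8_to_utf32(const std::string& s) {
    using It = utf8_decode_iterator<std::string::const_iterator>;
    return std::u32string(It(s.begin(), s.end()), It(s.end(), s.end()));
}
```

Because the adaptor is an ordinary input iterator, it can be handed to any algorithm or container constructor that accepts an iterator pair, which is the property that would make stacking several such adaptors (compression, encryption, code conversion) plausible.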