
Mathias Gaunard wrote:
On 17/07/2010 17:26, Robert Ramey wrote:
and my personal favorite - dataflow iterators. But I suspect this functionality has probably been covered by Ranges - I don't know this, I'm just deducing that from the name.
I don't really understand what dataflow iterators are. Isn't it just a syntactic shortcut in the constructor of an iterator adaptor that calls the base constructor recursively?
That's my name for it. All it is is an iterator adaptor with a templated constructor. This permits me to make new iterators by composing existing ones. All the work happens at compile time. I've suggested that the iterator adaptor have a templated constructor. Although I think there was some merit in the idea, there wasn't enough interest to do this. In any case, I suspect that ranges might have implemented similar functionality.
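To make the idea concrete, here is a minimal, hypothetical sketch (not the actual Boost.Serialization adaptors - the names doubled and plus_one are made up): each adaptor gets a templated constructor that forwards its argument down to the base iterator's constructor, so a composed iterator can be built directly from the innermost raw iterator.

#include <boost/iterator/iterator_adaptor.hpp>
#include <iostream>

// hypothetical adaptor that yields 2 * *base
template<class Base>
class doubled
  : public boost::iterator_adaptor<doubled<Base>, Base, int, boost::use_default, int>
{
    typedef boost::iterator_adaptor<doubled<Base>, Base, int,
                                    boost::use_default, int> super_t;
    friend class boost::iterator_core_access;
    int dereference() const { return 2 * *this->base_reference(); }
public:
    // the templated constructor: whatever we are given is forwarded
    // down to the Base iterator's constructor
    template<class T>
    doubled(const T& t) : super_t(Base(t)) {}
};

// hypothetical adaptor that yields *base + 1
template<class Base>
class plus_one
  : public boost::iterator_adaptor<plus_one<Base>, Base, int, boost::use_default, int>
{
    typedef boost::iterator_adaptor<plus_one<Base>, Base, int,
                                    boost::use_default, int> super_t;
    friend class boost::iterator_core_access;
    int dereference() const { return *this->base_reference() + 1; }
public:
    template<class T>
    plus_one(const T& t) : super_t(Base(t)) {}
};

int main()
{
    const int data[] = {1, 2, 3};
    // the whole composition is built straight from a raw pointer;
    // construction chains down through the nested adaptors
    typedef plus_one<doubled<const int*> > composed;
    for(composed it(data), end(data + 3); it != end; ++it)
        std::cout << *it << ' ';   // prints: 3 5 7
}

Neither adaptor knows about the other; the templated constructors are what let the composition be constructed in one step, and the compiler inlines the whole chain.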
Oh - codecvt_utf8. The whole codecvt thing is ripe for a library.
Since wchar_t is potentially 16-bit, utf8_codecvt_facet should do transcoding between UTF-8 and UTF-16, not between UTF-8 and UCS-2 as it does now.
However, it doesn't appear that it is possible to do an N-to-M conversion well with a codecvt facet, according to what someone said in another thread.
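To make the N-to-M point concrete: any code point above U+FFFF takes two 16-bit code units (a surrogate pair) in UTF-16, so a single UTF-8 sequence can expand into two output units. A small standalone example of the computation (U+1F600 is just an arbitrary non-BMP code point):

#include <cstdint>
#include <cstdio>

int main()
{
    // U+1F600 lies outside the BMP, so UCS-2 cannot represent it at all
    std::uint32_t cp = 0x1F600;
    std::uint32_t v = cp - 0x10000;
    unsigned high = 0xD800 + (v >> 10);    // high (lead) surrogate
    unsigned low  = 0xDC00 + (v & 0x3FF);  // low (trail) surrogate
    std::printf("U+%X -> %04X %04X\n", (unsigned)cp, high, low);
    // prints: U+1F600 -> D83D DE00
}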
I'm just describing what I would like to see. That's all.
I realise that some proposals have been made in this area. I haven't studied them in detail, so I don't want to be critical. But my experience with using the codecvt facility in the serialization library leads me to suspect that it is better than is generally appreciated. In fact, the whole C++ streams machinery is better than it first appears. The problem is that it's sort of obtuse. Some libraries to support it would help explain and promote this. I'm thinking of things like composable codecvt facets and alternative filebuf implementations. I've always felt the Boost streams library got a little off track by not leveraging the standard library enough - a missed opportunity in my opinion.
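For illustration, this is the basic mechanism being referred to - a codecvt facet imbued into a stream's locale drives the conversion the filebuf performs. The sketch below uses std::codecvt_utf8 from the C++11 <codecvt> header for brevity; a facet such as the serialization library's utf8_codecvt_facet can be imbued the same way.

#include <codecvt>   // std::codecvt_utf8 (C++11; later deprecated in C++17)
#include <fstream>
#include <locale>

int main()
{
    std::wofstream out("demo.txt");
    // the filebuf consults the imbued codecvt facet to turn the wide
    // characters we write into the bytes that end up in the file
    out.imbue(std::locale(out.getloc(), new std::codecvt_utf8<wchar_t>));
    out << L"caf\u00E9\n";   // written to the file as UTF-8
}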
What I've got as part of my Unicode library is a straightforward Converter concept and convert iterators/ranges. You define a Converter that describes how to do one step of an arbitrary variable-width N-to-M conversion with input and output iterators; then you can turn it into an iterator adaptor to convert as you traverse, or just apply it in a loop to do the conversion on the whole range eagerly. You can of course apply different Converters one after the other, or even compose Converters, although the latter has limits since the steps need to play nicely together (i.e. either the Converter needs to be stable by concatenation, or the one applied first needs to have fixed-width output).
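As a purely hypothetical sketch of what such a Converter might look like (the name u8_to_u32, the operator() signature, and the eager convert() driver below are assumptions for illustration, not the actual interface of the proposed Unicode library): each call performs one step, consuming part of the input and writing the converted units through an output iterator.

#include <cstdint>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

// hypothetical Converter: one step decodes a single UTF-8 sequence into
// a 32-bit code point (error handling omitted for brevity)
struct u8_to_u32
{
    template<class In, class Out>
    In operator()(In first, In last, Out& out) const
    {
        unsigned char b = static_cast<unsigned char>(*first++);
        std::uint32_t cp;
        int trailing;
        if      (b < 0x80) { cp = b;        trailing = 0; }
        else if (b < 0xE0) { cp = b & 0x1F; trailing = 1; }
        else if (b < 0xF0) { cp = b & 0x0F; trailing = 2; }
        else               { cp = b & 0x07; trailing = 3; }
        while(trailing-- > 0 && first != last)
            cp = (cp << 6) | (static_cast<unsigned char>(*first++) & 0x3F);
        *out++ = cp;
        return first;   // report how far this step consumed
    }
};

// eager driver: apply the Converter step by step over the whole range
template<class Converter, class In, class Out>
Out convert(In first, In last, Out out, Converter cv)
{
    while(first != last)
        first = cv(first, last, out);
    return out;
}

int main()
{
    std::string utf8 = "caf\xC3\xA9";              // "café" encoded as UTF-8
    std::vector<std::uint32_t> cps;
    std::back_insert_iterator<std::vector<std::uint32_t> > out(cps);
    convert(utf8.begin(), utf8.end(), out, u8_to_u32());
    for(std::size_t i = 0; i < cps.size(); ++i)
        std::cout << std::hex << cps[i] << ' ';    // prints: 63 61 66 e9
}

The same step function could equally sit behind an iterator adaptor that decodes lazily as the range is traversed, which is the other usage described above.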
That sounds like what I called "dataflow iterators". I did use it for implementing base64 output and lots of other conversions. The only thing I needed to do this was to add a templated constructor. The other thing I would like to see is a codecvt facet which takes an iterator adaptor as a template argument.
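For concreteness, here is roughly how that kind of composition produces base64 output, using the transform_width and base64_from_binary adaptors that ship with Boost.Serialization (a minimal sketch; the composition actually used inside the library may differ in detail):

#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>

int main()
{
    using namespace boost::archive::iterators;

    // 9 input bytes -> 12 base64 characters, a 3-to-4 (not 1-to-1) mapping
    const std::string data = "Boost C++";

    // compose: slice the 8-bit bytes into 6-bit groups, then map each group
    // onto its base64 character; the composed iterator is constructed
    // directly from the underlying string iterator thanks to the
    // templated constructors
    typedef base64_from_binary<
        transform_width<std::string::const_iterator, 6, 8>
    > base64_iterator;

    std::copy(base64_iterator(data.begin()), base64_iterator(data.end()),
              std::ostream_iterator<char>(std::cout));   // Qm9vc3QgQysr
    std::cout << '\n';
}

The inner transform_width does the bit slicing and the outer base64_from_binary does the character lookup; neither adaptor knows about the other, which is what makes the composition work.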
I have made a facility to make a codecvt facet out of any Converter, but I suspect it doesn't really work at all since I don't think I deal with "partial" cases correctly, and I haven't come up with a practical way of dealing with Converters that are not stable by concatenation.
This seems to me to be the right idea. But no doubt it's a lot harder than it looks - if it's doable at all.
The fact that you can't have anything other than char/char or wchar_t/char is also a bit limiting.
Note that I used the "dataflow iterator" to implement things that were not 1-to-1 (like base64 output of binary data).

Robert Ramey