[boost] Re: Functions as Filters (was Program Reuse...)

13 Jan 2005

Daniel James wrote:
...
Jonathan Turkanis wrote:
...
It's not that I don't see the benefit of lower complexity; I don't
really see the lower complexity. ;-)
Simon Tatham provides a nice motivation for this kind of thing at:
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
Here's a rough (untested) translation of his run-length decoding
example:
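void decompress(std::istream& in, std::ostream& out)
{
  char c;
  while(in.get(c))
  {
    // Compare as unsigned char; a signed char never equals 0xFF.
    if(static_cast<unsigned char>(c) == 0xFF) {
      int len = in.get();
      if(!in || !in.get(c)) return; // Truncated input: return an error.
      while(len--) out.put(c);
    }
    else {
      out.put(c);
    }
  }
}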
And as a filter:
struct toupper_filter : input_filter {
  int repeat_char;
  int repeat_length;
  toupper_filter() : repeat_char(0), repeat_length(0) {}
  template<typename Source>
  int get(Source& src)
  {
    if(repeat_length > 0) {
      // Still inside a run: emit the stored character again.
      repeat_length--;
      return repeat_char;
    }
    else {
      // int rather than char, so the 0xFF test also works where char is signed.
      int c = boost::io::get(src);
      if(c == 0xFF) {
        repeat_length = boost::io::get(src);
        repeat_char = boost::io::get(src);
        repeat_length--;
        return repeat_char;
      }
      else {
        return c;
      }
    }
  }
};
And that's a fairly simple example. (Sorry if you have a better way to
do this, I haven't really looked at the library).
Nice example! That's the type of evidence I was hoping Christopher would
produce. Referring to one of my old messages, which doesn't seem to be archived
yet, you have written a filter in form [B] (using streams as function
arguments). I think [C] (using Sources and Sinks) would be sufficient here. So
I'm leaning toward allowing filters along the lines of [C].

I guess I should mention that this was first suggested by Rob Stewart in a
private email during the iostreams review:

<email>

Jonathan Turkanis wrote:
...
Rob Stewart wrote:
...
Can we simplify all of this to the following?
template <typename Source, typename Sink>
   unspecified-status-indicator
   filter(Source & in, Sink & out);
IOW, if the framework provided both the source and the sink, the
call to filter() would cause data to flow from in to out.
Whether the data flow is input or output for the entire stream
doesn't matter.  The filter just knows its own source and sink.
The source and sink could even be objects in your library that
wrap a Device and hook into the framework mechanisms to move data
along, if you need to intervene in any way.  That's particularly
useful for async I/O.
...
I've thought of this too, and I like it. This could be called a
CoprocessFilter.
</email>
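To make the filter(in, out) shape concrete, here is a rough sketch of the run-length decoder written that way. string_source, string_sink and rle_decode are hypothetical names made up for illustration, not types or functions from the library, and a plain bool stands in for the unspecified status indicator.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a Source; not a library type.
struct string_source {
    std::string data;
    std::size_t pos = 0;
    // Returns the next byte as 0..255, or -1 at end of input.
    int get()
    {
        return pos < data.size() ? static_cast<unsigned char>(data[pos++]) : -1;
    }
};

// Hypothetical stand-in for a Sink; not a library type.
struct string_sink {
    std::string data;
    void put(char c) { data.push_back(c); }
};

// Run-length decoder in the filter(in, out) shape.
// Returns false if the input ends in the middle of a run descriptor.
template <typename Source, typename Sink>
bool rle_decode(Source& in, Sink& out)
{
    int c;
    while ((c = in.get()) != -1) {
        if (c == 0xFF) {
            int len = in.get();  // run length
            int rep = in.get();  // character to repeat
            if (len == -1 || rep == -1) return false;
            while (len-- > 0) out.put(static_cast<char>(rep));
        } else {
            out.put(static_cast<char>(c));
        }
    }
    return true;
}
```

The body is essentially the stream version again; only the types of the two arguments change, so the filter needs to know nothing beyond its own source and sink.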
...
Daniel James wrote:
Jonathan Turkanis wrote:
...
Your version requires that an entire stream
of data be processed at once -- leading to poor memory use -- and
doesn't work at all for streams which have no natural end.
Not necessarily, he could use threads or fibres with pipes, although
that's quite expensive. That's why I was playing around with using a
Duff's Device style switch statement for implementing coroutines.
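For readers who haven't seen the trick, a minimal sketch of the switch-based approach from Simon Tatham's article, applied to the same run-length decoder, might look like the following. The class rle_coroutine and its members are hypothetical names chosen for illustration; this is not the code Daniel was experimenting with.

```cpp
#include <cstddef>
#include <string>

// Hypothetical pull-style decoder: each call to next() resumes where the
// previous call returned, using a saved state number and a switch.
class rle_coroutine {
public:
    explicit rle_coroutine(const std::string& input) : in_(input) {}

    // Returns the next decoded byte (0..255), or -1 when input is exhausted.
    int next()
    {
        switch (state_) {
        case 0:
            for (;;) {
                c_ = read();
                if (c_ == -1) break;
                if (c_ == 0xFF) {
                    len_ = read();
                    c_ = read();
                    if (len_ == -1 || c_ == -1) break;
                    while (len_-- > 0) {
                        state_ = 1;   // resume inside the run loop
                        return c_;
        case 1:;
                    }
                } else {
                    state_ = 2;       // resume after the literal byte
                    return c_;
        case 2:;
                }
            }
        }
        state_ = 3;  // no matching case: later calls skip the switch body
        return -1;
    }

private:
    int read()
    {
        return pos_ < in_.size() ? static_cast<unsigned char>(in_[pos_++]) : -1;
    }

    std::string in_;
    std::size_t pos_ = 0;
    // Everything that must survive between calls lives in members,
    // because locals are lost each time next() returns.
    int state_ = 0, c_ = 0, len_ = 0;
};
```

The control flow stays the same as in the straight-line decompress() function, while the caller still pulls one character at a time, as with the hand-written filter.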
I think I already answered this. Standard input streams don't have a way to
indicate that input is temporarily unavailable. Therefore if the producer and
consumer are operating in separate threads and the consumer gets ahead of the
producer, a false EOF will be detected.

Version [C] will not suffer from this problem, because the final version of the
filter and device concepts will provide a way to distinguish EOF from EAGAIN.
Unfortunately, there's no way to retrofit this onto the standard streams or
stream buffers, since they do not recognize this distinction.
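As a rough illustration of the distinction, a source could report the two conditions with separate return values, something like the sketch below. The names eof_value, would_block_value, chunked_source and drain are all assumptions for this example; the final concepts may well spell things differently.

```cpp
#include <cstddef>
#include <string>

// Hypothetical status codes; names and values are assumptions.
const int eof_value = -1;          // no more data will ever arrive
const int would_block_value = -2;  // nothing available right now; retry later

// Hypothetical source that a producer fills incrementally.
struct chunked_source {
    std::string buffer;
    std::size_t pos = 0;
    bool closed = false;

    void feed(const std::string& s) { buffer += s; }
    void close() { closed = true; }

    // Returns a byte (0..255), would_block_value, or eof_value.
    int get()
    {
        if (pos < buffer.size())
            return static_cast<unsigned char>(buffer[pos++]);
        return closed ? eof_value : would_block_value;
    }
};

// Copies whatever is currently available into out and returns the
// status code that stopped the loop.
int drain(chunked_source& src, std::string& out)
{
    int c;
    while ((c = src.get()) >= 0)
        out.push_back(static_cast<char>(c));
    return c;
}
```

A consumer that gets ahead of the producer sees would_block_value and can simply try again later, instead of mistaking the gap for the end of the stream.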
...
Daniel
Jonathan