
On Wed, Sep 3, 2008 at 5:36 AM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
> I just noticed this in the "lifetime of ranges vs. iterators" thread (which I've not really been following):
>
> Arno Schödl wrote:
>
>> rng | filtered( funcA ) | filtered( funcB ) | filtered( funcC ) | filtered( funcD ) | filtered( funcE )
>
> I thought it worth pointing out the similarity, and also the difference, between this and the proposed dataflow notation. Here, operator| is being used like a shell pipe operator. In dataflow, operator| has a quite different meaning: it's a vertical line, distributing the output of "rng" to the inputs of the funcs in parallel. Confusing, perhaps?
Yes, I anticipated possible confusion between the dataflow library's use of "|" for branching and the common use of "|" for piping. A different operator could be used in dataflow, if that is preferable.
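For reference, here is what I understand the range-pipeline meaning of "|" to be -- a small compilable sketch using Boost.Range adaptors (the predicates are just placeholders):

#include <boost/range/adaptor/filtered.hpp>
#include <boost/foreach.hpp>
#include <vector>
#include <iostream>

// placeholder predicates standing in for funcA, funcB, ...
bool funcA(int i) { return i % 2 == 0; }
bool funcB(int i) { return i % 3 == 0; }

int main()
{
    using boost::adaptors::filtered;

    std::vector<int> rng;
    for (int i = 0; i < 100; ++i)
        rng.push_back(i);

    // Each operator| wraps the previous range in a lazy filtered view;
    // elements are tested on demand as the result is iterated.
    BOOST_FOREACH (int i, rng | filtered(funcA) | filtered(funcB))
        std::cout << i << '\n';
}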
> Anyway you could presumably write something like
>
> rng >>= funcA >>= funcB ....
>
> and I would be interested to hear how the two implementations compare. Is it true to say that stacked iterators implement a "data pull" style, while dataflow implements "data push"?
Dataflow.Signals networks are typically implemented as push networks, but they can also be used for pull processing: http://www.dancinghacker.com/code/dataflow/dataflow/signals/introduction/tut... The direction indicated by >>= matches the direction of the signal (the function call), but the data can flow either way (sent forward as a function call argument, or sent back through the return value). So, you could write rng >>= funcA >>= funcB or funcB >>= funcA >>= rng, depending on how the func and rng components are implemented.
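To make the push/pull distinction concrete, here is a minimal sketch using plain function objects (not the actual Dataflow.Signals components, just an illustration of which side drives the call and where the data travels):

#include <iostream>

// Push style: the producer drives the network; data travels forward
// as the function call argument.
struct printer {
    void operator()(int x) { std::cout << "push got " << x << '\n'; }
};

struct counter_push {
    printer* sink;
    void run() { for (int i = 0; i < 3; ++i) (*sink)(i); }
};

// Pull style: the consumer drives the network; data travels backward
// through the return value.
struct counter_pull {
    int next;
    int operator()() { return next++; }
};

struct printer_pull {
    counter_pull* source;
    void run() { for (int i = 0; i < 3; ++i) std::cout << "pull got " << (*source)() << '\n'; }
};

int main()
{
    printer p;               counter_push cp = { &p }; cp.run();
    counter_pull c = { 0 };  printer_pull pp = { &c }; pp.run();
}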
> I also note that Arno wants to use stacked iterators because this alternative:
>
> result = fn1( fn2( fn3( fn4( huge_document ) ) ) );
>
> creates large intermediates and requires dynamic allocation. Again, a framework that allowed buffering of "sensible size" chunks and potentially distributed the work between threads could be a good solution.
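For illustration, here is the contrast as I understand it, sketched with Boost.Range's transformed adaptor (the per-character functions and huge_document are placeholders, not anyone's actual code):

#include <boost/range/adaptor/transformed.hpp>
#include <boost/foreach.hpp>
#include <string>
#include <cctype>
#include <iostream>

// Placeholder per-character transformations standing in for fn3 and fn4.
struct to_upper {
    typedef char result_type;
    char operator()(char c) const
    { return static_cast<char>(std::toupper(static_cast<unsigned char>(c))); }
};
struct space_to_underscore {
    typedef char result_type;
    char operator()(char c) const { return c == ' ' ? '_' : c; }
};

int main()
{
    std::string huge_document(10 * 1024 * 1024, 'x');

    // Eager nesting: each fn would build and return a complete intermediate
    // copy of the document before the next fn runs:
    //   result = fn3( fn4( huge_document ) );

    // Stacked iterators: the adaptors only wrap the underlying range, and
    // each character is transformed on demand, with no intermediate copies.
    using boost::adaptors::transformed;
    std::size_t count = 0;
    BOOST_FOREACH (char c, huge_document | transformed(space_to_underscore())
                                         | transformed(to_upper()))
        count += (c == 'X');
    std::cout << count << '\n';
}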
As far as the dataflow library goes, some sort of "automatic task division" library would indeed be great in conjunction with dataflow, but I see the two as orthogonal: automatic task division could be useful without dataflow, and dataflow could be useful without automatic task division. Is it your opinion that some sort of task division strategy would be necessary for the dataflow library to be useful?

Kind regards,

Stjepan