
I just noticed this in the "lifetime of ranges vs. iterators" thread (which I've not really been following). Arno Schödl wrote:
rng | filtered( funcA ) | filtered( funcB ) | filtered( funcC ) | filtered( funcD ) | filtered( funcE )
I thought it worth pointing out the similarity, and also the difference, between this and the proposed dataflow notation. Here, operator| is being used like a shell pipe operator. In dataflow, operator| has a quite different meaning: it's a vertical line, distributing the output of "rng" to the inputs of the funcs in parallel. Confusing, perhaps?

Anyway, you could presumably write something like

rng >>= funcA >>= funcB ...

and I would be interested to hear how the two implementations compare. Is it true to say that stacked iterators implement a "data pull" style, while dataflow implements "data push"?

I also note that Arno wants to use stacked iterators because this alternative:

result = fn1( fn2( fn3( fn4( huge_document ) ) ) );

creates large intermediates and requires dynamic allocation. Again, a framework that allowed buffering of "sensible size" chunks and potentially distributed the work between threads could be a good solution.

Phil.

on Wed Sep 03 2008, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:
I just noticed this in the "lifetime of ranges vs. iterators" thread (which I've not really been following):
Arno Schödl wrote:
rng | filtered( funcA ) | filtered( funcB ) | filtered( funcC ) | filtered( funcD ) | filtered( funcE )
I thought it worth pointing out the similarity, and also the difference, between this and the proposed dataflow notation. Here, operator| is being used like a shell pipe operator. In dataflow, operator| has a quite different meaning: it's a vertical line, distributing the output of "rng" to the inputs of the funcs in parallel.
Very interesting.
Confusing, perhaps?
Perhaps.
Anyway you could presumably write something like
rng >>= funcA >>= funcB ....
For which library are you suggesting that notation?
and I would be interested to hear how the two implementations compare. Is it true to say that stacked iterators implement a "data pull" style, while dataflow implements "data push"?
I believe that's correct.
I also note that Arno wants to use stacked iterators because this alternative:
result = fn1( fn2( fn3( fn4( huge_document ) ) ) );
creates large intermediates and requires dynamic allocation.
Yes, that's one of the classic reasons for using iterator adaptors.
Again, a framework that allowed buffering of "sensible size" chunks and potentially distributed the work between threads could be a good solution.
Yes, parallelizing operations on such structures is an interesting problem. I think it may require the imposition of a segmented view over even nonsegmented structures.

-- Dave Abrahams
BoostPro Computing
http://www.boostpro.com

David Abrahams wrote:
on Wed Sep 03 2008, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote: [snip]
rng >>= funcA >>= funcB ....
For which library are you suggesting that notation?
That's the notation currently proposed for Dataflow. I would prefer no operator overloading here, but would accept operator|. Stjepan, is there any precedent (other languages etc.) for >>=?
Is it true to say that stacked iterators implement a "data pull" style, while dataflow implements "data push"?
I believe that's correct.
Hmm, but we can (in principle) stack both input iterator adaptors and output iterator adaptors, can't we? Doesn't that give us both pull and push? And there are also streams, and we have adaptors that convert streams into iterators (and vice-versa?). Is there some common ground (i.e. some shared concepts, or shared notation or vocabulary) here? Phil.
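[Editorial note: to make the pull/push distinction in the question above concrete, here is a minimal sketch in plain standard C++. The names are illustrative only, not from any of the libraries discussed: in the pull style the consumer drives the loop and asks for each value; in the push style the producer drives the loop and hands values to a sink.]

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Pull style: the consumer asks the source for the next value
// (this is how an input iterator behaves under the hood).
struct counting_source {
    int next = 0;
    int pull() { return next++; }
};

// Push style: the producer drives the loop and hands each value
// to a sink (this is how an output iterator or a signal behaves).
inline void push_count(int n, const std::function<void(int)>& sink) {
    for (int i = 0; i < n; ++i)
        sink(i);
}
```

Stacked input iterator adaptors compose the pull side; output iterator adaptors (and signal connections) compose the push side.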

On Wed, Sep 3, 2008 at 10:28 AM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
David Abrahams wrote:
on Wed Sep 03 2008, "Phil Endecott" <spam_from_boost_dev-AT-chezphil.org> wrote:
[snip]
rng >>= funcA >>= funcB ....
For which library are you suggesting that notation?
That's the notation currently proposed for Dataflow. I would prefer no operator overloading here, but would accept operator|. Stjepan, is there any precedent (other languages etc.) for >>=?
Not that I am aware of. It was inspired by the >> operator use in C++, e.g., in >> out1 >> out2, but I needed an operator that is evaluated right-to-left (I think it had to do with making certain expressions possible that use both branching and chaining). Also, I didn't want to clash with the common extraction semantics of >> when in fact a permanent connection was being created.

Now, it could be argued that >> or >>= would be more appropriate for branching, since an expression like "in >> out1 >> out2" typically implies that out1 and out2 both get their input from in (rather than out1 serving as a filter). And "|" might be more appropriate for chaining, because of the piping analogy that you mentioned. Perhaps the use of the two operators should be switched in the Dataflow library; I am open to that.

Since they are just syntactic sugar (they just call the connect function), it could even be possible to provide multiple sets of operators and let the user choose, but I could also see this being a source of confusion.

Stjepan

On Wed, Sep 3, 2008 at 7:55 PM, Stjepan Rajko <stipe@asu.edu> wrote:
On Wed, Sep 3, 2008 at 10:28 AM, Phil Endecott
That's the notation currently proposed for Dataflow. I would prefer no operator overloading here, but would accept operator|. Stjepan, is there any precedent (other languages etc.) for >>=?
Not that I am aware of. It was inspired by the >> operator use in C++, e.g.,
in >> out1 >> out2,
FWIW Haskell uses >>> and >>= for something not completely unrelated to the range usage of |. See http://tinyurl.com/yy9foz

-- gpd

Phil Endecott wrote:
Is it true to say that stacked iterators implement a "data pull" style, while dataflow implements "data push"?
I also note that Arno wants to use stacked iterators because this alternative:
result = fn1( fn2( fn3( fn4( huge_document ) ) ) );
creates large intermediates and requires dynamic allocation.
Personally I like to think of iterator adaptors as a form of lazy evaluation, while algorithms are eager evaluation.
Again, a framework that allowed buffering of "sensible size" chunks and potentially distributed the work between threads could be a good solution.
If you perform n transformations, adaptors will give you loop fusion for free. That kind of optimization seems more interesting to me than work distribution and buffering.
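[Editorial note: the loop-fusion point can be seen without any adaptor machinery. A minimal before/after sketch in plain C++ — the predicate and transform here are arbitrary placeholders:]

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Eager style: each transformation is a separate pass with a
// materialized intermediate vector.
inline std::vector<int> eager(const std::vector<int>& in) {
    std::vector<int> kept, squared;
    std::copy_if(in.begin(), in.end(), std::back_inserter(kept),
                 [](int x) { return x % 2 == 0; });
    std::transform(kept.begin(), kept.end(), std::back_inserter(squared),
                   [](int x) { return x * x; });
    return squared;
}

// What stacked adaptors give you "for free": one fused loop,
// both steps applied per element, no intermediate storage.
inline std::vector<int> fused(const std::vector<int>& in) {
    std::vector<int> out;
    for (int x : in)
        if (x % 2 == 0)
            out.push_back(x * x);
    return out;
}
```

A stack of filtered/transformed adaptors compiles down to something shaped like `fused`, while a sequence of eager algorithm calls is shaped like `eager`.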

Mathias Gaunard wrote:
a framework that allowed buffering of "sensible size" chunks and potentially distributed the work between threads could be a good solution.
If you perform n transformations, adaptors will give you loop fusion for free.
Maybe, subject to the fusion of all the levels' termination tests; I think this is what Dave has been talking about but I'm not knowledgeable about the area.
That kind of optimization seems more interesting to me than work distribution and buffering.
You're lucky if you get to work on "interesting" things, rather than "important" things :-)

Here's a practical example:

cat email_with_attached_picture | decode_base64 | decode_jpeg | resize_image > /dev/framebuffer

How can I convert that shell pipeline into C++? Naive approach:

vector<byte> a = read_file("/path/to/email");
vector<byte> b = decode_base64(a);
vector<byte> c = decode_jpeg(b);
vector<byte> d = resize_image(c);
write_file("/dev/framebuffer", d);

The problem with that is that I don't start to decode anything until I've read in the whole of the input. The system would be perceptibly faster if the decoding could start as soon as the first data were available.

So I can use some sort of iterator adaptor stack or dataflow graph to process the data a piece at a time. But it's important that I process it in pieces of the right size. Base64 decoding converts each 4 input bytes into 3 output bytes, but it would be a bad idea to read the data from the file 4 bytes at a time; we should probably ask for BUFSZ bytes. libjpeg works in terms of lines, and you can ask it (at runtime, after it has read the file header) how many lines it suggests processing at a time (it's probably the height of the DCT blocks in the image). Obviously that corresponds to a variable number of bytes in the input.

I would love to see how readers would approach this problem using the various existing and proposed libraries.

Regards, Phil.
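[Editorial note: one way to sketch the "process in pieces" idea without committing to any of the proposed libraries — compose the sink chain back-to-front, then feed fixed-size chunks through it, so downstream stages start work before the whole input has been read. The stage contents below are toy placeholders, not real base64/JPEG code, and all names are illustrative.]

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Chunk = std::vector<unsigned char>;
using Sink  = std::function<void(const Chunk&)>;
// A stage transforms a chunk and pushes the result downstream.
using Stage = std::function<void(const Chunk&, const Sink&)>;

// Feed `input` through the stages in fixed-size pieces, so later
// stages start work before the whole input has been consumed.
inline void run_pipeline(const Chunk& input, std::size_t bufsz,
                         const std::vector<Stage>& stages, const Sink& out) {
    // Compose the stages back-to-front into a single sink.
    Sink chain = out;
    for (auto it = stages.rbegin(); it != stages.rend(); ++it) {
        Stage stage = *it;
        Sink next = chain;
        chain = [stage, next](const Chunk& c) { stage(c, next); };
    }
    // Push the input through in bufsz-sized chunks.
    for (std::size_t i = 0; i < input.size(); i += bufsz) {
        std::size_t end = std::min(input.size(), i + bufsz);
        chain(Chunk(input.begin() + i, input.begin() + end));
    }
}
```

A real pipeline would additionally let each stage rebuffer internally (as libjpeg's line-oriented interface requires) rather than imposing one chunk size on every stage.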

On Wed, Sep 3, 2008 at 7:59 PM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
Mathias Gaunard wrote:
a framework that allowed buffering of "sensible size" chunks and potentially distributed the work between threads could be a good solution.
If you perform n transformations, adaptors will give you loop fusion for free.
Maybe, subject to the fusion of all the levels' termination tests; I think this is what Dave has been talking about but I'm not knowledgeable about the area.
That kind of optimization seems more interesting to me than work distribution and buffering.
You're lucky if you get to work on "interesting" things, rather than "important" things :-)
Here's a practical example:
cat email_with_attached_picture | decode_base64 | decode_jpeg | resize_image > /dev/framebuffer
How can I convert that shell pipeline into C++? Naive approach:
vector<byte> a = read_file("/path/to/email");
vector<byte> b = decode_base64(a);
vector<byte> c = decode_jpeg(b);
vector<byte> d = resize_image(c);
write_file("/dev/framebuffer", d);
The problem with that is that I don't start to decode anything until I've read in the whole of the input. The system would be perceptibly faster if the decoding could start as soon as the first data were available.
So I can use some sort of iterator adaptor stack or dataflow graph to process the data a piece at a time. But it's important that I process it in pieces of the right size. Base64 decoding converts each 4 input bytes into 3 output bytes, but it would be a bad idea to read the data from the file 4 bytes at a time; we should probably ask for BUFSZ bytes. libjpeg works in terms of lines, and you can ask it (at runtime, after it has read the file header) how many lines it suggests processing at a time (it's probably the height of the DCT blocks in the image). Obviously that corresponds to a variable number of bytes in the input.
I would love to see how readers would approach this problem using the various existing and proposed libraries.
Do you really think that the buffering size needs to be configurable? Given an appropriate buffering size (a memory page?) you could hide the buffering step inside an iterator adaptor which, instead of producing each value on the fly, would batch the production of enough elements to fill the buffer.

David: BTW, I think that you can use exactly the same abstraction used for segmented iterators to expose the buffering capability of a buffered iterator adaptor.

"All programming is an exercise in caching." -- Terje Mathisen

-- gpd
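[Editorial note: a minimal sketch of the buffering adaptor described above, in plain C++ with illustrative names — it batches production to fill a small fixed buffer, then hands values out one at a time, amortizing the per-element cost of the underlying producer. For simplicity it assumes an unbounded producer.]

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// An adaptor that hides buffering: instead of invoking the underlying
// producer once per element on demand, it fills a fixed-size buffer
// in a batch and then serves values out of that buffer.
class buffered_source {
public:
    buffered_source(std::function<int()> produce, std::size_t bufsz)
        : produce_(std::move(produce)), bufsz_(bufsz) {}

    int next() {
        if (pos_ == buf_.size())
            refill();
        return buf_[pos_++];
    }

private:
    void refill() {
        buf_.clear();
        // Batch production: assumes the producer never runs dry.
        for (std::size_t i = 0; i < bufsz_; ++i)
            buf_.push_back(produce_());
        pos_ = 0;
    }

    std::function<int()> produce_;
    std::size_t bufsz_;
    std::vector<int> buf_;
    std::size_t pos_ = 0;
};
```

As Giovanni notes, each refill reads a few elements ahead, which is exactly the small loss of laziness discussed later in the thread.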

on Wed Sep 03 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:
Given an appropriate buffering size (a memory page?) you could hide the buffering step inside an iterator adaptor which, instead of producing each value on the fly, would batch the production of enough elements to fill the buffer.
David: BTW, I think that you can use exactly the same abstraction used for segmented iterators to expose the buffering capability of a buffered iterator adaptor.
Yes, an iterator with a backing buffer would work great as a segmented iterator.

-- Dave Abrahams
BoostPro Computing
http://www.boostpro.com

On Thu, Sep 4, 2008 at 1:48 AM, David Abrahams <dave@boostpro.com> wrote:
on Wed Sep 03 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:
Given an appropriate buffering size (a memory page?) you could hide the buffering step inside an iterator adaptor which, instead of producing each value on the fly, would batch the production of enough elements to fill the buffer.
David: BTW, I think that you can use exactly the same abstraction used for segmented iterators to expose the buffering capability of a buffered iterator adaptor.
Yes, an iterator with a backing buffer would work great as a segmented iterator.
So, do you think that buffering could be a good approach to help reduce the abstraction overhead of stacked iterator adapters? I think you have to give up a bit of laziness (from the pull side it is harder to determine exactly how much of the input sequence you want to consume).

-- gpd

on Wed Sep 03 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:
On Thu, Sep 4, 2008 at 1:48 AM, David Abrahams <dave@boostpro.com> wrote:
on Wed Sep 03 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:
Given an appropriate buffering size (a memory page?) you could hide the buffering step inside an iterator adaptor which, instead of producing each value on the fly, would batch the production of enough elements to fill the buffer.
David: BTW, I think that you can use exactly the same abstraction used for segmented iterators to expose the buffering capability of a buffered iterator adaptor.
Yes, an iterator with a backing buffer would work great as a segmented iterator.
So, do you think that buffering could be a good approach to help reduce the abstraction overhead of stacked iterator adapters?
I guess I haven't caught on to your line of thinking. I certainly don't see how buffering could be specifically useful when iterator adaptations are nested.

-- Dave Abrahams
BoostPro Computing
http://www.boostpro.com

On Thu, Sep 4, 2008 at 4:11 AM, David Abrahams <dave@boostpro.com> wrote:
on Wed Sep 03 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:
On Thu, Sep 4, 2008 at 1:48 AM, David Abrahams <dave@boostpro.com> wrote:
on Wed Sep 03 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:
Given an appropriate buffering size (a memory page?) you could hide the buffering step inside an iterator adaptor which, instead of producing each value on the fly, would batch the production of enough elements to fill the buffer.
David: BTW, I think that you can use exactly the same abstraction used for segmented iterators to expose the buffering capability of a buffered iterator adaptor.
Yes, an iterator with a backing buffer would work great as a segmented iterator.
So, do you think that buffering could be a good approach to help reduce the abstraction overhead of stacked iterator adapters?
I guess I haven't caught on to your line of thinking. I certainly don't see how buffering could be specifically useful when iterator adaptations are nested.
[This is a bit of brainstorming; the idea is still quite fuzzy even in my mind, so feel free to ignore me.]

The idea is: stacked iterators are nice because they are a form of loop fusion, plus they are lazy. Problem: the fused loop is inside out, and requires formidable compiler effort to put it in a more straightforward form, especially if you have many levels of nesting.

Now, why is loop fusion good?

1) You perform your algorithm in a single pass instead of multiple passes (i.e. you do more stuff at every step), so the loop overhead is reduced.

2) As each layer in the stack consumes an element produced by the layer above it, performing all operations for each element in sequence often leads to better cache locality.

3) There is no need to create intermediate values, which implies no need for dynamic memory allocation and, most importantly, a smaller working set.

Now, 1 is debatable: with modern CPUs, just reducing loop overhead is counterproductive; the tighter the loop, the faster it executes (Intel Core 2 CPUs, for example, are especially optimized for loops that fit in ~64 bytes). In theory a very large fused loop could even fall out of the code cache, but that would be a very degenerate case. Anyway, loop fusion is not necessarily a must-have, and this proposal still allows it in many cases.

OTOH, 2 and 3 are desirable: we do not want to execute each logical pass sequentially as a whole; interleaving their execution is fundamental for good performance. But the interleaving granularity is not necessarily a single element. This is where buffering enters into play.

Let's make a variant of the copy_n algorithm a customization point for an iterator range. This is the signature:

OIter copy_n(Range& range, size_t& n, OIter o)

(note that both range and n are passed by non-const reference).
And this is the default implementation:

template<class MultiPassInputRange, class OIter>
OIter copy_n(MultiPassInputRange& r, size_t& n, OIter o)
{
    auto begin = r.begin();
    auto end = r.end();
    for (; n && begin != end; ++begin, --n, ++o)
        *o = *begin;
    r = MultiPassInputRange(begin, r.end());
    return o;
}

[There can be faster specializations for random access ranges and for pointer ranges, of course.]

Note that you can implement standard algorithms that do a single traversal of a complete range in terms of copy_n: all copy variants are straightforward; for_each just requires a function_output_iterator; so does accumulate, except that the functor is stateful; count is just a variant of accumulate. You can even implement some mutating algorithms, but then I do not think you can still call OIter just an output iterator (*o = *begin can actually change *begin). Those algorithms that do early exits would actually traverse a constant number of elements more than necessary (at most K-1, where K is the buffering size; strictly speaking I do not think that such an implementation would be conforming, because the standard states the exact number of comparisons).

But what if I want to iterate my stacked ranges with a plain old for loop? Well, you could fill a vector with all the elements of the range stack, but that would be wasteful. Better is to add a buffer range at the bottom of the stack. When you first access the begin of your range stack, the buffer range uses copy_n to fill its internal (fixed-size) buffer from the range right above it; then it returns elements from the buffer until it reaches the end, at which point it fills it again (here segmented iterators may help reduce the overhead of having to check for the end of the buffer at every step). Using a small fixed buffer and periodically refilling it is better than filling a whole vector, because you gain the advantages of points 2 and 3 above.
Now, a specific range can specialize copy_n to make it optimal. This is the filter range's copy_n:

void filter_helper(OIter& o, Pred p, Value& v)
{
    if (p(v))
        *o++ = v;
}

template<....>
OIter copy_n(filter_range<Base, Pred>& r, size_t& n, OIter o)
{
    while (n > 0)
        copy_n(r.base, n,
               function_output_iterator(
                   bind(filter_helper, ref(o), ref(r.p), _1)));
    return o;
}

map_iterator is really trivial and is left as an exercise for the reader.

Everything works well as long as you have to iterate over a single range. If you have to iterate over two ranges, you need to buffer one range and perform copy_n on the other:

void zip_helper(OIter& o, Iter& begin, Value& v)
{
    *o++ = make_tuple(*begin++, v);
}

template<...>
OIter copy_n(zip_range<R1, R2>& r, size_t& n, OIter o)
{
    // We actually want to make this a boost::array and wrap
    // everything in a while(n > 0)
    std::vector<typename R1::value_type> v(n);
    auto n2 = v.size();
    copy_n(r.base1, n2, v.begin());
    auto begin = v.begin();
    // assume both ranges have the same length
    copy_n(r.base2, n,
           function_output_iterator(
               bind(zip_helper, ref(o), ref(begin), _1)));
    return o;
}

Both zip and filter still have considerable abstraction overhead, but now all the loops are in the right order, and all redundant computation should be factored out. The compiler should be able to do a much better job at optimizing.

I'm sure I'm missing plenty of corner cases, and I'm pretty sure the abstraction only works well for linear traversal: it doesn't try to optimize random access, and I've completely ignored the possibility of going backward. Also, by buffering ahead, you lose a bit of laziness (i.e. the user can no longer predict which operations will be performed), but because buffering is bounded, you still have the ability to traverse (part of) infinite ranges.

So what do you think? Has it any chance to actually work in practice?

-- gpd
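[Editorial note: a compilable distillation of the copy_n idea above, under simplifying assumptions — raw iterator pairs instead of range objects, no function_output_iterator, and n counting elements of the *filtered* output rather than of the base sequence. All names are illustrative.]

```cpp
#include <cstddef>
#include <vector>

// Default "copy up to n elements" over an iterator pair; begin and n
// are updated in place so the caller can resume where it left off.
template <class Iter, class OIter>
OIter copy_n_range(Iter& begin, Iter end, std::size_t& n, OIter o) {
    for (; n && begin != end; ++begin, --n, ++o)
        *o = *begin;
    return o;
}

// Filtered version: the loop stays "right side out" — one plain loop
// over the base sequence, applying the predicate inline, instead of
// the inside-out loop a stacked filter_iterator would produce.
template <class Iter, class Pred, class OIter>
OIter copy_n_filtered(Iter& begin, Iter end, std::size_t& n, Pred p, OIter o) {
    for (; n && begin != end; ++begin)
        if (p(*begin)) {
            *o = *begin;
            ++o;
            --n;
        }
    return o;
}
```

The point of the exercise: both loops are ordinary forward loops the compiler can optimize directly, while still letting the caller pull the output in bounded batches.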

on Fri Sep 05 2008, "Giovanni Piero Deretta" <gpderetta-AT-gmail.com> wrote:
So what do you think? Has it any chance to actually work in practice?
Sorry, I couldn't get what you were driving at in that long posting. It seemed like a lot of what you were discussing was based on trying to write one "universal algorithm" through which you could do everything else as a way of reusing optimizations, sort of like mpl::iter_fold_if. But I don't really see how that ties in with buffering. If you could make your ideas more concise, it would help.

-- Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Phil Endecott wrote:
Here's a practical example:
cat email_with_attached_picture | decode_base64 | decode_jpeg | resize_image > /dev/framebuffer
How can I convert that shell pipeline into C++? Naive approach:
vector<byte> a = read_file("/path/to/email");
vector<byte> b = decode_base64(a);
vector<byte> c = decode_jpeg(b);
vector<byte> d = resize_image(c);
write_file("/dev/framebuffer", d);
The problem with that is that I don't start to decode anything until I've read in the whole of the input. The system would be perceptibly faster if the decoding could start as soon as the first data were available.
So I can use some sort of iterator adaptor stack or dataflow graph to process the data a piece at a time. But it's important that I process it in pieces of the right size. Base64 decoding converts each 4 input bytes into 3 output bytes, but it would be a bad idea to read the data from the file 4 bytes at a time; we should probably ask for BUFSZ bytes. libjpeg works in terms of lines, and you can ask it (at runtime, after it has read the file header) how many lines it suggests processing at a time (it's probably the height of the DCT blocks in the image). Obviously that corresponds to a variable number of bytes in the input.
I would love to see how readers would approach this problem using the various existing and proposed libraries.
Since we're talking about systems that combine various approaches to data processing, I should point out my IOChain library once more: http://lists.boost.org/Archives/boost/2008/02/132953.php It does something very similar to what you describe here. IOStreams also does something like this with its filters.

Sebastian

On Wed, Sep 3, 2008 at 5:36 AM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
I just noticed this in the "lifetime of ranges vs. iterators" thread (which I've not really been following):
Arno Schödl wrote:
rng | filtered( funcA ) | filtered( funcB ) | filtered( funcC ) | filtered( funcD ) | filtered( funcE )
I thought it worth pointing out the similarity, and also the difference, between this and the proposed dataflow notation. Here, operator| is being used like a shell pipe operator. In dataflow, operator| has a quite different meaning: it's a vertical line, distributing the output of "rng" to the inputs of the funcs in parallel. Confusing, perhaps?
Yes, I was anticipating that there would be possible confusion between the dataflow library use of "|" for branching and the common use of "|" for piping. A different operator could be used in dataflow, if preferable.
Anyway you could presumably write something like
rng >>= funcA >>= funcB ....
and I would be interested to hear how the two implementations compare. Is it true to say that stacked iterators implement a "data pull" style, while dataflow implements "data push"?
Dataflow.Signals networks are typically implemented as push networks, but they can also be used for pull-processing: http://www.dancinghacker.com/code/dataflow/dataflow/signals/introduction/tut... The direction indicated by >>= aligns with the direction of the signal (function call), but the data can flow either way (either sent forward in the function call arguments, or sent back through the return value). So, you could do rng >>= funcA >>= funcB or funcB >>= funcA >>= rng depending on how the func and rng components are implemented.
I also note that Arno wants to use stacked iterators because this alternative:
result = fn1( fn2( fn3( fn4( huge_document ) ) ) );
creates large intermediates and requires dynamic allocation. Again, a framework that allowed buffering of "sensible size" chunks and potentially distributed the work between threads could be a good solution.
As far as the Dataflow library goes, some sort of "automatic task division" library would indeed be great in conjunction with dataflow, but I see this as orthogonal to dataflow. Automatic task division could be useful without dataflow, and dataflow could be useful without automatic task division. Is it your opinion that some sort of task division strategy would be necessary for the Dataflow library to be useful?

Kind regards, Stjepan

Stjepan Rajko wrote:
As far as the Dataflow library goes, some sort of "automatic task division" library would indeed be great in conjunction with dataflow, but I see this as orthogonal to dataflow. Automatic task division could be useful without dataflow, and dataflow could be useful without automatic task division. Is it your opinion that some sort of task division strategy would be necessary for the Dataflow library to be useful?
That other task library you are describing sounds a lot like http://www.threadingbuildingblocks.org/

Thanks,
Michael Marcin

On Wed, Sep 3, 2008 at 9:30 AM, Michael Marcin <mike.marcin@gmail.com> wrote:
Stjepan Rajko wrote:
As far as the Dataflow library goes, some sort of "automatic task division" library would indeed be great in conjunction with dataflow, but I see this as orthogonal to dataflow. Automatic task division could be useful without dataflow, and dataflow could be useful without automatic task division. Is it your opinion that some sort of task division strategy would be necessary for the Dataflow library to be useful?
That other task library you are describing sounds a lot like http://www.threadingbuildingblocks.org/
Thank you for the reference. If I grab some time I might try to come up with an example of using Threading Building Blocks with the Dataflow library. Best, Stjepan

Stjepan Rajko wrote:
The direction indicated by >>= aligns with the direction of the signal (function call), but the data can flow either way (either sent forward in the function call arguments, or sent back through the return value). So, you could do
rng >>= funcA >>= funcB
or
funcB >>= funcA >>= rng
depending on how the func and rng components are implemented.
That seems a bit odd; this is a DATA-flow library, so I would expect the direction of the symbol to describe the direction of the data flow. And isn't the data passing method a detail of the implementation (in this case Signals) that your Generic layer should be hiding? But I'm not sure I understand you; here rng is a data source and (surely) it supplies values via its return value; how can it be implemented to supply values via a parameter?
As far as the dataflow library goes, some sort of a "automatic task division" library would indeed be great in conjunction with dataflow, but I see this as orthogonal to dataflow.
It doesn't have to be _automatic_ (i.e. runtime) task division; just some way of running some components in their own threads. The thread-safe-signals work that you linked to before may be sufficient for this, though I don't know enough about Boost.Signals to fully understand it.

Phil.

On Wed, Sep 3, 2008 at 11:13 AM, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
Stjepan Rajko wrote:
The direction indicated by >>= aligns with the direction of the signal (function call), but the data can flow either way (either sent forward in the function call arguments, or sent back through the return value). So, you could do
rng >>= funcA >>= funcB
or
funcB >>= funcA >>= rng
depending on how the func and rng components are implemented.
That seems a bit odd; this is a DATA-flow library, so I would expect the direction of the symbol to describe the direction of the data flow. And isn't the data passing method a detail of the implementation (in this case Signals) that your Generic layer should be hiding?
The data flow with signals is bidirectional in the general case. For example,

float fn(int x, int &y, int &z)
{
    y = x + y;
    z = x + 1;
    return y*2;
}

...has the following flow of data:

* data flows in through x and y (and z, but there it is ignored)
* data flows out through y, z and the return value.

Since Dataflow.Signals uses Boost.Signals, which uses function calls, Dataflow.Signals is inherently a bidirectional framework. To make things simpler, most of the Dataflow.Signals documentation (and indeed, many of the provided components) is geared towards a simplified push-based approach where the data flows left to right (with left and right as in connect(a,b) or a >>= b). But, as indicated above, data could just as easily flow right to left, or in both directions, in which case a >>= b becomes misleading. I could provide <<= as well, so the user could use a <<= b to indicate data flowing right to left. Or, the user could just always use connect(a,b).

I think that when the flow of data becomes complicated (as in the fn function above), the dataflow programming paradigm has a good chance of turning into a headache and should probably not be used. That is why I tend to limit the discussion in the documentation to purely push-based or purely pull-based networks in the case of Dataflow.Signals.
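[Editorial note: the fn example above, made runnable, with each of its data flows checked explicitly.]

```cpp
// Data flows in through x and y; out through y, z, and the return value.
inline float fn(int x, int& y, int& z) {
    y = x + y;
    z = x + 1;
    return y * 2;
}
```

Calling it as below shows why a single >>= arrow cannot describe all three output paths at once.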
But I'm not sure I understand you; here rng is a data source and (surely) it supplies values via its return value; how can it be implemented to supply values via a parameter?
I'm sorry, I have a tendency to throw up syntax and neglect to explain the semantics :-(. I was assuming that rng in rng >>= funcA >>= funcB was a Boost.Dataflow component, as the input component in this example: http://www.dancinghacker.com/code/dataflow/dataflow/introduction/dataflow.ht... ...or the generator (used in combination with timer) here: http://www.dancinghacker.com/code/dataflow/dataflow/signals/introduction/exa...
As far as the dataflow library goes, some sort of a "automatic task division" library would indeed be great in conjunction with dataflow, but I see this as orthogonal to dataflow.
It doesn't have to be _automatic_ (i.e. runtime) task division; just some way of running some components in their own threads. The thread-safe-signals work that you linked to before may be sufficient for this, though I don't know enough about Boost.Signals to fully understand it.
I will try to expand that example and add some documentation / comments to make it clearer. It would be a good thing to discuss as a part of the review. Kind regards, Stjepan

Note that the serialization library's "Dataflow iterators" use "stacked iterators" created at compile time:

fn1< fn2< fn3< fn4<T> > > (iterator on huge document)

You might find it interesting to look at this.

Robert Ramey

Phil Endecott wrote:
I also note that Arno wants to use stacked iterators because this alternative:

result = fn1( fn2( fn3( fn4( huge_document ) ) ) );
creates large intermediates and requires dynamic allocation. Again, a framework that allowed buffering of "sensible size" chunks and potentially distributed the work between threads could be a good solution.
participants (8)
- David Abrahams
- Giovanni Piero Deretta
- Mathias Gaunard
- Michael Marcin
- Phil Endecott
- Robert Ramey
- Sebastian Redl
- Stjepan Rajko