
on Wed Dec 12 2012, Julian Gonggrijp <j.gonggrijp-AT-gmail.com> wrote:
> Dave Abrahams wrote:
>> If you want such a library to be truly efficient, you should be able to wire together the dataflows at compile-time and not pay any overhead for composing large computations out of smaller ones. The accumulators library uses something like a dataflow model, with the "wiring" done completely at compile-time.
> Because of your remark I studied the Accumulators library, and I must say it's very elegant indeed. Dependencies between computations are resolved at compile-time (wow!) so they don't add unnecessary overhead at run-time, and everything is executed in the most efficient order.
> However, note the word "order". Boost.Accumulators was clearly designed under the assumption of single-threaded operation and tightly interdependent computations. I believe it may well be the best possible approach under those assumptions, but when the assumptions don't apply, other considerations may call for a different solution.
> "Dataflow" is a very general term, but the people who advertise with that word typically mean something more specific, namely lock-free multithreading with message passing. The underlying assumptions are more or less the opposite of those in Boost.Accumulators: computations can run concurrently without waiting for each other, as long as they have input available and somewhere to dispose of their output. If computations can take place /at the same time/, it does not really matter whether connections are made at compile-time or at run-time; /waiting/ is eliminated altogether, so building connections at run-time adds only constant overhead.
It's not just the cost of *building* dynamic connections but also the cost of *traversing* them. Of course, if your sub-computations happen to have large granularity, these costs may be swamped by others, but a truly flexible system would allow you to tune the approach to the problem. The accumulators library was first used in a massively-parallel context with each process running thousands or millions of accumulator sets. Obviously it depends on your problem domain, but that might be a more efficient way to break down the parallel processing. A truly flexible system would (as Jeremiah suggested his design does) allow you to combine static and dynamic wiring (sort of like Xpressive does).
> That constant overhead is worth it if accepting it helps you to avoid other problems. Note that the "wires" built by Accumulators at compile-time are /logical/ connections between /types/ that represent operations. Message-passing "wires" are /physical/ connections between /threads/.
The accumulators library's wires are just as "physical" as any others; they're just encoded in the instruction stream instead of in mutable data. I don't see the relevance of your emphasis on the word "types."
> Instantiating the latter kind of connection at compile-time seems very hard, if not impossible.
No, you could easily use threads with a system like the accumulators framework.
> Actually I think the Accumulators framework and lock-free message passing are completely orthogonal.
Exactly. It seems as though you're now contradicting yourself.
> One could use a "dataflow" library to connect threads at run-time while composing the computations for the individual threads at compile-time with Boost.Accumulators, and get the best of both worlds.
> I hope I'm making sense. Please let me know what you think.
I think you eventually made sense :-)

-- 
Dave Abrahams
BoostPro Computing
Software Development Training
http://www.boostpro.com
Clang/LLVM/EDG Compilers  C++  Boost