Dear Boost,
Given Hartmut's feedback from my previous attempt at a non-allocating
future-promise implementation, I now have a second, much more generic
demonstration prototype which is based on heterogenous sequences of
constexpr functional calls. I think this design may have huge potential.
Consider some heterogenous sequence of callables C { f1, f2, f3, f4, ... }
such that fN(... f4(f3(f2(f1(Args...))))) -> R i.e. if you call the
heterogenous sequence with arguments Args..., f1(args...) will be called
and fed to f2() which then is fed to f3() up to fN() whose return type is
R. If I understand proposed Boost.Hana correctly, these potentially
constexpr sequences of heterogeneous callables are the main bridge between
pure compile-time heterogeneous type sequences and doing something with
them in real code as they can decay on demand into actual code when needed.
And for proposed Boost.Expected, its monadic programming sequences bind(),
map() and then() all functionally operate on some expected, consuming
an instance and returning new instances, so these monadic sequences are
also potentially constexpr sequences of heterogenous callables.
Now, what new basic_future-basic_promise will do is to allow you to insert
a suspend-resume of execution into some functional sequence C, so
essentially:
split<idx>(C) => [ basic_promise<X>, basic_future<Y> ]
... where X is some split front part of C and Y is some back part of C. You
can now execute the first part of C via the call operator:
basic_promise<X>(Args...)
Execution of the C0...Cn making up X occurs, but instead of continuing into
Y it stores the result of Cn and makes that available to the corresponding
basic_future. At some later point you can resume the execution of C:
basic_future<Y>() -> R
... and thus complete the execution of the functional sequence C, returning
the same value R as invoking the original functional sequence C with the
same input parameters.
For those who prefer to understand theory via code, here is some example
C++ 14 code:
// Let this be an arbitrarily long sequence of functional callables
auto ct=make_continuations([](auto x){return 5+x;}, [](auto x){ return
x+6;});
{
// Make a future promise pair for if we called c(int)
auto fp=ct.suspend_resume<1, int>();
// Execute promise
fp.first(1);
// Execute future
b=std::move(fp.second)();
}
{
// Make a future promise pair for if we called c(double)
auto fp=ct.suspend_resume<1, double>();
// Execute promise
fp.first(1.0);
// Execute future
d=std::move(fp.second)();
}
// b=12, d=12.0
There are a multitude of uses for this very fundamental pattern. Let us try
implementing something closely resembling the Concurrency TS's dynamic
continuations support .then(launch, callable). Let us define this
functional continuation:
template<class T> struct then_continuation
{
std::vector> _conts;
template<class U> auto then(U &&c)
{
typedef decltype(c(std::declval<T>())) result_type;
promise p;
future ret(p.get_future());
// Unfortunately libstdc++'s std::function currently always tries to
copy lambda types
//_conts.emplace_back([p=std::move(p), c=std::forward<U>(c)](T
v){p.set_value(c(v));});
_conts.push_back([p =
std::make_shared>(std::move(p)),
c=std::forward<U>(c)](T v){p->set_value(c(v));});
return ret;
}
BOOST_CONSTEXPR T operator()(T v) const
{
for(auto &i : _conts)
i(v);
return v;
}
};
All this continuation does is provide a .then(callable) which stores those
callables into a std::vector and its operator() executes the previously
stored callables.
Now let us define a std::promise lookalike, with its main deficiencies
being a total lack of exception_ptr support (one would use expected
as the transport type for that) and of course any thread safety:
template<class T> class promise : public
basic_promise ...
basic_promise takes three template parameters, the type of the front split
of the continuations, the transported type between the two halves of
continuations, and the type of the back split of the continuations. Here,
we are defining promise as transporting a type T, and two then_continuation
continuations exist before and after the transport of the type T.
To our future<T> lookalike we now add this:
template<class U> auto then(launch policy, U &&c)
{
if(policy==launch::deferred)
return Base::_after.then(std::forward<U>(c));
else if(Base::_other)
return Base::_other->_before.then(std::forward<U>(c));
}
So, if the .then() dynamic continuation is deferred, push it into the
latter then_continuation, else push it into the former.
And voila, you can now do this very similar to Concurrency TS code:
promise<int> p;
future<int> f(p.get_future());
future<double> f2(f.then(launch(), [](int a){return a+1.0;}));
p.set_value(5);
return f2.get();
And you get back 6.0, as expected.
However here is where things get neato. How many instructions do you think
GCC 4.9 generates for this sequence?
promise<int> p;
future<int> f(p.get_future());
p.set_value(5);
return f.get();
In traditional future-promise, that is an operation perhaps consuming
thousands of CPU cycles and at least one memory allocation and
deallocation. Well, this is the assembler produced by the above by GCC 4.9:
_Z19test_futurepromise4v:
movq $0, -88(%rsp)
movq $0, -80(%rsp)
movl $5, %eax
movq $0, -72(%rsp)
movq $0, -48(%rsp)
movq $0, -40(%rsp)
movq $0, -32(%rsp)
movl $5, -56(%rsp)
movq $0, -64(%rsp)
movq $0, -24(%rsp)
ret
Believe it or not, this is actually bad output - there is a bug in GCC's
optimiser and it should in fact be outputting a single instruction rather
than writing zeros into the stack for no purpose - it seems to get confused
by the double declation of a std::vector which is never used, because
having just one std::vector produces the ideal output but having two
produces the above. However clang 3.4's optimiser does far worse, and a bug
there with how non-immediately defined types are treated causes complete
expansion of code in its entirety, even if you remove all std::vector's
(supplying any type not immediately declared in the same function breaks
the optimiser for some unknown reason. I haven't tried clang 3.5 yet, it
may be fixed).
Other very interesting policy continuations which can be statically mixed
into custom future promise types include calling some application specific
callback without needing to indirect via a pointer to a function. This lets
you combine future promise with invoking callbacks, useful for legacy code.
The idea behind this suite of future-promise fundamental parts is that they
can be mixed at will into bespoke future-ish promise-ish application
specific future promise types. Because they all share the same foundations,
a when_any() or when_all() can accept heterogenous future types - and I
don't mean future<int> with future<double>, rather I mean afio::future<>
with qt::future<int>, that sort of heterogenity.
Another very interesting idea is combining these with signals and slots, so
via custom continuations you can keep slots and fire signals in sync with
future promises. This design I think shows a lot of potential.
Anyway, I feel very happy with this design, and something along these lines
is where I shall proceed next. The eventual intention is for these future
promises to become the default Boost.Thread future promises when C++11 is
available, though this will be up to a year from now. I won't be able to
support future-promise custom allocators, but I certainly should be able to
completely avoid memory allocation and sometimes the generation of any code
at all with this design.
The demonstration prototype code is at
https://github.com/ned14/boost.spinlock/blob/expected_future/test_constexpr_
future_promise2.cpp for anyone interested. It needs a C++ 14 compiler, so a
minimum of libstdc++ 4.9 and either GCC 4.9 or clang 3.4 or better.
Comments are welcome.
Niall