Non-allocating constexpr functional continuations-based future-promise
Dear Boost, Given Hartmut's feedback from my previous attempt at a non-allocating future-promise implementation, I now have a second, much more generic demonstration prototype which is based on heterogenous sequences of constexpr functional calls. I think this design may have huge potential. Consider some heterogenous sequence of callables C { f1, f2, f3, f4, ... } such that fN(... f4(f3(f2(f1(Args...))))) -> R i.e. if you call the heterogenous sequence with arguments Args..., f1(args...) will be called and fed to f2() which then is fed to f3() up to fN() whose return type is R. If I understand proposed Boost.Hana correctly, these potentially constexpr sequences of heterogeneous callables are the main bridge between pure compile-time heterogeneous type sequences and doing something with them in real code as they can decay on demand into actual code when needed. And for proposed Boost.Expected, its monadic programming sequences bind(), map() and then() all functionally operate on some expected<T, E>, consuming an instance and returning new instances, so these monadic sequences are also potentially constexpr sequences of heterogenous callables. Now, what new basic_future-basic_promise will do is to allow you to insert a suspend-resume of execution into some functional sequence C, so essentially: split<idx>(C) => [ basic_promise<X>, basic_future<Y> ] ... where X is some split front part of C and Y is some back part of C. You can now execute the first part of C via the call operator: basic_promise<X>(Args...) Execution of the C0...Cn making up X occurs, but instead of continuing into Y it stores the result of Cn and makes that available to the corresponding basic_future. At some later point you can resume the execution of C: basic_future<Y>() -> R ... and thus complete the execution of the functional sequence C, returning the same value R as invoking the original functional sequence C with the same input parameters. For those who prefer to understand theory via code, here is some example C++ 14 code: // Let this be an arbitrarily long sequence of functional callables auto ct=make_continuations([](auto x){return 5+x;}, [](auto x){ return x+6;}); { // Make a future promise pair for if we called c(int) auto fp=ct.suspend_resume<1, int>(); // Execute promise fp.first(1); // Execute future b=std::move(fp.second)(); } { // Make a future promise pair for if we called c(double) auto fp=ct.suspend_resume<1, double>(); // Execute promise fp.first(1.0); // Execute future d=std::move(fp.second)(); } // b=12, d=12.0 There are a multitude of uses for this very fundamental pattern. Let us try implementing something closely resembling the Concurrency TS's dynamic continuations support .then(launch, callable). Let us define this functional continuation: template<class T> struct then_continuation { std::vector<std::function<void(T)>> _conts; template<class U> auto then(U &&c) { typedef decltype(c(std::declval<T>())) result_type; promise<result_type> p; future<result_type> ret(p.get_future()); // Unfortunately libstdc++'s std::function currently always tries to copy lambda types //_conts.emplace_back([p=std::move(p), c=std::forward<U>(c)](T v){p.set_value(c(v));}); _conts.push_back([p = std::make_shared<promise<result_type>>(std::move(p)), c=std::forward<U>(c)](T v){p->set_value(c(v));}); return ret; } BOOST_CONSTEXPR T operator()(T v) const { for(auto &i : _conts) i(v); return v; } }; All this continuation does is provide a .then(callable) which stores those callables into a std::vector and its operator() executes the previously stored callables. Now let us define a std::promise lookalike, with its main deficiencies being a total lack of exception_ptr support (one would use expected<T, E> as the transport type for that) and of course any thread safety: template<class T> class promise : public basic_promise<detail::then_continuation<T>, T, detail::then_continuation<T>> ... basic_promise takes three template parameters, the type of the front split of the continuations, the transported type between the two halves of continuations, and the type of the back split of the continuations. Here, we are defining promise as transporting a type T, and two then_continuation continuations exist before and after the transport of the type T. To our future<T> lookalike we now add this: template<class U> auto then(launch policy, U &&c) { if(policy==launch::deferred) return Base::_after.then(std::forward<U>(c)); else if(Base::_other) return Base::_other->_before.then(std::forward<U>(c)); } So, if the .then() dynamic continuation is deferred, push it into the latter then_continuation, else push it into the former. And voila, you can now do this very similar to Concurrency TS code: promise<int> p; future<int> f(p.get_future()); future<double> f2(f.then(launch(), [](int a){return a+1.0;})); p.set_value(5); return f2.get(); And you get back 6.0, as expected. However here is where things get neato. How many instructions do you think GCC 4.9 generates for this sequence? promise<int> p; future<int> f(p.get_future()); p.set_value(5); return f.get(); In traditional future-promise, that is an operation perhaps consuming thousands of CPU cycles and at least one memory allocation and deallocation. Well, this is the assembler produced by the above by GCC 4.9: _Z19test_futurepromise4v: movq $0, -88(%rsp) movq $0, -80(%rsp) movl $5, %eax movq $0, -72(%rsp) movq $0, -48(%rsp) movq $0, -40(%rsp) movq $0, -32(%rsp) movl $5, -56(%rsp) movq $0, -64(%rsp) movq $0, -24(%rsp) ret Believe it or not, this is actually bad output - there is a bug in GCC's optimiser and it should in fact be outputting a single instruction rather than writing zeros into the stack for no purpose - it seems to get confused by the double declation of a std::vector which is never used, because having just one std::vector produces the ideal output but having two produces the above. However clang 3.4's optimiser does far worse, and a bug there with how non-immediately defined types are treated causes complete expansion of code in its entirety, even if you remove all std::vector's (supplying any type not immediately declared in the same function breaks the optimiser for some unknown reason. I haven't tried clang 3.5 yet, it may be fixed). Other very interesting policy continuations which can be statically mixed into custom future promise types include calling some application specific callback without needing to indirect via a pointer to a function. This lets you combine future promise with invoking callbacks, useful for legacy code. The idea behind this suite of future-promise fundamental parts is that they can be mixed at will into bespoke future-ish promise-ish application specific future promise types. Because they all share the same foundations, a when_any() or when_all() can accept heterogenous future types - and I don't mean future<int> with future<double>, rather I mean afio::future<> with qt::future<int>, that sort of heterogenity. Another very interesting idea is combining these with signals and slots, so via custom continuations you can keep slots and fire signals in sync with future promises. This design I think shows a lot of potential. Anyway, I feel very happy with this design, and something along these lines is where I shall proceed next. The eventual intention is for these future promises to become the default Boost.Thread future promises when C++11 is available, though this will be up to a year from now. I won't be able to support future-promise custom allocators, but I certainly should be able to completely avoid memory allocation and sometimes the generation of any code at all with this design. The demonstration prototype code is at https://github.com/ned14/boost.spinlock/blob/expected_future/test_constexpr_ future_promise2.cpp for anyone interested. It needs a C++ 14 compiler, so a minimum of libstdc++ 4.9 and either GCC 4.9 or clang 3.4 or better. Comments are welcome. Niall
participants (1)
-
Niall Douglas