On 17 Mar 2015 at 21:42, Giovanni Piero Deretta wrote:
I just wanted to share my attempt at a solution to the problem. The code can be found at https://github.com/gpderetta/libtask/blob/master/future.hpp (other interesting files are shared_future.hpp, event.hpp and tests/future_test.cpp). It is an implementation of a subset of the current boost and std future interface. In particular it has promise, future, shared_future, future::then, wait_any and wait_all. The most important missing piece is timed waits (for lack of, ahem, time, but they should be easy to implement). The implementation requires C++14 and is only lightly tested; it should be treated as a proof of concept, not production-ready code.
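(For context, usage of that std-style subset presumably looks roughly like the self-contained sketch below. It is written against std::future so it compiles as-is; the exact libtask signatures for then, wait_any and wait_all are assumptions on my part.)

// Sketch of the std-style subset described above, written against std::future
// so it stands alone; the libtask names may differ in detail.
#include <future>
#include <iostream>
#include <thread>

int main()
{
    std::promise<int> p;
    std::future<int> f = p.get_future();

    std::thread producer([&p] { p.set_value(42); });

    // With the libtask interface one could presumably also write, roughly:
    //   auto g = std::move(f).then([](auto v) { return v.get() + 1; });
    //   wait_any(waiter, g, other_future);
    // but those exact signatures are guesses on my part.

    std::cout << f.get() << '\n';   // blocks via the default wait strategy
    producer.join();
}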
Firstly, thanks for your input, especially input supported with code.
* The wait strategy is not part of the future implementation (although a default is provided). future::get, future::wait, wait_all and wait_any are parametrized by the wait strategy.
* The design aims to streamline future and promise and make them as fast as possible, at the cost of a slower shared_future (although there is room for improvement).
Your future still allocates memory, and therefore costs about 1,000 CPU cycles. That is not "as fast as possible". The use of std::atomic also prevents the compiler optimiser from eliding the future implementation entirely, because an atomic always forces code to be generated. As much as futures today are big, heavy things, tomorrow's C++17 futures, especially resumable-function-integrated ones, must be completely elidable, so that the compiler can eliminate and/or collapse a resumable function call, or a sequence of such calls, where appropriate. If you have an atomic in there, it has no choice but to issue the atomic operations needlessly. I think memory allocation is unavoidable for shared_future, or at least for anything realistically close to the STL design of one. But a future, I think, can be STL-compliant, never allocate memory, and be optimisable out of existence.
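To illustrate the atomic point with a deliberately trivial example of my own (not Giovanni's code): a plain value can be constant-folded away entirely, whereas in practice an atomic forces real loads and stores to be emitted, so a future whose shared state sits behind an atomic cannot be collapsed to nothing.

#include <atomic>

// The compiler is free to constant-fold this whole function down to
// "return 42": the temporary is a plain object with no observable effects.
int plain_roundtrip()
{
    int value = 0;
    value = 42;       // ordinary store
    return value;     // ordinary load
}

// In practice compilers currently emit real atomic stores and loads here,
// so this tiny "future-like" round trip is not collapsed out of existence
// the way the plain version is.
int atomic_roundtrip()
{
    std::atomic<int> value{0};
    value.store(42, std::memory_order_relaxed);
    return value.load(std::memory_order_relaxed);
}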
* The wait strategy only deals with 'event' objects, which act as a bridge between the future and the promise.
The event object is really the core of my contribution; it can be thought of as the essence of future<void>::then, or alternatively it can be seen as a pure user-space synchronization primitive.
Exactly as my C11 permit object is. Except mine allows C code and C++ code to interoperate and compose waits together.
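To make the shared concept concrete, a one-shot event of this kind boils down to roughly the following sketch. This is my reconstruction of the idea, not the interface in event.hpp (nor the permit API); the names and signatures are assumptions.

#include <atomic>
#include <cstdint>

// A one-shot, user-space event: roughly the essence of future<void>::then.
// Sketch of the concept only; the continuation is a bare function pointer
// plus context, so the event itself never allocates.
class one_shot_event
{
    struct continuation { void (*fn)(void*); void* ctx; };

    static constexpr std::uintptr_t signaled = 1;  // producer arrived first
    std::atomic<std::uintptr_t> state{0};          // 0 = empty
    continuation cont{};                           // storage for the parked continuation

public:
    // Consumer side: run (fn, ctx) once the event fires; runs immediately
    // if the producer has already signalled.
    void then(void (*fn)(void*), void* ctx)
    {
        cont = {fn, ctx};
        std::uintptr_t expected = 0;
        if(!state.compare_exchange_strong(expected,
              reinterpret_cast<std::uintptr_t>(&cont),
              std::memory_order_acq_rel))
            fn(ctx);  // producer got here first
    }

    // Producer side: fire the event, running any parked continuation.
    void signal()
    {
        std::uintptr_t prev = state.exchange(signaled, std::memory_order_acq_rel);
        if(prev != 0 && prev != signaled)
        {
            auto* c = reinterpret_cast<continuation*>(prev);
            c->fn(c->ctx);
        }
    }
};

A waiter in this scheme is then just a particular continuation, e.g. one which pokes a condition variable, a futex, an eventfd or a scheduler, which is presumably how future::get, wait_all and wait_any end up parametrised by the wait strategy.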
Other features of this implementation:
* Other than in the blocking strategy itself, the future and promise implementations have no sources of blocking (no mutexes, not even spin locks).
* The shared state is not reference counted.
To demonstrate the generality of the waiting concept, I have implemented a few waiter objects.
* cv_waiter: this is simply a waiter on top of a std::mutex + condition variable.
* futex_waiter (Linux): an atomic counter + a futex; possibly more efficient than the cv_waiter.
* sem_waiter (POSIX): a waiter implemented on top of a POSIX semaphore. More portable than the futex_waiter and possibly more efficient than the cv_waiter.
* fd_waiter (Linux, possibly POSIX): a waiter implemented on top of Linux eventfd (for portability it can also be implemented on top of a pipe); you can use select with futures!
* task_waiter: a completely userspace, coroutine-based waiter, which switches to another coroutine on wait and resumes the original coroutine on signal.
* scheduler_waiter: another coroutine-based waiter, but on top of a userspace task scheduler: on wait it switches to the next ready task, and on signal it enqueues the waiting task at the back of the ready queue.
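To show how little a waiter needs to provide, here is a rough sketch of something like the cv_waiter described above, reconstructed purely from that description (the real interface in the repository may well differ); each of the other waiters supplies the same signal/wait pair over a different primitive.

#include <condition_variable>
#include <mutex>

// Rough sketch of a condition-variable based waiter in the spirit of the
// cv_waiter above; reconstructed from the description, not the repository code.
class cv_waiter
{
    std::mutex m;
    std::condition_variable cv;
    bool signaled = false;

public:
    // Called (via the event) from the promise/producer side.
    void signal()
    {
        { std::lock_guard<std::mutex> g(m); signaled = true; }
        cv.notify_one();
    }

    // Called from future::wait / future::get on the consumer side.
    void wait()
    {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this]{ return signaled; });
    }

    // Re-arm so the same waiter can be reused for the next wait.
    void reset()
    {
        std::lock_guard<std::mutex> g(m);
        signaled = false;
    }
};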
I know that there is a plan for a new version of boost::thread; hopefully this implementation can contribute a few ideas.
My current best understanding of Vicente's plans is that each thread has a thread-local condvar. The sole cause of a thread sleeping, apart from i/o, is on that thread-local condvar. One therefore has a runtime which keeps a registry of all the thread-local condvars, and can deduce the correct condvars to wake when implementing a unified wait system that is also compatible with Fibers/resumable functions. (A very rough sketch of such a per-thread condvar registry follows at the end of this mail.) I haven't played yet with the proposed Boost.Fiber (I will this summer, after C++Now), but I expect it uses a similar runtime. Ideally I'd like Thread v5's runtime to be equal to the Fiber runtime if this is true, but as I mentioned I haven't played with it yet.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
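Purely as an illustration of that idea, and very much my assumption about the design rather than anything Vicente has written down, the per-thread condvar registry might look roughly like this:

#include <condition_variable>
#include <mutex>
#include <unordered_set>

// Sketch only: each thread owns exactly one condvar it ever sleeps on, and a
// global registry records every live per-thread entry so a unified wait
// system can decide precisely which sleeping threads to wake.
struct thread_entry
{
    std::mutex m;
    std::condition_variable cv;
    bool wakeup = false;
};

namespace
{
    std::mutex registry_lock;
    std::unordered_set<thread_entry*> registry;

    // Registered on first use, unregistered when the thread exits.
    thread_entry& this_thread_entry()
    {
        thread_local struct holder
        {
            thread_entry entry;
            holder()  { std::lock_guard<std::mutex> g(registry_lock); registry.insert(&entry); }
            ~holder() { std::lock_guard<std::mutex> g(registry_lock); registry.erase(&entry); }
        } h;
        return h.entry;
    }
}

// The only way (apart from i/o) this design ever puts a thread to sleep.
void sleep_until_woken()
{
    thread_entry& e = this_thread_entry();
    std::unique_lock<std::mutex> lock(e.m);
    e.cv.wait(lock, [&]{ return e.wakeup; });
    e.wakeup = false;
}

// A waker (e.g. a promise being fulfilled) targets a specific thread's entry.
void wake(thread_entry& e)
{
    { std::lock_guard<std::mutex> g(e.m); e.wakeup = true; }
    e.cv.notify_one();
}

// A broadcast (e.g. at shutdown) walks the registry and wakes everyone.
void wake_all()
{
    std::lock_guard<std::mutex> g(registry_lock);
    for(thread_entry* e : registry) wake(*e);
}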