On 19 Mar 2015 11:09, "Niall Douglas"
On 17 Mar 2015 at 21:42, Giovanni Piero Deretta wrote:
* The wait strategy is not part of the future implementation (although a default is provided). future::get, future::wait, wait_all and wait_any are parametrized by the wait strategy.
* The design aims to make future and promise as streamlined and as fast as possible, at the cost of a slower shared_future (although there is room for improvement).
Your future still allocates memory, and is therefore costing about 1000 CPU cycles.
1000 clock cycles seems excessive with a good malloc implementation. Anyway, the plan is to add support for a custom allocator. I do not think you can realistically have a non-allocating future *in the general case* (you might optimise some cases, of course).
That is not "as fast as possible". The use of std::atomic also prevents the compiler optimiser from eliding the future implementation entirely, because an atomic always forces code to be generated. As much as futures today are big heavy things, tomorrow's C++17 futures, especially resumable-function-integrated ones, must be completely elidable, so the compiler can completely eliminate and/or collapse a resumable function call, or a sequence of such calls, where appropriate. If you have an atomic in there, it has no choice but to issue the atomic operations needlessly.
I understand what you are aiming at, but I think that elidability is orthogonal. Right now I'm focusing on making the actual synchronisation fast and composable in the scenario where the program has committed to making a computation async.
I think memory allocation is unavoidable for shared_future, or at least any realistically close to the STL implementation of one. But a future I think can be both STL compliant and never allocate memory and be optimisable out of existence.
* The wait strategy only deals with 'event' objects, which act as a bridge between the future and the promise.
The event object is really the core of my contribution; it can be thought of as the essence of future<void>::then, or alternatively as a pure userspace synchronisation primitive.
Exactly as my C11 permit object is. Except mine allows C code and C++ code to interoperate and compose waits together.
Not at all. I admit I have not studied permit in detail (the doc size is pretty daunting), but as far as I can tell the waiting thread will block in the kernel. It provides a variety of ways to block, and the user can't add more. With event, the decision of how to block (or even whether to block at all) is made by the consumer and, most importantly, it can be delayed till the last moment. You can think of event as the bridge between the signal side and the wait side of permit, or alternatively as a future<void> which only supports then (no get or wait).

Regarding interfacing with a C event source, it can be done trivially right now as long as the source provides a callback (function pointer + context pointer) API, which is very common already. There is no need to wait for existing libraries to embrace a new synchronisation object (cue the xkcd comic about standards :) ). Allowing C code to wait for events would require replacing the waiter vtable with a C equivalent, but that is not something I care much about right now.
Other features of this implementation:
* Other than in the blocking strategy itself, the future and promise implementation have no sources of blocking (no mutexes, not even spin locks).
* The shared state is not reference counted.
To demonstrate the generality of the waiting concept, I have implemented a few waiter objects.
* cv_waiter: this is simply a waiter on top of an std::mutex + condition variable.
* futex_waiter (linux): an atomic counter + a futex, possibly more efficient than the cv_waiter
* sem_waiter (posix): a waiter implemented on top of a posix semaphore. More portable than the futex waiter and possibly more efficient than the cv_waiter
* fd_waiter (linux, possibly posix): a waiter implemented on top of linux eventfd (for portability it can also be implemented on top of a pipe); you can use select with futures!
* task_waiter: a completely userspace-based coroutine waiter which switches to another coroutine on wait and resumes the original coroutine on signal.
* scheduler_waiter: another coroutine-based waiter, but on top of a userspace task scheduler. On wait it switches to the next ready task; on signal it enqueues the waiting task at the back of the ready queue.
I know there is a plan to implement a new version of boost::threads; hopefully this implementation can contribute a few ideas.
My current best understanding of Vicente's plans is that each thread has a thread-local condvar. The sole cause of a thread sleeping, apart from i/o, is that thread-local condvar. One therefore has a runtime which keeps a registry of all the thread-local condvars, and can deduce the correct ones to wake when implementing a unified wait system that is also compatible with Fibers/resumable functions.
That doesn't work if a program wants, for example, to block in select, spin on a memory location or a hardware register, wait for a signal, or interoperate with a different userspace thread library or some other event queue (asio, qt or whatever), and still also wait for a future. Well, you can use future::then, but it has overhead. -- gpd