On 18 Mar 2015 at 22:07, Giovanni Piero Deretta wrote:
Regarding the 'atomic option', I was actually thinking of removing it, instead switching to specifying the memory order in get()/set_{value,exception}.
That's a good idea. Default heuristics for avoiding atomic are necessarily going to have to be too conservative.
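To make that concrete, here is a minimal sketch of what passing the memory order at the call site could look like. The class `one_shot_slot` and its member names are hypothetical, invented for illustration; this is not the interface of the library under discussion.

```cpp
#include <atomic>
#include <optional>
#include <thread>
#include <utility>

// Hypothetical one-shot slot: the caller, not a heuristic, picks the ordering.
template <class T>
class one_shot_slot {
    std::optional<T> value_;
    std::atomic<bool> ready_{false};

public:
    // Producer side. 'order' must be valid for a store:
    // relaxed, release or seq_cst.
    void set_value(T v, std::memory_order order = std::memory_order_release) {
        value_.emplace(std::move(v));
        ready_.store(true, order);
    }

    // 'order' must be valid for a load: relaxed, consume, acquire or seq_cst.
    bool is_ready(std::memory_order order = std::memory_order_acquire) const {
        return ready_.load(order);
    }

    // Consumer side. Spins until the value is published; a real
    // implementation would block on an event instead.
    T& get(std::memory_order order = std::memory_order_acquire) {
        while (!ready_.load(order)) std::this_thread::yield();
        return *value_;
    }
};
```

A caller that knows the slot never crosses threads can pass `std::memory_order_relaxed` on both sides, while cross-thread use keeps the release/acquire defaults. Relaxed ordering across threads would not publish `value_` safely, so the choice is the caller's responsibility.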
I completely removed the current then/next implementation in favor of a generic then that works with any waitable. This makes it explicit that you can compose future types (and in fact any waitable) as long as they provide a get_event.
I removed 'next' completely to simplify the implementation. The optimization opportunity to save an allocation on the except case is probably not worth the complexity.
I think the ship has sailed on this at WG21. The present Concurrency TS delights no one, but it is the product of many years of negotiation and compromise. What you see now is highly likely to persist unless you can demonstrate a showstopping technical reason why not (e.g. like std::async).
4. In some contexts a thread-unsafe future/event that saves the cost of atomic operations would be useful (unless it turns out that this cost is negligible on modern CPUs, which I don't think is the case). However, it would be important for the thread-safe and thread-unsafe futures/events to interoperate in a reasonable way. (I'm not sure exactly what reasonable semantics would be.) Obviously there would have to be some caveats, but we would want to avoid another future-island situation.
The required RMWs are definitely non-negligible. I tried to keep them to a minimum, but they still have a cost. My original future implementation actually had a thread-safety switch, but it would be terribly error-prone.
The cost is due to the lost optimisation opportunities, not the hardware. Atomic operations, even on weakly ordered systems, are not expensive when no other CPU core has visibility of that cache line.
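A rough way to see the optimisation side of this is to compare a plain counter with an atomic one that only a single thread ever touches. The sketch below is an illustration under assumptions: the function names are made up here, and the exact behaviour depends on the compiler, though current mainstream compilers generally do not merge or elide atomic read-modify-write operations.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Plain counter: the optimiser is free to collapse the loop into a single add.
std::uint64_t count_plain(std::uint64_t n) {
    std::uint64_t c = 0;
    for (std::uint64_t i = 0; i < n; ++i) ++c;
    return c;
}

// Atomic counter used by one thread only. The cache line is never shared,
// yet each fetch_add is typically emitted as a separate RMW instruction.
std::uint64_t count_atomic(std::uint64_t n) {
    std::atomic<std::uint64_t> c{0};
    for (std::uint64_t i = 0; i < n; ++i)
        c.fetch_add(1, std::memory_order_relaxed);
    return c.load(std::memory_order_relaxed);
}

// Hypothetical helper: wall-clock time of one call to f, in nanoseconds.
template <class F>
std::int64_t time_ns(F f) {
    auto t0 = std::chrono::steady_clock::now();
    volatile auto sink = f();  // keep the result observable
    (void)sink;
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}
```

Timing both functions with `time_ns` at a large `n` and an optimised build gives a feel for how much of the difference comes from the compiler rather than from cache-line traffic. No figures are given here because they vary by compiler and CPU.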
Instead I'm studying a way for future to start as simply a deferred synchronous computation, but that can become asynchronous on request of another thread (via work stealing or work requesting for example).
I think that has the maximum chance of being useful to functional programming frameworks too.
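For readers following along, one possible shape of such a deferred-but-stealable computation is sketched below. The type `stealable_task` and its members are hypothetical and are only my reading of the idea, not the design being studied. Exception handling and blocking waits are left out.

```cpp
#include <atomic>
#include <functional>
#include <optional>
#include <thread>
#include <utility>

// Hypothetical sketch: a computation that runs inline in get() unless
// another thread claims it first via try_steal().
template <class T>
class stealable_task {
    enum state : int { pending, running, done };

    std::function<T()> fn_;
    std::optional<T> result_;
    std::atomic<int> state_{pending};

    // Exactly one caller wins the pending -> running transition and runs fn_.
    bool try_run() {
        int expected = pending;
        if (!state_.compare_exchange_strong(expected, running,
                                            std::memory_order_acq_rel))
            return false;
        result_.emplace(fn_());  // note: a throwing fn_ is not handled here
        state_.store(done, std::memory_order_release);
        return true;
    }

public:
    explicit stealable_task(std::function<T()> fn) : fn_(std::move(fn)) {}

    // Called by a worker thread that wants to take the work over.
    bool try_steal() { return try_run(); }

    // Called by the owner: runs the work synchronously if nobody claimed it,
    // otherwise waits for the thief to finish.
    T& get() {
        try_run();
        while (state_.load(std::memory_order_acquire) != done)
            std::this_thread::yield();
        return *result_;
    }
};
```

This sketch still pays one compare-exchange per task even when nothing is stolen, so it shows the control flow rather than a way of avoiding the RMW cost discussed above.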
Benchmarks (and unit tests) would also be particularly helpful.
I have yet to find a good, fair, small and realistic benchmark for futures. Ideas are welcome.
https://github.com/ned14/c11-permit-object/blob/master/pthread_permit_speedtest.cpp might have some ideas for you.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/