[coroutine][review] Very Late Review

Hi all, This review is way after the deadline, so feel free to ignore my vote, but I hope that the feedback will be useful anyway. First of all, I vote to accept the library with the current interface. I see that Oliver agreed to significant interface changes which I hope these will be subject to a minireview before final acceptance. - What is your evaluation of the documentation? Adequate for someone that already has some knowledge of coroutines and generators, but it lacks an introduction on the topic. Like other reviewers I also suggest that some of the more interesting examples should be presented and discussed in the documentation. Also, a few topics should be discussed more in details: * Why thread local storage cannot be referenced in threaded applications with a brief descritption of what is, IMHO, a GCC TLS bug. * What is the exact meaning of the preserve FPU flags (which, as far as I understand, it really makes it optional to preserve the FPU context), not the FPU registers themselves - What is your evaluation of the potential usefulness of the library? Very useful. This library is long due in Boost. - Did you try to use the library? With what compiler? Did you have any problems? No, I did not try. - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? In depth study. - Are you knowledgeable about the problem domain? yes, I have implemented similar functionality before. - What is your evaluation of the design and implementation? The interface as proposed for review is very close to my original boost.coroutine interface and as such I of course think is adequate. I think that the interface can be improved (which is one of the reasons that i never submitted my library for review). Since the beginning of the review period, a lot of interface changes have been proposed, some of which I like, some I don't. About the design: I do not believe that the coroutine should be parametrized on do_unwind (insert something about not parametrising on behaviour here :) ). The behaviour of boost.coroutine should be explicit from the type and one shouldn't have suprises when unwinding a coroutine. Please chose to either assert or explicit unwind on destruction. I prefer the former, as per std::function, but would prefer the latter to parametrization. I also dislike the preserve_fpu flag, but I understand what it is there for and could live with it. Still it should be harder to misuse: make it an enum, and move it last in the constructor parementer list, so that specifying a stateful allocator (BTW, I assume these are supported), shouldn't require to specify this flag. I like the idea that has been proposed during the review that the result of a coroutine should be retrivable at any time via a get method. I dislike the proposed optional<T> result for operator(). If the previous delayed result functionality is implemented, operator() should just return *this, so that the subsequent get() call can be chained or the result tested for non empty. This change has the potential of deeply affecting the interface, so I would really like to see it subject to a mini review before acceptance. The coroutine object has both the concept of a not-a-coroutine and of a completed coroutine. This is confusing, and should be conflated to 'empty' state, analogous to an empty std::function. In addition to simplifying the documentation and the interface, it should also slightly improve the implementation. Operator bool should return !empty as per std::function. It has been suggested that coroutine::self be removed an have a thread specific object. I strongly disagree with this option. First of all it will make it too easy to break type safety: typedef this_coroutine<int(int)> this_coro; typedef this_coroutine<void()> that_coro; coroutine<int(int)>([]{ that_coro::yield(); // oops compiles fine, UB at runtime. }); Of course you can find ways to break even code with the explicit self syntax, but the above is, IMHO, just too dangerous. In practice, one need to pass the signature to any code that wants to yield anyway, so might as well pass the self object around, at least one can rely on template type deduction. Also, one does not really want arbitrary code to yield out of the coroutine as this can be dangerous (think of code between yield statements as a sort of critical section) as invariants can be broken or mutices held. Instead I suggest doing away with a separate self type and instead making the first parameter to the coroutine functor a full fledged coroutine, with simmetric signature: coroutine<int(void)> my_coroutine([](coroutine<void(int)> caller) { caller(10); }); int i = my_coroutine(); This means that the parameter no longer represent the coroutine itself, but, more logically, the caller coroutine. This makes very obvious the analogy between coroutines and channels. It also blurs the division between symmetric and asymmetric coroutines. Bonus points if you had a function that infers the coroutine singnature from the coroutine functor operator() (I like to call this function callcc for obvious reasons [ http://en.wikipedia.org/wiki/Call-with-current-continuation]): auto my_coroutine = callcc([](coroutine<void(int)> caller) { ... } ); This would obviously work only if the functor is monomorphic, but it interacts very nicely with C++11 lambdas. I do not like the separation between coroutines and generators. Instead the generator class should be dropped. The reason that I had generator as a separate class in the original coroutine SoC is because I wanted to give it iterator semantics (this also required internally using a reference counted pointer to make the iterator copiable, which I disliked). Instead I propose to make every coroutine<T()> fulfill the the input range concept and every coroutine<void(T)> fulfill the output range concept (this is a functionality not covered by generators). In practice it means that for these classes of coroutines, begin and end are defined returning appropriate iterator (i.e. some very thin layers around the coroutine itself). As a range is required to survive its iterator anyway, the iterators can simply use plain pointers to the coroutine. The last suggestion works very well if coupled with the suggestion of making 'self' a full coroutine: std::vector<int> a = { ... }; std::vector<int> b = { ... }; std::cout << "The result of 'merge(a,b) is:\n"; for(auto x : callcc([&](coroutine<void(int)> c) { std::merge(a.begin(), a.end(), b.begin(), b.end(), c.begin()); }) { std::cout << x << "\n"; } I thought modelling generalized coroutines<T(T)> as ranges, but I have yet to find a good model. Enough abot the interface; I have some more comments about the Implementation: The code is readable (it helps if you are familiar with the topic) even if, as other reviewers have noted, is almost free of comments. The nest of base classes required to handle all possible coroutine signatures is quite overwhelming though and I think it could be significantly reduced. I was able to trace through the execution paths I was interested into, but it required quite a bit of back and forward between implementation files. The coroutine object itself holds a reference counted pointer to the implementation. As the coroutine is movable but not copyable, I do not think that reference counting is needed. The coroutine itself should consists of two raw pointers: a pointer to the target context and a pointer to the result slot. You need this pointer anyway, because, if I understand correctly, you are planning to allow retriving the result slot at any time after the operator() and yield call. I think that the implementaiton is not as eficient as possible. I strongly believe that a coroutine call should be as fast as a normal functiona call as possible. As such it is important to optimize operator() and yield as much as possible. Ideally, no code should be executed other than calling context::jump_fcontext. I will detail below how such an implementation is possible: Operator() needs to check at every call whether the coroutine has started and decide whether to call impl->start or impl->resume. This is a very predictable branch, but every instruction counts, and as a minimum the branch will increase instruction cache pressure and decoding bandwith. Similarly, there is a branch after the internal resume call to detect whether the coroutine has terminated and another one to check whether exceptions should be thrown. The first branch can be removed by moving the complexity to the constructor: a redy-to-start context should be indistinguishable from a suspended context. The branch on is_complete can be removed by making sure that a resume on on-complete also pass the pointer to the return slot via the native_resume result. The check for pending exceptions can be removed if context switches that cause exceptions to be thrown on the target context are modified to create an artificial stack frame on top of the called stack that runs a function that throws the exception only when necessary. This would require a small (but useful) addition to boost.context interface. The flag manipulation can be removed by completely removing the flags. The only state you really need to know is whether the coroutine is empty (which can be signaled by having a null pointer to context) and whether any result is available (that can be signaled by having a null pointer to result slot). Both pointers are updated by each call to resume(). As the internal context implementation only allows passing a single value, this would require an additional level of indirection. The indirection can be removed by changing the context jump_fcontext api to return a structure of two pointers: the existing parameter plus an additional pointer to the calling context. As most ABI allow to return small structures in registers, this can be implemented quite eficiently. Another source of ineficiency and indirection is the constructor, which while not as critical as task switching itself, it should still be fast. To construct a coroutine two dynamic allocations are currently needed, one to allocate the control block (i.e. the derived class from context base), one to allocate the stack. I suggest completely removing the control block, instead placing the allocator and the coroutine functor itself directly in the stack frame of the trampoline. To do this, after allocating the stack, the constructor switches to the target context trampoline and passes to it apointer to the functor and allocator pairs from constructor stack, which are moved to local variables. This will also implicitly type erase both the allocator and the functor. Similarly, there is noneed to store the two caller and calle fcontextes, which can be mantained as local variables inside inside coroutine::operator(), and yield respectively. The previously described additional parameter returned by impl->resume allows to eficiently retrieve the other context. These stack allocations are not unlike what is aready done for result slot which is located inside yield or operator(). As there is no control block any more, cleanup can no longer be done inside its destructor. Instead, after the context has unwound, the trampoline switches to the calling context stack and executes a cleanup function on top of it (this is the same functionality need to implement the exception throwing as described before). This cleanup function can safely destroy first the coroutine functor and then deallocate the stack. The cleanup function then returns null for both the calling context pointer and result slot, signaling that the coroutine has exited. I have a sketch implementation of most of the suggestions above here https://github.com/gpderetta/Experiments/blob/scheduler/switch.hpp(examples here https://github.com/gpderetta/Experiments/blob/scheduler/switch_test.cc). That's all so far; I'll try to come with further comments; I'm sorry for the wall of text, I wrote it in one go and didn't have time to cut it down. Good luck with the review! -- gpd

Am 17.09.2012 18:24, schrieb Giovanni Piero Deretta:
I like the idea that has been proposed during the review that the result of a coroutine should be retrivable at any time via a get method.I dislike the proposed optional<T> result for operator(). If the previous delayed result functionality is implemented, operator() should just return *this, so that the subsequent get() call can be chained or the result tested for non empty. This change has the potential of deeply affecting the interface, so I would really like to see it subject to a mini review before acceptance. I would not add a coroutine<>::get() method because a coroutine is a 'special function' (in the sense of multiple entry/leave). AFAIK boost::function also does not have a get() method.
The coroutine object has both the concept of a not-a-coroutine and of a completed coroutine. This is confusing, and should be conflated to 'empty' state, analogous to an empty std::function. In addition to simplifying the documentation and the interface, it should also slightly improve the implementation. Operator bool should return !empty as per std::function. OK
It has been suggested that coroutine::self be removed an have a thread specific object. I strongly disagree with this option. First of all it will make it too easy to break type safety:
typedef this_coroutine<int(int)> this_coro;
typedef this_coroutine<void()> that_coro;
coroutine<int(int)>([]{ that_coro::yield(); // oops compiles fine, UB at runtime. }); that_coro::yield() could assert if the internal 'control block' (coroutine_base) is not that of the current coroutine - of course at runtime.
In practice, one need to pass the signature to any code that wants to yield anyway, so might as well pass the self object around, at least one can rely on template type deduction. some of the community members are not happy with tracking the self object trough the code
Also, one does not really want arbitrary code to yield out of the coroutine as this can be dangerous (think of code between yield statements as a sort of critical section) as invariants can be broken or mutices held. you can not prevent user doing such stuff, IMHO
Instead I suggest doing away with a separate self type and instead making the first parameter to the coroutine functor a full fledged coroutine, with simmetric signature:
coroutine<int(void)> my_coroutine([](coroutine<void(int)> caller) { caller(10); });
int i = my_coroutine(); I'm sure that users asking for a typedef in coroutine<int(void)> for the first parameter of the coroutine function
typedef coroutine< int(void) > coro_t; typedef coro_t::caller_t caller_t; coro_t my_coroutine( []( caller_t caller) { caller( 10); }); int i = my_coroutine(); in the case of the reviewed library the coro_t::caller_t from the example above is equivalent to coro_t::self_t provided by the lib. The coro_t::self_t type is not a instance of the original coroutine, it is a new instance of coroutine_self<> (not derived from a coroutine's base classes). If I would rename coroutine_self<>::yield() to coroutine_self<>::operator() I get pretty much the same as you described in your example. Of course self_t is not a coroutine but it provides 'coroutine' - semantics (== context jumps).
I do not like the separation between coroutines and generators. Instead the generator class should be dropped. I disagree because a coroutine<> is a 'specialized' routine and a generator<> provides a sequence of elements. the requirement of coroutine<> is that its coroutine-fn must have return value (a last return statement in its body) and a generator<> requires that the generator-fn returns always void (return values passed via self_t::yield()).
The reason that I had generator as a separate class in the original coroutine SoC is because I wanted to give it iterator semantics (this also required internally using a reference counted pointer to make the iterator copiable, which I disliked). a requirement of iterators is that its source/container is still valid during its live-time. why not require the same for generators? the iterators of generator<> take only a reference to the associated generator object.
Instead I propose to make every coroutine<T()> fulfill the the input range concept and every coroutine<void(T)> fulfill the output range concept (this is a functionality not covered by generators). In practice it means that for these classes of coroutines, begin and end are defined returning appropriate iterator (i.e. some very thin layers around the coroutine itself). As a range is required to survive its iterator anyway, the iterators can simply use plain pointers to the coroutine.
The last suggestion works very well if coupled with the suggestion of making 'self' a full coroutine:
std::vector<int> a = { ... }; std::vector<int> b = { ... };
std::cout << "The result of 'merge(a,b) is:\n"; for(auto x : callcc([&](coroutine<void(int)> c) { std::merge(a.begin(), a.end(), b.begin(), b.end(), c.begin()); }) { std::cout << x << "\n"; }
I thought modelling generalized coroutines<T(T)> as ranges, but I have yet to find a good model. I've read the docu of boost.range - the requirements of a single pass range is:
- For any Range |a|, |[boost::begin(a),boost::end(a))| is a valid range, that is, |boost::end(a)| is reachable from |boost::begin(a)| in a finite number of increments. -> might be violated by an infinite loop inside the coroutine-fn - . A Range provides iterators for accessing a half-open range |[first,one_past_last)| of elements and provides information about the number of elements in the Range. -> it is for a coroutine or a generator impossible to determine the number of elements is it correct to apply the range concept to coroutine/generator?
The nest of base classes required to handle all possible coroutine signatures is quite overwhelming though and I think it could be significantly reduced. I was able to trace through the execution paths I was interested into, but it required quite a bit of back and forward between implementation files. I used static polymorphism and curiously recurring template pattern in order to reduce the partial template specialisations (regarding to return type and parameter types of the coroutine-fn).
The coroutine object itself holds a reference counted pointer to the implementation. As the coroutine is movable but not copyable, I do not think that reference counting is needed. The implementation uses a intrusive pointer - it has the footprint (memory) of a raw pointer and the functions used to increment/decrement the ref-counter are inlined - you pay only of incrementing/decrementing an integer. The reference count is required because I creating coroutine_self<> which gets the ref-counted ptr.
The coroutine itself should consists of two raw pointers: a pointer to the target context and a pointer to the result slot. You need this pointer anyway, because, if I understand correctly, you are planning to allow retriving the result slot at any time after the operator() and yield call. no, I don't think not.
The first branch can be removed by moving the complexity to the constructor: a redy-to-start context should be indistinguishable from a suspended context. I was thinking about how to remove the branch in the resume() member function - I'll try to implement your suggestion.
The branch on is_complete can be removed by making sure that a resume on on-complete also pass the pointer to the return slot via the native_resume result. OK
comments? regards, Oliver

On Tue, Sep 18, 2012 at 9:37 PM, Oliver Kowalke <oliver.kowalke@gmx.de>wrote:
Am 17.09.2012 18:24, schrieb Giovanni Piero Deretta:
I like the idea that has been proposed during the review that the result of a coroutine should be retrivable at any time via a get method.I dislike
the proposed optional<T> result for operator(). If the previous delayed result functionality is implemented, operator() should just return *this, so that the subsequent get() call can be chained or the result tested for non empty. This change has the potential of deeply affecting the interface, so I would really like to see it subject to a mini review before acceptance.
I would not add a coroutine<>::get() method because a coroutine is a 'special function' (in the sense of multiple entry/leave). AFAIK boost::function also does not have a get() method.
well, if I read your discussion with Vicente correctly,you are planning to add either a bind or a get method to self to bind parameters. For simmetry the same solution should be used for the coroutine itself, especially if I get to convince you to use the same class template for both the coroutine and the self object.
It has been suggested that coroutine::self be removed an have a thread
specific object. I strongly disagree with this option. First of all it will make it too easy to break type safety:
typedef this_coroutine<int(int)> this_coro;
typedef this_coroutine<void()> that_coro;
coroutine<int(int)>([]{ that_coro::yield(); // oops compiles fine, UB at runtime. });
that_coro::yield() could assert if the internal 'control block' (coroutine_base) is not that of the current coroutine - of course at runtime.
Not sure how would you implement the above. The only way I can think is to keep the signature at runtime (with typeinfo or equivalent) and compare it with the yield signature, but such a comparison might be expensive.
In practice, one need to pass the signature to any code that wants to
yield anyway, so might as well pass the self object around, at least one can rely on template type deduction.
some of the community members are not happy with tracking the self object trough the code
You could make the same argument for every function that takes a callback (for example asio callbacks, or even for_each): instead of making the callback arguments explicit, you may thread them thru a sort-of dynamically scoped variable. Yes, is convenient (this is what Perl sometimes does) but I would argue it is not a good design (again, Perl). The user has always the possiblity of adding this functionality himself.
Also, one does not really want arbitrary code to yield out of the
coroutine as this can be dangerous (think of code between yield statements as a sort of critical section) as invariants can be broken or mutices held.
you can not prevent user doing such stuff, IMHO
Of course you can't but that doesn't mean you have to hand the user aloaded gun :)
Instead I suggest doing away with a separate self type and instead making
the first parameter to the coroutine functor a full fledged coroutine, with simmetric signature:
coroutine<int(void)> my_coroutine([](coroutine<**void(int)> caller) { caller(10); });
int i = my_coroutine();
I'm sure that users asking for a typedef in coroutine<int(void)> for the first parameter of the coroutine function
typedef coroutine< int(void) > coro_t; typedef coro_t::caller_t caller_t;
coro_t my_coroutine( []( caller_t caller) { caller( 10);
});
int i = my_coroutine();
in the case of the reviewed library the coro_t::caller_t from the example above is equivalent to coro_t::self_t provided by the lib. The coro_t::self_t type is not a instance of the original coroutine, it is a new instance of coroutine_self<> (not derived from a coroutine's base classes). If I would rename coroutine_self<>::yield() to coroutine_self<>::operator() I get pretty much the same as you described in your example. Of course self_t is not a coroutine but it provides 'coroutine' - semantics (== context jumps).
I'm strongly arguing that they should be exactly the same type (well, except for the symmetric signature) and have exactly the same behaviour. Removing the distinction between the calle and the caller would actually simplify the implementation and will also be truer to the 'coroutine' name. I do not like the separation between coroutines and generators. Instead the
generator class should be dropped.
I disagree because a coroutine<> is a 'specialized' routine and a
generator<> provides a sequence of elements. the requirement of coroutine<> is that its coroutine-fn must have return value (a last return statement in its body) and a generator<> requires that the generator-fn returns always void (return values passed via self_t::yield()).
I won't argue strongly about this. Iff you add the bind/get parameter to caller_t and unifiy caller_t and coroutine, then you no longer have an useful distinction between the two. Also note that the current design (coroutine + generator) still leaves the output iterator space unfilled.
The reason that I had generator as a
separate class in the original coroutine SoC is because I wanted to give it iterator semantics (this also required internally using a reference counted pointer to make the iterator copiable, which I disliked).
a requirement of iterators is that its source/container is still valid during its live-time. why not require the same for generators? the iterators of generator<> take only a reference to the associated generator object.
Sure, of course.
Instead I propose
to make every coroutine<T()> fulfill the the input range concept and every coroutine<void(T)> fulfill the output range concept (this is a functionality not covered by generators). In practice it means that for these classes of coroutines, begin and end are defined returning appropriate iterator (i.e. some very thin layers around the coroutine itself). As a range is required to survive its iterator anyway, the iterators can simply use plain pointers to the coroutine.
The last suggestion works very well if coupled with the suggestion of making 'self' a full coroutine:
std::vector<int> a = { ... }; std::vector<int> b = { ... };
std::cout << "The result of 'merge(a,b) is:\n"; for(auto x : callcc([&](coroutine<void(int)**> c) { std::merge(a.begin(), a.end(), b.begin(), b.end(), c.begin()); }) { std::cout << x << "\n"; }
I thought modelling generalized coroutines<T(T)> as ranges, but I have yet to find a good model.
I've read the docu of boost.range - the requirements of a single pass range is:
- For any Range |a|, |[boost::begin(a),boost::end(**a))| is a valid range, that is, |boost::end(a)| is reachable from |boost::begin(a)| in a finite number of increments. -> might be violated by an infinite loop inside the coroutine-fn
technically violated, but in practice, because of the halting problem, the two are are indistinguishable.
- . A Range provides iterators for accessing a half-open range |[first,one_past_last)| of elements and provides information about the number of elements in the Range. -> it is for a coroutine or a generator impossible to determine the number of elements
so is for an input range and an output range. What about a range of boost::function_input_iterators? I'll argue that Boost.Range concept is overconstrained. Also Boost.Range requires the range to be copyable. This means that in c++11, std::vector<my_movable_only_type> would not be a range. The requirement should be relaxed to movable.
is it correct to apply the range concept to coroutine/generator?
Yes!
The coroutine object itself holds a reference counted pointer to the
implementation. As the coroutine is movable but not copyable, I do not think that reference counting is needed.
The implementation uses a intrusive pointer - it has the footprint (memory) of a raw pointer and the functions used to increment/decrement the ref-counter are inlined - you pay only of incrementing/decrementing an integer. The reference count is required because I creating coroutine_self<> which gets the ref-counted ptr.
by tweaking the implementation the coroutine_self type and the coroutine wouldn't need to share anything. HTH, -- gpd

Am 19.09.2012 18:18, schrieb Giovanni Piero Deretta:
well, if I read your discussion with Vicente correctly,you are planning to add either a bind or a get method to self to bind parameters. For simmetry the same solution should be used for the coroutine itself, especially if I get to convince you to use the same class template for both the coroutine and the self object.
by tweaking the implementation the coroutine_self type and the coroutine wouldn't need to share anything. HTH,
I'm uncertain if coroutine<> or self_t should be the first param of coroutine-fn. At least I want to implement Vicente's idea of updating the coroutine-fn parameters after each yield(). I believe your suggestion of passing a coroutine<> with 'inverted' signature will not be compatible if the coroutine-fn has more than one parameter. How would the signature of the coroutine (first parameter in fn instead of self_t) look like? maybe the first parameter is a factory? typedef coroutine< int( int, string &) > coro_t; int f( coro_t::self_t & self, int x, string & s) { coroutine< ? > coro = self.bind( x, s); // x == 3, s == "abc" coro( 7); // x == 11, s == "xyz" reutnr -1; } coro_t coro( f); int res = coro( 3, "abc"); res = coro( 11, "xyz"); I want to force the user to call bind() (memebr of some special class?) to provide the address of the parameters and return an object (coroutine<>?) used for yielding the coroutine-fn. Oliver

On Mon, Sep 17, 2012 at 11:24 AM, Giovanni Piero Deretta <gpderetta@gmail.com> wrote:
About the design:
I want to comment on this proposed design but not argue if it should replace what is currently under review.
I like the idea that has been proposed during the review that the result of a coroutine should be retrivable at any time via a get method. I dislike the proposed optional<T> result for operator(). If the previous delayed result functionality is implemented, operator() should just return *this, so that the subsequent get() call can be chained or the result tested for non empty. This change has the potential of deeply affecting the interface, so I would really like to see it subject to a mini review before acceptance.
In this case, get() or operator* (like in your sketch impl) should return T&, not T. Otherwise you get a surprise move: T x = *c; // OK, value has been moved into x. T y = *c; // y contains an empty value
It has been suggested that coroutine::self be removed an have a thread specific object. I strongly disagree with this option. First of all it will make it too easy to break type safety:
typedef this_coroutine<int(int)> this_coro;
typedef this_coroutine<void()> that_coro;
coroutine<int(int)>([]{ that_coro::yield(); // oops compiles fine, UB at runtime. });
This is a good point. I now think I'd prefer type safety here so I agree that using a thread specific object is bad.
Instead I suggest doing away with a separate self type and instead making the first parameter to the coroutine functor a full fledged coroutine, with simmetric signature:
coroutine<int(void)> my_coroutine([](coroutine<void(int)> caller) { caller(10); });
int i = my_coroutine();
I like this.
This means that the parameter no longer represent the coroutine itself, but, more logically, the caller coroutine. This makes very obvious the analogy between coroutines and channels. It also blurs the division between symmetric and asymmetric coroutines. Bonus points if you had a function that infers the coroutine singnature from the coroutine functor operator() (I like to call this function callcc for obvious reasons [ http://en.wikipedia.org/wiki/Call-with-current-continuation]):
auto my_coroutine = callcc([](coroutine<void(int)> caller) { ... } );
I agree that there're similarities with call/cc but I'm not sure it's exactly the same. call/cc returns the "yielded" value, not a coroutine. It only returns a coroutine if the "yield" was bundled with another call to call/cc. So to avoid the confusion, a different name might be best.
I do not like the separation between coroutines and generators. Instead the generator class should be dropped. The reason that I had generator as a separate class in the original coroutine SoC is because I wanted to give it iterator semantics (this also required internally using a reference counted pointer to make the iterator copiable, which I disliked). Instead I propose to make every coroutine<T()> fulfill the the input range concept and every coroutine<void(T)> fulfill the output range concept (this is a functionality not covered by generators).
While I like this suggestion, I worry that having one template specialization model one concept and another specialization model another concept (Input and Output iterators) might be too confusing.
The last suggestion works very well if coupled with the suggestion of making 'self' a full coroutine:
std::vector<int> a = { ... }; std::vector<int> b = { ... };
std::cout << "The result of 'merge(a,b) is:\n"; for(auto x : callcc([&](coroutine<void(int)> c) { std::merge(a.begin(), a.end(), b.begin(), b.end(), c.begin()); }) { std::cout << x << "\n"; }
This does look compelling. Also, the sketch implementation's continuation class is copyable. Conceptually, continuation/coroutine encompasses a resource (context) and should be movable only (like other resources: e.g. file handle wrappers). It would also be great if the coroutine function could return void instead of making the user return (moved) coroutine. I understand why it's needed from the implementation point of view but I wonder if there's a way around it. Regards, Eugene

On Tue, Sep 18, 2012 at 10:45 PM, Eugene Yakubovich <eyakubovich@gmail.com>wrote:
On Mon, Sep 17, 2012 at 11:24 AM, Giovanni Piero Deretta
I like the idea that has been proposed during the review that the result of a coroutine should be retrivable at any time via a get method. I dislike the proposed optional<T> result for operator(). If the previous delayed result functionality is implemented, operator() should just return *this, so that the subsequent get() call can be chained or the result tested for non empty. This change has the potential of deeply affecting the interface, so I would really like to see it subject to a mini review before acceptance.
In this case, get() or operator* (like in your sketch impl) should return T&, not T. Otherwise you get a surprise move: T x = *c; // OK, value has been moved into x. T y = *c; // y contains an empty value
Yes, definitely, get() should return a reference. BTW, I initially thought that operator* and operator++ would work well for the coroutine interface, to mimic the iterator interface, but I think it was a mistake. An explicit get() is better.
It has been suggested that coroutine::self be removed an have a thread specific object. I strongly disagree with this option. First of all it
will
make it too easy to break type safety:
typedef this_coroutine<int(int)> this_coro;
typedef this_coroutine<void()> that_coro;
coroutine<int(int)>([]{ that_coro::yield(); // oops compiles fine, UB at runtime. });
This is a good point. I now think I'd prefer type safety here so I agree that using a thread specific object is bad.
Help me convince Oliver then ;).
This means that the parameter no longer represent the coroutine itself, but, more logically, the caller coroutine. This makes very obvious the analogy between coroutines and channels. It also blurs the division
between
symmetric and asymmetric coroutines. Bonus points if you had a function that infers the coroutine singnature from the coroutine functor operator() (I like to call this function callcc for obvious reasons [ http://en.wikipedia.org/wiki/Call-with-current-continuation]):
auto my_coroutine = callcc([](coroutine<void(int)> caller) { ... } );
I agree that there're similarities with call/cc but I'm not sure it's
exactly the same. call/cc returns the "yielded" value, not a coroutine. It only returns a coroutine if the "yield" was bundled with another call to call/cc. So to avoid the confusion, a different name might be best.
Also technically what it captures is not a full continuation but a one shot continuation so the name is misleading; the literature uses callcc1 for them. The name is a bit obscure and it was an half joke anyway.
I do not like the separation between coroutines and generators. Instead the generator class should be dropped. The reason that I had generator as a separate class in the original coroutine SoC is because I wanted to give it iterator semantics (this also required internally using a reference counted pointer to make the iterator copiable, which I disliked). Instead I propose to make every coroutine<T()> fulfill the the input range concept and every coroutine<void(T)> fulfill the output range concept (this is a functionality not covered by generators).
While I like this suggestion, I worry that having one template specialization model one concept and another specialization model another concept (Input and Output iterators) might be too confusing.
It is not unlike an std::function<void(T)> modelling an UnaryFunction and an std::function<T()> modelling a Generator.
The last suggestion works very well if coupled with the suggestion of making 'self' a full coroutine:
std::vector<int> a = { ... }; std::vector<int> b = { ... };
std::cout << "The result of 'merge(a,b) is:\n"; for(auto x : callcc([&](coroutine<void(int)> c) { std::merge(a.begin(), a.end(), b.begin(), b.end(), c.begin()); }) { std::cout << x << "\n"; }
This does look compelling.
Also, the sketch implementation's continuation class is copyable.
the continuation in my sketch is movable only: 97: continuation(const continuation&) = delete;
It would also be great if the coroutine function could return void instead of making the user return (moved) coroutine. I understand why it's needed from the implementation point of view but I wonder if there's a way around it.
I thought about it, but I couldn't find any way around. As a plus it allows a few tricks like exiting to a different continuation than the caller. Anyways, I think it becomes natural after a while. Also note that you do not really need the explicit move on return in c++11. -- GPD

Le 19/09/12 01:47, Giovanni Piero Deretta a écrit :
On Tue, Sep 18, 2012 at 10:45 PM, Eugene Yakubovich <eyakubovich@gmail.com>wrote:
On Mon, Sep 17, 2012 at 11:24 AM, Giovanni Piero Deretta
It has been suggested that coroutine::self be removed an have a thread specific object. I strongly disagree with this option. First of all it will make it too easy to break type safety:
typedef this_coroutine<int(int)> this_coro;
typedef this_coroutine<void()> that_coro;
coroutine<int(int)>([]{ that_coro::yield(); // oops compiles fine, UB at runtime. });
This is a good point. I now think I'd prefer type safety here so I agree that using a thread specific object is bad.
Help me convince Oliver then ;).
Hi, I didn't know that Oliver was for the thread specific object design. I'm not sure the alternative I proposed is the best one. Compile type safety is too important to be lost. Unfortunately I don't have a solution that is type safe and that avoid to pass an additional parameter to the coroutine function. Oliver, I don't know if you can take care of both alternatives, so that we can experiment with. If it is not the case, it seems that maintaing the self/caller parameter is safer. The thread specific design could always be built on top of actual design, isn't it? Best, Vicente

On Wed, Sep 19, 2012 at 2:20 AM, Vicente J. Botet Escriba < vicente.botet@wanadoo.fr> wrote:
Le 19/09/12 01:47, Giovanni Piero Deretta a écrit :
On Tue, Sep 18, 2012 at 10:45 PM, Eugene Yakubovich <eyakubovich@gmail.com>wrote:
On Mon, Sep 17, 2012 at 11:24 AM, Giovanni Piero Deretta
It has been suggested that coroutine::self be removed an have a thread
specific object. I strongly disagree with this option. First of all it
will
make it too easy to break type safety:
typedef this_coroutine<int(int)> this_coro;
typedef this_coroutine<void()> that_coro;
coroutine<int(int)>([]{ that_coro::yield(); // oops compiles fine, UB at runtime. });
This is a good point. I now think I'd prefer type safety here so I agree that using a thread specific object is bad.
Help me convince Oliver then ;).
Hi,
I didn't know that Oliver was for the thread specific object design.
I'm not sure the alternative I proposed is the best one. Compile type safety is too important to be lost. Unfortunately I don't have a solution that is type safe and that avoid to pass an additional parameter to the coroutine function.
Why do you feel it is important not to pass the additional parameter?
Oliver, I don't know if you can take care of both alternatives, so that we can experiment with. If it is not the case, it seems that maintaing the self/caller parameter is safer. The thread specific design could always be built on top of actual design, isn't it?
A trivial adapter around the coroutine-fn would do it.

On Wed, Sep 19, 2012 at 2:20 AM, Vicente J. Botet Escriba < vicente.botet@wanadoo.fr> wrote:
Le 19/09/12 01:47, Giovanni Piero Deretta a écrit :
On Tue, Sep 18, 2012 at 10:45 PM, Eugene Yakubovich <eyakubovich@gmail.com>wrote:
On Mon, Sep 17, 2012 at 11:24 AM, Giovanni Piero Deretta
It has been suggested that coroutine::self be removed an have a thread
specific object. I strongly disagree with this option. First of all it
will
make it too easy to break type safety:
typedef this_coroutine<int(int)> this_coro;
typedef this_coroutine<void()> that_coro;
coroutine<int(int)>([]{ that_coro::yield(); // oops compiles fine, UB at runtime. });
This is a good point. I now think I'd prefer type safety here so I agree that using a thread specific object is bad.
Help me convince Oliver then ;). Hi,
I didn't know that Oliver was for the thread specific object design.
I'm not sure the alternative I proposed is the best one. Compile type safety is too important to be lost. Unfortunately I don't have a solution that is type safe and that avoid to pass an additional parameter to the coroutine function.
Why do you feel it is important not to pass the additional parameter? I don't know which will be the good interface to provide coroutines by
Le 19/09/12 18:21, Giovanni Piero Deretta a écrit : the language itself, but I think that there will not have such a parameter. I wanted to be as close as possible to this hypothetic interface. The same reason was the origin of the bind suggestion.
Oliver, I don't know if you can take care of both alternatives, so that we can experiment with. If it is not the case, it seems that maintaing the self/caller parameter is safer. The thread specific design could always be built on top of actual design, isn't it?
A trivial adapter around the coroutine-fn would do it.
I believe so. This is way I think the library could provide both. Best, Vicente

On Wed, Sep 19, 2012 at 7:58 PM, Vicente J. Botet Escriba < vicente.botet@wanadoo.fr> wrote:
Le 19/09/12 18:21, Giovanni Piero Deretta a écrit :
On Wed, Sep 19, 2012 at 2:20 AM, Vicente J. Botet Escriba <
vicente.botet@wanadoo.fr> wrote:
Hi,
I didn't know that Oliver was for the thread specific object design.
I'm not sure the alternative I proposed is the best one. Compile type safety is too important to be lost. Unfortunately I don't have a solution that is type safe and that avoid to pass an additional parameter to the coroutine function.
Why do you feel it is important not to pass the additional parameter?
I don't know which will be the good interface to provide coroutines by the language itself, but I think that there will not have such a parameter. I wanted to be as close as possible to this hypothetic interface. The same reason was the origin of the bind suggestion.
I do not think it follows. Consider that: - most of the languages with with coroutines/generators are dynamically typed (python, lua), so all coroutine are of the same type: no type mismatch. Still, those with callcc (scheme, ruby) explicity pass the continuation to the callee. - many languages only allow yielding from the coroutine-fn itself (i.e. are stackless), so the compiler can figure out the correct signature (C#, F#) - many modern functional languages have delimited continuations, whose shift and reset operators explicitly pass around the continuation object. -- gpd
participants (4)
-
Eugene Yakubovich
-
Giovanni Piero Deretta
-
Oliver Kowalke
-
Vicente J. Botet Escriba