
First of all, I think the library as some issues and it should not be accepted as it is, but see below. - What is your evaluation of the design? The interface is pretty simple and it seems fine on a first evaluation, but I think there are some issues. The first issue is that the destructor of the context class will (at least by default) happily destroy the stack. This unfortunately will leak any resource owned by any object on the stack; This is not unlike destroying an allocator before having destroyed all allocated object. A stack can only be safely deallocated if either has been completely unwound or if it doesn't own any resources (note that is not enough to just destroy any stack allocated object). Context should either: a) provide an interface to unwind the stack (by automatically trowing a special exception for example), or b) detect whether the stack has not been unwound and assert in debug mode in the destructor. As it is, is not clear the level of abstraction this class wants to be. It looks like is trying, using RAII, to manage context creation and destruction safely, giving a false sense of security. Context is neither copyable nor movable; I can understand the reason for making it non-copyable, but I see no reason why it shouldn't be movable. Without either capability it is hard to make context interoperate with existing code without forcing to dynamic allocation. Context is templated on both the context implementation and the stack. The first problem is that the concept requirements for the Stack template arguments are not stated. Second, the stack type should not be part of the context type: the primary reason for specifying different stacks is for option of having the guard page: enabling this should not change the type of Context and it should possible to specify it at runtime. Basically I'm advocating for type erasure; note that it can (and should) be implemented without additional memory allocation. The trampoline function can be defined in such a way that the custom deleter will be invoked after the stack is unwound. Regarding the CtxT type, instead of this template parameter there should be a program wide default controllable at compile time by a macro; I saw elsethread that Oliver is already leaning on this idea. As it is the context class is IMHO flawed and I'm against its inclusion in boost right. I'll be very happy to change my vote if I see a better implementation. A good abstraction layer around the low level context, among other things, should support clean unwind of the stack (or detecting an not-unwound stack), generic functors (and not just function pointers) for type safety and custom deleters. On the other hand the detail fcontext class seems a good, if maybe not complete, portable implementation of a low level ucontext-like API. I would be in favor of a simpler version of context which consists solely on the fcontext class (with a guarantee that this type is a POD), plus the make_context and swap_context functions. A user will have lower expectations from such a low level API and at the same time will give more freedom for other people to implement on top of it. A safe context class would just be one of the client of the API, other library authors could either use Context of the underlying fcontex directly depending on wether they need the speed or safety. I think this basic API could still be tweaked [but my vote won't be conditional on that] to add extra functionality: * Instead of supporting a 'fc_link' a-la ucontext, have the callback function return a new context (it is void now). 'fc_link' is useful only on some limited cases (usually when you have an external scheduler managing your contextes), while the having the callback return the 'next continuation' has more applications (note that fc_link can be implemented easily in term of the continuation returning variant, but the reverse is harder). This is very useful, for example, to implement some kind of 'exit_to' functionality (unwind the stack and restore another context), a key piece for safe context destruction. * Give to fcontext_swap the ability to exchange a parameter with the other context: fcontext_parm_t swap_fcontext( fcontext_t * ofc, fcontext_t const* nfc, fcontext_parm_t parm) where fcontext_parm_t is either union fcontext_parm_t { void * ptr; int int_; }; or simply intptr_t; Swap_fcontext saves the current context in ocf and restores the context ncf as usual; additionally, if ofc was 'captured' on a call to swap_fcontext, this call will return the value of parm on the original context. This extra parameter is equivalent to the second parameter of longjmp. If ofc was created with make_fcontext, 'parm' will be an extra parameter to the function. Note that because you can't mix and match parameterful and parameterless swap_fcontext, you should only have the former as it can be potentially implemented with little or no overhead. * Add a variant of swap_context that allows executing a function on the destination context (in this case 'parm' would be passed parameter to this function. This is mostly useful for injecting an exception on the destination context. This can be potentially implemented on top of the existing functionality, but would slow down every context switch; a possible efficient implementation would manipulate the destination stack to add a call to the function on top of it, so that the cost is paid only when required. All the previous improvements are implementable unobtrusively on top of the exiting API, but such implementations will be much less efficient. I'm suggesting that they should be part of the basic API becausecan be implemented with zero or near zero overhead when not used. - What is your evaluation of the implementation? The implementation seems reasonably efficient, but I have seen elsethread suggestions to make the constructor more expensive (read multiple memory allocations) in exchange for flexibility; this is IMHO wrong, the constructor should not be assumed to be a non critical operation. In fact I think that most of the proposals can be efficiently implemented without additional memory allocation by reusing the already available stack (which can be reused). Most of the 'throw' statements in the context class should actually be asserts. Make_fcontext requires a call to get_fcontex first, a-la ucontext: there is no reason for this, just merge get_context with make_context (and drop the unused set_fcontext). The posix ucontext wapper and the windows fiber wrapper should be made API compatible with fcontext. Regarding the assembler implementations of fcontext, I think they could be simplified if the fcontext itself was represented as a single pointer to the top of the stack. This would save some instructions to move the source address from the stack return address to the fcontext struct and back. The i386 implementation could use register calling convention to simplify the implementation and speed it up. Both the i386 and amd64 asm implementation use the 'ret' instruction to restore instruction pointer of the destination context. This will always be mispredicted (costing 20 cycles or more, easily more than the rest of the swapping code). Just jmping to the the destination will save a push and will give the CPU a chance to predict the jmp (modern CPUs have some dynamic prediction capability for indirect jumps). Bot the i386 and amd64 asm implementation save and restore the fpu control word (the amd64 both x87 and SSE2 control words). Last time I timed it, this was a relatively an expensive operation, not sure about current architectures; I do not think this should be saved and restored: the compiler will treat it as a caller clobbered variable and will take care of it in case of implicit manipulations. The user should be prohibited from changing it across a swap_fcontext call. As a minimum, there should be a compile time option to disable the saving. Note that many posix implementation of the ucontext API need to restore it (along with the full register set) because on traditional unices a context could be captured asynchronously by a signal, but it is not the case with boost.context. BTW, the current code is already inconsistent because it is saving the SSE2 control world on amd64 but not on i386. For posix systems lacking ucontext, fcontext can be implemented using the sigaltstack trick (this can actually be faster than ucontext if the non signal restoring _longjmp is used). For reference see: http://www.gnu.org/software/pth/rse-pmt.ps Finally, an implementation on top of boost threads would be nice. It will be very slow, but useful for portability and when performance is not an issue; also won't probably be much slower than ucontext. - What is your evaluation of the documentation? The documentation is very minimalist and should be expanded. It assumes the user is already familiar with the context swapping concept. The posix ucontext APIs man page are a good start. The fcontext class should be documented. The first time I read the documentation I understood that the constructor captured the current context a-la setjmp (a very bad idea, IMHO). I had to read the implementation to convince myself it was not the case. - What is your evaluation of the potential usefulness of the library? This is definitely not going to be one of the most used boost libraries, but it is a required piece to implement other libraries like coroutines, user lever threading, tasks, etc. - Did you try to use the library? With what compiler? Did you have any problems? I didn't try to use the library. - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? I did an in-depth study of the implementation. - Are you knowledgeable about the problem domain? I consider myself fairly experienced in this domain: As part of the GSoC project Boost.Coroutine I did some pretty in depth research on the topic and I implemented a very similar functionality; BTW, I see that Context borrows many ideas from Boost.Coroutine implementation (some of the tricks in the fiber wrapper look quite familiar); a small acknowledgement would be appreciated :). -- gpd