
----- Original Message -----
From: "Anthony Williams" <anthony.ajw@gmail.com>
To: <boost@lists.boost.org>
Sent: Saturday, November 01, 2008 11:03 PM
Subject: Re: [boost] [threadpool] new version v12
k-oli@gmx.de writes:
On Saturday, 1 November 2008 19:35:23, Vicente Botet Escriba wrote:
IMO, the implementation of the fork/join semantics does not need fibers. The wait/get functions can call into the thread_pool scheduler without a context switch. What are the advantages of using fibers over calling the scheduler recursively?
Please take a look into the examples folder of threadpool. You will find two examples that calculate Fibonacci recursively. Configure the pool with tp::fibers_disabled and try to calculate fibonacci(3) with two worker threads. Your application will block forever. Use the option tp::fiber_enabled and you can calculate any Fibonacci number without blocking.
I haven't looked at Oliver's use of Fibers, but you don't need to use fibers to do this. Whenever you would switch fibers to a new task, just call the task recursively on the current stack instead. The problem here is that you may run out of stack space if the recursion is too deep: by creating a Fiber for the new task you can control the stack space.
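In other words, something along these lines (just a sketch of the idea; run_pending_task() and wait_for_work() are hypothetical hooks into the pool's scheduler, not part of any of the proposed libraries):

    template< typename R >
    R task< R >::get()
    {
        // instead of blocking or switching to a fiber, keep this worker busy
        while ( ! future_.is_ready())
        {
            if ( ! pool_.run_pending_task())   // pop one queued task and run it
                                               // right here, on the current stack
                pool_.wait_for_work();         // queue empty: block until work arrives
        }
        return future_.get();
    }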
Thanks, Anthony, for answering my question. The 'run out of stack space' problem already exists for the sequential recursive algorithm. Of course with fibers you avoid the deep-recursion problem, but now you need to reserve a specific stack space for each fiber. IMO tasks should be more lightweight than fibers. I'm interested in knowing the overhead of using fibers with respect to the recursive call. I'm not saying that fibers are not useful; a fiber library would be useful on its own in a lot of contexts, and I'm also waiting for the review of the coroutine library. Maybe it should be up to the end user to choose between a fiber-based implementation and a recursive one.
The problem with doing this (whether you use Fibers or just recurse on the same stack) is that the nested task inherits context from its parent: locked mutexes, thread-local data, etc. If the tasks are not prepared for this, the results may not be as expected (e.g. a thread waits on a task, resumes after waiting, and finds all its thread-local variables have been munged).
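For example (a contrived sketch, not actual threadpool code; pool, task and submit just stand in for whatever interface is used, and get() is assumed to run another queued task inline on the calling thread, as described above):

    boost::mutex m;
    boost::thread_specific_ptr< int > tls;

    void child()
    {
        boost::mutex::scoped_lock lk( m);   // deadlock: this thread already holds m
        tls.reset( new int( 42));           // and this would clobber the parent's
    }                                       // thread-local value

    void parent( pool & p)
    {
        tls.reset( new int( 1));
        boost::mutex::scoped_lock lk( m);
        task< void > t( p.submit( child));
        t.get();                            // may execute child() inline on this thread
        assert( *tls == 1);                 // so this can fail, or we never even get here
    }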
You are right, and this reinforces my initial thinking. We need to distinguish between tasks (root tasks) and sub_tasks (which inherit the context of their parent task).

* In addition we could have a default thread pool (the_thread_pool) and access to the current task (this_task). How this default thread pool is built must still be defined. This would let us write things like

    int seq_fibo( int n)
    {
        if ( n <= 1) return n;
        else return seq_fibo( n - 2) + seq_fibo( n - 1);
    }

    int par_fibo( int n)
    {
        using namespace boost::tp;
        if ( n <= 3) return seq_fibo( n);
        else
        {
            sub_task< int > t1( this_task::submit( par_fibo, n - 1));
            sub_task< int > t2( this_task::submit( par_fibo, n - 2));
            return t1.get() + t2.get();
        }
    }

    int fibo( int n)
    {
        using namespace boost::tp;
        task< int > t( the_thread_pool::submit( par_fibo, n));
        return t.get_future().get();
    }

* Independently of whether the implementation uses fibers or recursive calls into the worker thread scheduler, other blocking functions could be wrapped so that the current worker thread schedules other tasks/sub_tasks, doing a busy wait instead of a blocking wait. For example a task can wait on a condition that will be provided by other tasks/sub_tasks, so the end user should be able to write something like this_working_thread::wait(condition). The library must therefore provide a mechanism, e.g. a one_step_schedule function, that allows writing this kind of wrapper:

    void this_working_thread::wait( boost::condition & cnd)
    {
        while ( ! cnd.try_wait())
        {
            this_working_thread::one_step_schedule();
        }
    }

* One additional feature I would like to see in Boost, either in the thread pool library or in a separate library: being able to submit/schedule tasks at a given time (callouts), either one-shot or periodic. Something like

    timeout_task t1 = the_thread_pool::submit_at( absolute_time, funct);
    timeout_task t2 = the_thread_pool::submit_at_periodically( absolute_time, period, funct);

Of course the scope of the library would then be wider, and TaskScheduler could be a more adequate name for it. The implementation of the timeout scheduler could be based on "Redesigning the BSD Callout and Timer Facilities" by Adam M. Costello and George Varghese:
http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=B5202FC949FF3EDB0E789F68F509950C?doi=10.1.1.54.6466&rep=rep1&type=pdf

BTW Oliver,

* could the interrupt function extract the task from the queue if the task is not already running?
* as a task can only be on one queue, maybe the use of intrusive containers could improve performance
* the fiber queue is a std::list and you use its size() function, which can have O(n) complexity; this should be improved in some way (Boost.Intrusive already provides a list with constant-time size; see the sketch at the end of this mail)

Best,
Vicente
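P.S. Regarding the last point, a rough sketch of what I have in mind with Boost.Intrusive (the hook placement and the type names are only illustrative, not the threadpool's actual types):

    #include <boost/intrusive/list.hpp>

    namespace bi = boost::intrusive;

    // the queue node lives inside the fiber object itself, so no extra
    // allocation is needed when a fiber is linked into a queue
    struct fiber : public bi::list_base_hook<>
    {
        // ...
    };

    // constant_time_size keeps an element count up to date, so size() is O(1)
    typedef bi::list< fiber, bi::constant_time_size< true > > fiber_queue;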