
Hello Oliver,

> I've uploaded a new version of boost-threadpool.
>
> changes:
> * free-function get_default_pool() returning a reference to a static
>   pool< unbounded_channel< fifo > >
> * task has a new function result() returning the internal shared_future
> * future-like functions removed from task
> * boost-threadpool is no longer a header-only library
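
Just to check that I am reading the change list correctly, this is roughly
how I would expect the new interface to be used. This is only a sketch: the
header paths and exact signatures below are guesses on my part, since I
could not build the library (see below).

#include <boost/bind.hpp>
// plus the boost.threadpool headers -- I am guessing at the exact paths,
// so they are left out here

int answer()
{ return 42; }

void expected_usage()
{
    // get_default_pool() should return a reference to the static
    // pool< unbounded_channel< fifo > >
    boost::tp::default_pool & pool = boost::tp::get_default_pool();

    // submit() presumably still returns the task; result() now hands back
    // the internal shared_future instead of the removed future-like members
    boost::tp::task< int > t = pool.submit( boost::bind( & answer ) );

    int r = t.result().get();
    (void) r;
}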
Unfortunately, I am unable to compile the tp library: default_pool.cpp
doesn't compile. I edited it enough to make it compile, but then it fails to
link against Boost.Thread. I worked around this by including the .cpp
directly in my project, but I thought I should let you know. The relevant
part looks like this:

namespace detail {
struct static_pool
{ static default_pool instance; };
}

inline default_pool & get_default_pool()
{ return detail::static_pool::instance; }

By the way, why didn't you write it as:

inline default_pool & get_default_pool()
{
    static default_pool instance( poolsize( thread::hardware_concurrency() ) );
    return instance;
}

This would eliminate the need for a .cpp file for default_pool.

Another remark: I am not sure I understand the raison d'ĂȘtre of the poolsize
type. What is its added value over, say, int or std::size_t? It prevents us
from writing

default_pool instance( thread::hardware_concurrency() );

(or any other value, for that matter).

When scheduling several tasks with submit(), I find it odd that there is no
pool.wait() that would let me block until all submitted tasks have finished
(a condition_variable-like pattern, for example). See the P.S. below for the
workaround I am using in the meantime.

I'm currently writing a simple parallel_sort for the threadpool library to
confirm or refute Vicente's findings: a parallel_sort where I split the
input into blocks of size 1000, sort these blocks in parallel and then merge
them (serially at first, then in parallel, depending on how much time I can
allocate to this task).

Some tests done with parallel "filling" of a concurrent_vector show that TBB
does better, but threadpool is not humiliated:

std::fill reverse 0..1000000          ok : 1000000 elapsed: 0.403
boost::tp::fill reverse 0..1000000    ok : 1000000 elapsed: 0.274
tbb::parallel_for fill 0..1000000     ok : 1000000 elapsed: 0.261

The run was made on a Q6600; I don't think we can go much faster for
hardware reasons. I don't remember the Q6600 architecture exactly, but I
believe the four cores don't all get direct memory access (a 2x2 layout, I
think?), on top of the memory bandwidth limitation.

This is how I implemented tp::fill:

template< typename RandomIterator, typename Value >
void parallel_fill( RandomIterator first, RandomIterator last, Value v )
{
    typedef typename RandomIterator::difference_type difference_type;
    difference_type n = std::distance( first, last );

    boost::tp::default_pool p( boost::tp::poolsize( 4 ) );

    RandomIterator l = first;
    RandomIterator i = l;
    const difference_type block_size = 1000;
    typedef boost::tp::task< void > task_type;

    for ( ; i != last; l = i )
    {
        std::advance( i, std::min BOOST_PREVENT_MACRO_SUBSTITUTION(
            std::distance( i, last ), block_size ) );
        BOOST_ASSERT( l < i );
        BOOST_ASSERT( l != last );
        BOOST_ASSERT( i != first );
        p.submit( boost::bind( &std::fill< RandomIterator, Value >, l, i, v ) );
    }
}

Kind regards.

-- 
EA
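
P.S. In the meantime I emulate the missing pool.wait() by keeping the
futures of the submitted tasks around and blocking on them. The sketch below
assumes that submit() returns the task and that task::result() hands back
the internal shared_future, as described in your change list; I could not
verify the exact signatures, so take the names with a grain of salt. The
boost.threadpool includes are omitted since I pull the sources into my
project directly, as explained above.

#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

#include <boost/assert.hpp>
#include <boost/bind.hpp>
#include <boost/thread/future.hpp>  // assuming boost::shared_future lives here

template< typename RandomIterator, typename Value >
void parallel_fill_and_wait( RandomIterator first, RandomIterator last, Value v )
{
    typedef typename RandomIterator::difference_type difference_type;

    boost::tp::default_pool p( boost::tp::poolsize( 4 ) );

    // one shared_future per submitted block; the element type is whatever
    // task< void >::result() actually returns
    std::vector< boost::shared_future< void > > pending;

    const difference_type block_size = 1000;
    for ( RandomIterator l = first, i = first; i != last; l = i )
    {
        std::advance( i, std::min BOOST_PREVENT_MACRO_SUBSTITUTION(
            std::distance( i, last ), block_size ) );
        BOOST_ASSERT( l < i );
        pending.push_back(
            p.submit( boost::bind( &std::fill< RandomIterator, Value >, l, i, v )
                ).result() );
    }

    // poor man's pool.wait(): block until every submitted block is done
    for ( std::size_t k = 0; k < pending.size(); ++k )
        pending[ k ].get();
}

Depending on what the pool destructor does with pending tasks (wait or
cancel), my parallel_fill above may return before all blocks are filled;
keeping the futures makes completion explicit either way.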