[boost] Critique of Boost.Threads

4 May 2005

With Kevlin Henney's permission, the following is a repost of a message he
recently posted on the C++ committee's library reflector. It gives his 
views on Boost.Threads in the context of possible standardization.

--Beman

-------

In considering a threading library for C++ based on a fairly standard
set of primitives, there are essentially two parts to such a library and
it is worth considering these separately: the actual threading part, ie
the part of the library concerned with the creation, execution and
management of separate threads of execution; and the synchronisation part,
ie the part of the library concerned with locking. The two parts are
obviously related in terms of execution, but there is no necessary
relationship between them in terms of interface dependencies. These two
parts are also in addition to the necessary guarantees of a reasonable
memory model and other primitives appropriate for lock-free programming,
which are also under related but separate discussion.

For the C++ standard library a reasonable goal for both the threading
and the synchronisation parts would be to adopt a generic-programming
style of design. Although Boost.Threads uses templates in expressing
some of its capabilities, it is a relatively closed design wrt
extension, and hence not what would normally be termed generic. However,
Boost.Threads does offer a starting point, and with only a few
considerations and changes it is relatively easy to evolve a more open
and appropriate design from it.

Considering first the question of the threading part, Boost.Threads is
currently based on the idea that a thread is identified with an object
that launches it. This notion is somewhat confused by the idea that on
destruction the thread object is destroyed but the thread is not -- in
other words the thread is not identified with the thread object... except
when it is. There needs to be a separation of concepts here, which I
will come back to in a moment.

Another appropriate separation is the distinction between initialisation
and execution. These are significantly different concepts but they are
conflated in the existing thread-launching interface: the constructor is
responsible both for preparing the thread and launching it, which means
that it is not possible for one piece of code to set up a thread and
another to initiate it separately at its own discretion, eg thread
pools. Separating the two roles into constructor and executor function
clears up both the technical and the conceptual issue. The executor
function can be reasonably expressed as an overloaded function-call
operator:

         void task();
         ...
         thread async_function;
         ...
         async_function(task);

The separation also offers a simple and non-intrusive avenue for
platform-specific extension of how a thread is to execute: configuration
details such as scheduling policy, stack size, security attributes, etc,
can be added as constructors without intruding on the signatures of any
other function in the threading interface:

         size_t stack_size = ...;
         security_attributes security(...);
         thread async_function(stack_size, security);

The default constructor would be the feature standardised, and an
implementation would be free to add additional constructors as
appropriate.

So far, this leads to an interface that looks something like the
following:

         class thread
         {
         public:
                 thread();
                 template<typename nullary_function>
                   void operator()(nullary_function);
                 void join();
                 ...
         };
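
To make the separation concrete, here is a minimal sketch of such a
launcher. It assumes a C++11 compiler and uses std::thread purely as a
stand-in for the platform's native thread; the name thread_launcher and
the choice to detach on destruction are illustrative assumptions rather
than part of any proposal:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical launcher: construction prepares, operator() launches.
class thread_launcher
{
public:
    thread_launcher() {}                       // launches nothing
    ~thread_launcher() { if (worker.joinable()) worker.detach(); }

    template<typename nullary_function>
    void operator()(nullary_function task)     // the executor function
    {
        if (worker.joinable()) worker.detach();  // let go of any earlier thread
        worker = std::thread(std::move(task));
    }

    void join() { if (worker.joinable()) worker.join(); }

private:
    thread_launcher(const thread_launcher &);             // non-copyable
    thread_launcher & operator=(const thread_launcher &);
    std::thread worker;
};

std::atomic<int> calls(0);
void task() { ++calls; }

int example()
{
    thread_launcher async_function;   // set-up only: nothing runs yet
    int before = calls;
    async_function(task);             // launched at the caller's discretion
    async_function.join();
    assert(calls == before + 1);
    return calls - before;
}
```

The set-up and the launch are now separate statements, so they could
equally live in separate pieces of code.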

Given that the same configuration might be used to launch other threads,
and given the identity confusion of a thread being an object except when
it's not, we can consider the interface not to be the interface of a
thread but to be the interface of a thread launcher, also generally
known as an executor:

http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/Executor.html
http://www.cse.buffalo.edu/~crahen/papers/Executor.Pattern.pdf

A thread initiator can submit zero-argument functions and function
objects to an executor for execution:

         threader run;
         ...
         run(first_task);
         run(second_task);

A standard library would offer a standardised threader type, but as a
concept an executor could be implemented in a variety of ways that still
conform to the same basic launching interface, ie the function-call
operator. As a pleasing side effect, the potential confusion over the
name "thread" is also removed: entities are named after their roles.
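
As a rough illustration of the executor as a concept rather than a type,
the sketch below shows two hypothetical executors (inline_executor and
deferred_executor are names invented here) that share nothing except the
function-call operator, plus a function template written against that
operator alone. Neither starts a thread, which keeps the example
deterministic; it assumes a C++11 compiler for std::function:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Runs each task immediately, in the caller's own thread.
class inline_executor
{
public:
    template<typename nullary_function>
    void operator()(nullary_function task) { task(); }
};

// Stores tasks and runs them only when asked to.
class deferred_executor
{
public:
    template<typename nullary_function>
    void operator()(nullary_function task) { pending.push_back(task); }

    void drain()
    {
        for (std::size_t i = 0; i != pending.size(); ++i) pending[i]();
        pending.clear();
    }

private:
    std::vector<std::function<void()> > pending;
};

std::vector<int> trace;
void first_task()  { trace.push_back(1); }
void second_task() { trace.push_back(2); }

// Depends only on the launching interface, not on a concrete type.
template<typename executor>
void submit_both(executor & run)
{
    run(first_task);
    run(second_task);
}

int example()
{
    inline_executor now;
    submit_both(now);
    int after_inline = static_cast<int>(trace.size());   // both tasks ran

    deferred_executor later;
    submit_both(later);
    int before_drain = static_cast<int>(trace.size());   // nothing new yet
    later.drain();
    int after_drain = static_cast<int>(trace.size());    // both ran again

    assert(before_drain == after_inline);
    return after_inline * 100 + before_drain * 10 + after_drain;
}
```

A threader would be one more model of the same concept, and
submit_both would accept it unchanged.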

Given that a threader can be used to launch multiple threads, there is
the obvious question of how to join with each separately run thread.
Instead of returning void, the threader returns an object whose primary
purpose is to represent the ability to join with the completion of a
separately executing thread of control:

         class threader
         {
         public:
                 threader();
                 template<typename nullary_function>
                   joiner operator()(nullary_function);
                 ...
         };

The role played by the joiner in this fragment is that of an
asynchronous completion token, a common pattern for synchronising with
and controlling asynchronous tasks:

http://www.cs.wustl.edu/~schmidt/PDF/ACT.pdf

Via the joiner the initiator can poll or wait for the completion of the
running thread, and control it in other ways -- eg request cancellation
of the task (should we choose to attempt standardising in some form this
often contentious and challenging capability).

The joiner would be a copyable, assignable and default constructible
handle, and as its principal action the act of joining can be expressed
as a function-call:

         joiner wait = run(first_task);
         ...
         wait();

If there are no joiners for a given thread, that thread is detached, a
role currently played in Boost.Threads by the thread destructor:

         run(second_task); // runs detached because return value ignored
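
One way the threader and joiner pair might be put together is sketched
below. This is an assumption-laden illustration, not a reference
implementation: it relies on C++11 (std::thread, std::shared_ptr,
std::mutex), and the shared_state helper is a name made up for the
sketch. The thread is detached when the last joiner referring to it goes
away without anyone having joined:

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

class joiner
{
public:
    joiner() {}                        // empty handle: joining is a no-op

    void operator()() const            // join; repeat calls return at once
    {
        if (state) {
            std::lock_guard<std::mutex> guard(state->lock);
            if (state->worker.joinable()) state->worker.join();
        }
    }

private:
    friend class threader;

    struct shared_state                // hypothetical helper, shared by copies
    {
        std::thread worker;
        std::mutex lock;
        ~shared_state() { if (worker.joinable()) worker.detach(); }
    };

    explicit joiner(std::shared_ptr<shared_state> s) : state(std::move(s)) {}
    std::shared_ptr<shared_state> state;
};

class threader
{
public:
    template<typename nullary_function>
    joiner operator()(nullary_function task)
    {
        std::shared_ptr<joiner::shared_state> state =
            std::make_shared<joiner::shared_state>();
        state->worker = std::thread(std::move(task));
        return joiner(state);          // dropping this detaches the thread
    }
};

std::atomic<int> done(0);
void first_task() { ++done; }

int example()
{
    threader run;
    joiner wait = run(first_task);
    joiner copy = wait;                // handles are copyable
    wait();
    copy();                            // already joined: returns immediately
    assert(done == 1);
    return done;
}
```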

The final piece in this sketched refinement is the recognition that
common across many C threading APIs is the ability to return a result
from the function executed asynchronously (void * in Pthreads and DWORD
in Win32). It seems to have become habit to discard this value in more
OO interfaces, giving a result of void. For many threaded tasks this
makes sense, but where a thread is working towards a result then the
idea that an asynchronously executed function can return a value for
later collection should not be discarded. With a void-returning
interface the programmer is forced to set up an arrangement for the
threaded task to communicate a value to the party that wants the
one-time result. This is tedious for such a simple case, and can be
readily catered for by making the joiner a proper future variable that
proxies for the result:

http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/Future.html
http://www.laputan.org/pub/sag/act-obj.pdf
http://kasparov.skife.org/blog/src/futures.html

This leads to a threader interface like the following:

         class threader
         {
         public:
                 threader();
                 template<typename nullary_function>
                   joiner<typename result_of<nullary_function()>::type>
                     operator()(nullary_function);
                 ...
         };

For the common default configured threader, a wrapper function, thread
(as verb rather than noun), can be provided:

         template<typename nullary_function>
           joiner<typename result_of<nullary_function()>::type>
             thread(nullary_function);

And usage as follows:

         void void_task();
         int int_task();
         ...
         joiner<void> wait_for_completion = thread(void_task);
         joiner<int> wait_for_value = thread(int_task);
         ...
         int result = wait_for_value();
         wait_for_completion();
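
For what it is worth, the usage above can be approximated with very
little code. The sketch below is only that, a sketch under stated
assumptions: it leans on C++11's std::packaged_task and
std::shared_future to carry the result, uses decltype in the role of
result_of, and names the wrapper spawn (an invented name) rather than
thread so that it cannot be confused with std::thread:

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <thread>
#include <utility>

template<typename result_type>
class joiner
{
public:
    joiner() {}   // empty handle; calling it is outside this sketch
    explicit joiner(std::shared_future<result_type> r) : result(std::move(r)) {}

    // Waits for completion and yields the result (nothing for void).
    result_type operator()() const { return result.get(); }

private:
    std::shared_future<result_type> result;   // copyable, like the joiner
};

// Stand-in for the thread() wrapper function described above.
template<typename nullary_function>
auto spawn(nullary_function task) -> joiner<decltype(task())>
{
    typedef decltype(task()) result_type;
    std::packaged_task<result_type()> job(std::move(task));
    joiner<result_type> handle(job.get_future().share());
    std::thread(std::move(job)).detach();     // joiner is the only link back
    return handle;
}

std::atomic<bool> ran(false);
void void_task() { ran = true; }
int int_task() { return 6 * 7; }

int example()
{
    joiner<void> wait_for_completion = spawn(void_task);
    joiner<int> wait_for_value = spawn(int_task);

    int result = wait_for_value();
    wait_for_completion();
    assert(ran);
    return result;
}
```

The caller never touches a mutex or condition variable: collecting the
value and synchronising with the end of the task are the same act.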

The benefit of programming with futures is that for a certain class of
code that would use end-of-thread synchronisation to pick up results,
programmers are not presented with unnecessarily low-level
synchronisation APIs. The function-based model is applied consistently.
The sequence of steps, and the design considerations involved, to get
from Boost.Threads to this design is hopefully clear and can be seen to
be grounded in a great deal of existing practice, plus the benefit of a
more native C++ appearance.

Now, in terms of synchronisation the Boost.Threads library offers a
number of primitives, such as mutexes, but unfortunately couples them to
a relatively closed programming model. The synchronisation primitives
provided do not lead to a general model for locking and force the user
into using resource acquisition objects when this is not always the best
solution. Given that C++ should aim to be the language and API of choice
for systems-level work, the restriction of a mandatory scoped-locking
interface does not seem appropriate.

A more orthogonal approach based on the capabilities common to
C++ synchronisation libraries, Boost.Threads included, would be a
good starting point: lockability and locking strategy are kept separate.

Taking a more generic approach, this means considering a related set of
lock categories, eg Lockable and TryLockable. Something like Lockable
would (syntactically) be little more than a.lock() and a.unlock(), and
something like TryLockable would extend with a.try_lock(). Boost.Threads
has some of this but restricts the interface of the lockable types
themselves. With respect to these categories both primitives (eg mutex)
and higher-level types (other externally locked objects) can be
implemented.
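
A small sketch may help show how little the categories demand. Below, a
home-made spin_lock (an illustrative type, not something taken from
Boost.Threads) models TryLockable simply by having the right member
functions, and two function templates (names invented here) are written
against the syntactic requirements only. C++11's std::mutex and
std::atomic are assumed:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Models TryLockable with no relationship to any library mutex type.
class spin_lock
{
public:
    spin_lock() : held(false) {}
    void lock()     { while (held.exchange(true, std::memory_order_acquire)) {} }
    bool try_lock() { return !held.exchange(true, std::memory_order_acquire); }
    void unlock()   { held.store(false, std::memory_order_release); }
private:
    std::atomic<bool> held;
};

// Requires Lockable only: a.lock() and a.unlock().
template<typename lockable>
int locked_increment(lockable & a, int & counter)
{
    a.lock();
    int value = ++counter;
    a.unlock();
    return value;
}

// Requires TryLockable: a.try_lock() in addition.
template<typename try_lockable>
bool is_free(try_lockable & a)
{
    if (!a.try_lock()) return false;
    a.unlock();                        // only probing, so release at once
    return true;
}

int example()
{
    int counter = 0;
    std::mutex primitive;
    spin_lock home_made;

    locked_increment(primitive, counter);   // library primitive
    locked_increment(home_made, counter);   // user-defined lockable

    home_made.lock();
    bool free_while_held = is_free(home_made);
    home_made.unlock();
    bool free_after = is_free(home_made);

    assert(!free_while_held && free_after);
    return counter;
}
```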

There is no dependency on the locking strategies that can be used
against lockable objects -- scope bound, transaction linked,
smart-pointer based, etc. The library would be expected to provide at
least scope lockers to simplify the tedious and error-prone business of
ensuring that scope-related critical regions are exception safe, but
programmers would be free to define additional ones based on specific
needs. There is also no requirement to pollute the nested scope of a
lockable class with special typedefs.
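
As an illustration of a strategy that is independent of what it locks,
here is a minimal scope locker parameterised on any Lockable. The names
scope_locker and counted_lockable are assumptions made for this sketch,
and std::mutex (C++11) stands in for whatever primitive the library
would provide. Note that the lockable class declares no nested typedefs:

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Scope-bound strategy: works with anything offering lock() and unlock().
template<typename lockable>
class scope_locker
{
public:
    explicit scope_locker(lockable & to_lock) : target(to_lock) { target.lock(); }
    ~scope_locker() { target.unlock(); }
private:
    scope_locker(const scope_locker &);               // non-copyable
    scope_locker & operator=(const scope_locker &);
    lockable & target;
};

// A higher-level lockable that records whether it is currently held.
class counted_lockable
{
public:
    counted_lockable() : depth(0) {}
    void lock()   { guard.lock(); ++depth; }
    void unlock() { --depth; guard.unlock(); }
    int held() const { return depth; }   // read single-threaded in this sketch
private:
    std::mutex guard;
    int depth;
};

int example()
{
    counted_lockable resource;
    int held_inside = 0;

    try {
        scope_locker<counted_lockable> critical(resource);
        held_inside = resource.held();
        throw std::runtime_error("leave the critical region early");
    } catch (const std::runtime_error &) {
        // the locker's destructor has already released the resource
    }

    assert(held_inside == 1 && resource.held() == 0);
    return held_inside * 10 + resource.held();
}
```

A transaction-linked or smart-pointer based locker could be written
against the same two member functions without touching either the
lockable types or the scope locker.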

The Boost.Threads synchronisation library does not appear to satisfy the
goal of openness and genericity. This is in part based on its
well-intentioned but mistaken belief that preventing programmers from
using locks manually is the key to safe code: the only way to guarantee
safety is to prevent users from using multiple threads and locks at all.
It is easy but
annoying to work around the restriction (eg dynamically allocate the
locker objects and control the lifetime on the heap instead of
automatically wrt scope), but a programmer should not have to work
around it at all. The Boost.Threads documentation
itself admits that its restricted approach is extreme, but a more open
and proven -- and less extreme -- approach is probably a better fit for
standardisation.
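
The workaround mentioned above looks roughly like the following. This is
a sketch with stand-ins, not Boost.Threads code: std::lock_guard (C++11)
plays the part of a locker that can only acquire on construction and
release on destruction, and spin_lock and transaction are names invented
for the example. Putting the locker on the heap is what lets the lock
outlive the scope that took it:

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>

// Simple lockable so that probing with try_lock is well defined here.
class spin_lock
{
public:
    spin_lock() : held(false) {}
    void lock()     { while (held.exchange(true, std::memory_order_acquire)) {} }
    bool try_lock() { return !held.exchange(true, std::memory_order_acquire); }
    void unlock()   { held.store(false, std::memory_order_release); }
private:
    std::atomic<bool> held;
};

// Stand-in for a locker restricted to scoped acquisition.
typedef std::lock_guard<spin_lock> scoped_only_locker;

// Lock held from begin() to commit(), not tied to any one scope.
// Each begin() must be matched by a commit() before the next begin().
class transaction
{
public:
    explicit transaction(spin_lock & to_lock) : resource(to_lock) {}
    void begin()  { held.reset(new scoped_only_locker(resource)); }
    void commit() { held.reset(); }    // destroying the locker unlocks
private:
    spin_lock & resource;
    std::unique_ptr<scoped_only_locker> held;
};

int example()
{
    spin_lock resource;
    transaction work(resource);

    work.begin();                               // lock taken in one call...
    bool free_during = resource.try_lock();
    work.commit();                              // ...and released in another
    bool free_after = resource.try_lock();
    if (free_after) resource.unlock();

    assert(!free_during && free_after);
    return (free_during ? 10 : 0) + (free_after ? 1 : 0);
}
```

With lock() and unlock() available on the lockable itself, the heap
allocation and the indirection would both disappear.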

Therefore, the design we need to aim for is one that recognises that any
use of explicit threading or any form of synchronisation is the
programmer's responsibility. It should make common tasks easier and less
error prone than they might otherwise be -- eg the provision of scope
lockers -- but it should also allow the programmer the freedom to
implement locking strategies as they see fit rather than force them to
work around the API or seek another.

I've only sketched it out, but hopefully there is enough rationale there
to see why, without some refinement, Boost.Threads is not quite the
right interface and execution model to standardise, but also how few
hops it takes to build from Boost.Threads to an alternative API model
that meets a reasonable and reasoned set of design objectives.

Kevlin
--

Beman Dawes