
Documentation

The design documentation is very terse and makes some rather sweeping statements. At first I took a "service" to be a single, under-documented concept, but the reference section reveals that services are a mixed bag of things, each implementing some subset of the concepts listed. I really think the design documentation needs more depth and supporting rationale. The reference section needs extending with a more usage-oriented section or sections, as well as documentation on customisation. The tutorials are inadequate; I would suggest at least a simple multi-session tcp/ssl echo server and a tcp/ssl client that can be used both interactively and for stress testing, along with udp equivalents.

Design

I think the design as described by the concepts section is very good, but it would be better if the implementation offered finer-grained selection of the desired concepts. Perhaps this is just an aesthetic concern, and it would be easy to modify the existing services, but if these really are separate concepts, why are they bundled together? I'm concerned that there may be some lack of flexibility in the implementation/interface actually offered that isn't apparent from the design. The issue of separating the sync and async interfaces has already been raised; I would also like to see the ability to expose a reactive rather than proactive interface, by using a reactive_stream_socket_service in place of an async_stream_socket_service. The concepts would seem to allow for this (see the second sketch at the end of this section).

One item for which I have difficulty discerning a clear design rationale is the demuxer. The demuxer as implemented is a repository of services (another woolly concept, as far as I can see) and something that provides the run, interrupt and related functions to control the scheduling/dispatching of handlers in reaction to changes in Async_Object state. Is there anything preventing a more policy-based approach to assembling services?

I would like to see the aspects of a socket that have nothing to do with reading and writing better separated, as this would make for a more consistent interface for other forms of I/O such as pipes, message queues and files. A very simple active object template that supports read and write, but relies on free functions or helper classes for less universal operations such as bind, setting options, etc., used as a base class, allows appropriate (and, in my opinion, necessary) use of runtime polymorphism. Taken to the extreme, the base class is just an Async_Object and doesn't even support read and write. I've used this general approach in an in-house library with some success: an active_file_descriptor (basically asio's Async_Object concept) is created through some form of "opener" such as an acceptor or a connector. There is very little you can do with that object except construct a read_active_file_descriptor or write_active_file_descriptor from it (it is enforced that there can be only one of each); a sketch follows below. This extends naturally to message-oriented rather than stream-based "file descriptors". I should mention that this is a reactor model, not a proactor, but I think the general ideas still apply. The read and write AFDs each carry a handler functor, of course. The intention was for this general idea to support other types of "file descriptor" that don't offer conventional read and write interfaces, and to allow other interfaces (such as proactive ones) to use the same demuxing infrastructure.
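By way of illustration, here is a compilable sketch of that separation. The names (active_fd and friends) are mine, not asio's, and the demuxer registration is elided; it shows the shape of the interface, not a working implementation.

#include <stdexcept>

class active_fd {                    // plays the role of asio's Async_Object
public:
    explicit active_fd(int fd) : fd_(fd), rd_taken_(false), wr_taken_(false) {}
    int handle() const { return fd_; }
private:
    friend class read_active_fd;
    friend class write_active_fd;
    int fd_;
    bool rd_taken_;
    bool wr_taken_;
};

class read_active_fd {               // carries the read handler functor
public:
    template <class Handler>
    read_active_fd(active_fd& afd, Handler handler) : afd_(afd) {
        if (afd.rd_taken_)
            throw std::logic_error("only one reader per active_fd");
        afd.rd_taken_ = true;
        (void)handler;  // would be registered with the demuxer/reactor here
    }
private:
    active_fd& afd_;
};

// write_active_fd is the mirror image; bind(), option setting and the
// like are free functions taking active_fd&, so they need not appear on
// every descriptor-like type. Message-oriented descriptors get
// recv/send analogues.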
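And to make the reactive/proactive point concrete, this is roughly the service interface I would hope the concepts permit. reactive_stream_socket_service does not exist in asio; the name and signatures below are my invention, purely illustrative:

template <class Protocol>
class reactive_stream_socket_service {
public:
    // Notify readiness and let the caller perform the read, rather than
    // completing a read into a caller-supplied buffer:
    template <class Handler>
    void async_wait_readable(int fd, Handler on_readable);
    template <class Handler>
    void async_wait_writable(int fd, Handler on_writable);
    // plain non-blocking read()/write() are then used from the handler
};

A basic_stream_socket instantiated with such a service would present a reactive interface while re-using the same demuxing infrastructure.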
I had hoped to have time to investigate introducing asio as a replacement for, or alternative to, this in-house library, but the timing hasn't been good for that. In any case, I don't think asio would be a good fit for a system like the following (where the other library is in use): components support tcp and ssl for WAN connections and unix domain sockets locally, with some odd cases thrown in, such as communicating over special-purpose point-to-point links implemented as character devices. Large chunks of code have no need to care what exact type of "socket" they are using; some of that code is itself in dynamically linked libraries, and multiple processes use those libraries. Short of placing a wrapper around asio to make it more amenable to separate compilation and to introduce runtime polymorphism, I can't see how I could use asio effectively in this system or anything similar. As far as I can see, asio doesn't even allow interchangeable use of an ssl and a tcp stream (the first sketch following my implementation comments shows the sort of interface I mean). I think this is unacceptable in a general-purpose library. I'm also concerned about potential code bloat, not from different template parameters, but from outright duplication of code in different templates.

Implementation

The code is clean. However, I'm very concerned about the performance (or lack of it) and the weight that a combination of the implementation and some design decisions has introduced. The documentation mentions the memory usage overhead of the proactor emulation on systems using select/epoll, but it is clear that there is considerable allocation and construction overhead involved, and not only on those systems. While much discussion has centred on allocator performance, I'm concerned that the user/library constructor code invoked for each i/o operation is likely to be non-trivial as well. For bulk transfers these issues probably won't be too severe, but for systems where low-latency handling of small messages is critical, these overheads may well be comparable to, or larger than, every other part of the processing for a message/transaction (the second sketch below shows how light a re-armed operation could be).

The design supports, and the documentation encourages, breaking the handling of structured messages into a small header read followed by a larger read, with an appropriate handler for the expected body/chunk, while the implementation appears to perform some quite heavyweight operations compared to just reading a small header. This amounts to encouraging an antipattern for efficient I/O. I would expect an optimistic reactive approach (read the header when notified, determine the size and handler, and opportunistically attempt a read of the body; see the third sketch below) to be significantly faster.

It is implied that an appropriate demuxer/demuxer service can be used to implement various multi-threaded processing schemes. This may well be true. However, an alternative approach is simply to treat the affinity between an Async_Object and its demuxers (one for read, one for write) as dynamic, run a number of separate per-thread demuxers, and move Async_Objects between them as required. That way the handlers for a given Async_Object will only ever run in the expected context. I imagine other approaches could achieve similar results, but it is not clear to me from the design documentation, or from a quick look at the code, that this is the case.
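First, the runtime polymorphism point. What I mean is an abstract stream interface along these lines (my names, not asio's) that transfer-only code can be compiled against, separately and once:

#include <cstddef>

class stream {
public:
    virtual ~stream() {}
    virtual std::size_t read_some(void* data, std::size_t len) = 0;
    virtual std::size_t write_some(const void* data, std::size_t len) = 0;
};

// Can live in a dynamically linked library and is never re-instantiated;
// the concrete transport (tcp, ssl, unix domain socket, character
// device) is chosen elsewhere, at runtime:
std::size_t send_request(stream& s, const void* req, std::size_t len)
{
    return s.write_some(req, len);   // cannot tell ssl from tcp - good
}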
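Second, the per-operation weight. The sketch below is hypothetical (it is not asio's API); it shows what I mean by re-using a single handler and buffer for an Async_Object instance, where re-arming a read costs a pointer enqueue rather than allocation plus copy-construction of a handler:

#include <cstddef>

struct io_op {                        // one per direction, re-used forever
    void* buf;
    std::size_t len;
    void (*on_done)(io_op* self, std::size_t transferred);
};

struct conn {
    char rx_buf[4096];
    io_op rx_op;                      // lives as long as the connection

    conn() {
        rx_op.buf = rx_buf;
        rx_op.len = sizeof(rx_buf);
        rx_op.on_done = &conn::on_read;
    }

    static void on_read(io_op* op, std::size_t n) {
        // consume op->buf[0..n), then re-arm by handing the demuxer the
        // same record again - e.g. demuxer.post(op) - so nothing heavier
        // than a pointer changes hands per operation.
        (void)op; (void)n;
    }
};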
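Third, the header/body antipattern. A reactor-style sketch of the optimistic approach I describe, assuming (purely for illustration) a 4-byte native-endian length prefix; error handling is elided:

#include <sys/types.h>
#include <sys/socket.h>
#include <cstddef>
#include <cstring>

struct framed_connection {
    char buf[65536];
    std::size_t have;

    framed_connection() : have(0) {}

    void on_readable(int fd)          // invoked by the reactor
    {
        // read whatever has arrived - usually header AND body together,
        // so the common case is one recv() and one pass over the loop
        ssize_t n = recv(fd, buf + have, sizeof(buf) - have, 0);
        if (n <= 0) return;           // EOF/error handling elided
        have += static_cast<std::size_t>(n);
        std::size_t body;
        while (have >= 4 && have >= 4 + (body = decode_header(buf))) {
            process(buf + 4, body);   // a complete message is available
            std::memmove(buf, buf + 4 + body, have - (4 + body));
            have -= 4 + body;
        }
        // a short read is the rare case: just wait to be notified again
    }

    std::size_t decode_header(const char* p) {
        std::size_t n = 0;
        std::memcpy(&n, p, 4);        // 4-byte native-endian length prefix
        return n;
    }
    void process(const char*, std::size_t) { /* application logic */ }
};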
I haven't reviewed the code to a level of detail where I can be sure I understand the dispatching done by the epoll and select reactors, but it looks to me as though multiple threads may all wait on the same fdset (or equivalent)? Is this right? Intended? I had expected some form of modified leader/followers pattern here.

Conclusion

As I mentioned above, had the review been at a different time I would have at least attempted to port some code (even if only test/benchmark code) from an in-house reactor library to asio. As it is, my comments are based on a fair amount of experience with network and async (in its most general sense) I/O, from the bit level (literally) up to the application level, on a number of OSs/kernels/platforms and libraries (including ACE), plus a few hours spent reading asio's code and docs. I have a healthy distrust of my (and everyone else's) guesses regarding performance, and the author and review manager should treat my points above regarding performance as areas for investigation only.

A fair amount of the above amounts to a request for reactor support (which it is, partly), but I would consider mechanisms that support very lightweight proactor use highly desirable irrespective of reactor support. In particular, some form of re-use of a single handler and buffer for an Async_Object instance, without allocation or (copy-)construction of anything heavier than a pointer, should be quite practical (the second of the three sketches above shows the sort of thing I mean). That, together with support for "socket type" runtime polymorphism, is the minimum set of changes I would consider necessary for acceptance (along with the inevitable request for better docs).

At this point I vote no, but the library is very promising and the concepts are generally well thought out. I would like to encourage Christopher to continue the already excellent work.

Regards,
Darryl Green