[asio] Formal Review

30 Dec 2005

      Hi everybody, I hope not to have missed the review deadline...
First of all I vote *not* to accept asio in boost yet. I think that the 
library is very promising but there are some issues.  If the library is 
not accepted, is strongly encourage the author to subimit it again once 
the issues are resolved.

- What is your evaluation of the design?

Promising.
I think that the most important innovation of asio is the dispatcher 
concept and the async call pattern.  Everything else is just *extra*.
It probably should be removed in some way from asio and made a separate 
boost library (or may be just rename asio to something like "asynch lib" 
or whatever).

The interface provided by asio is very low level, but i think this is a 
strenght of asio, not a weakness.  Higher level interfaces (iostreams, 
buffer management etc...) can be added later as external components or 
extensions to the library itself.

The {datagram|stream}_socket interface is nice, but i have some issues 
with them.

First of all the name "socket" is too much connected with the BSD socket 
API.  I think that asio should go well beyond it and not limit itself to 
the system socket types (not that it does, but the names might seem to 
imply that). This is just a personal preference though.

Also i think there should be *no* asio::stream_socket. This is my major 
The current stream socket should be in the namespace asio::ipv4, and 
should be a different type for example from an eventual 
asio::posix_pipe::stream_socket or asio::unix::stream_socket. Especially 
the last can be currently trivially implemented by defining the 
appropriate protocol. This means that a stream_socket initialized with a 
  ipv4::tcp protocol will interoperate with a stream_socket initialized 
with a unix::stream protocol.  For example, currently i can use my 
hipotethical unix::stream with ipv4::resolver. Asio should not be type 
unsafe only because the BSD API is. Of course both should share the same 
code, but it should be an implementation detail.

I like the sync and asynch read/write interface and I have no problem 
with them being in the same object (i.e. no separation of asynch socket 
and sync socket). I agree that the proactor patter is the more portable 
than the reactor one, especially if asio is extended to filesystem i/o. 
I think the interface could be extended for better efficiency (i will 
try to describe later how), but this can be done at a later stage.

I don't really like a lot the service concept.  While I agree that the 
service repository is useful to hold the various proactor 
implementations (iocp_service, reactive_service<kqueue>, 
reactive_service<epoll>, etc...), i don't think that the streams should 
care about them. As long as the impl_type are interoperable, any stream 
should be bindable with any service.  This also means that the binding 
with the demuxer should *not* be done at creation time, but at open 
time.  This will make it possible to have different openers: connector, 
acceptor, unix::socket_pair, posix::pipe etc. The openers are instead 
created with a demuxer reference and will pass it to the socket when 
opened.  By the way, the implementation of socket functions (accept, 
connect, read, write, etc) should not be members of the demuxer service 
but free functions or, better, static members of a policy class.

The buffer concept does not really add much, simply using void pointers 
and size_t length parameters might require less conceptual overhead. 
They don't add much security (it should be the responsibility of higher 
layers along with buffer managements), and in practice don't help a lot 
with scatter/gather i/o: often you can't reuse an existing vector of 
iovecs, because you need to do an operation from/to the middle of it and 
thus you need to build a temporary one. As the iovector is usually small 
(16 elements may be?) and can be stack allocated, I think that having a 
global buffer type is a premature optimization.  Instead there should be 
a *per stream type*, because a custom stream might use a different 
iovector implementation that the system one ( io_vec or whatever). It 
should have at least push_back(void*, size_t) and size() members. 
Scatter gather operations should accept an iterator to first element of 
the operation (usually the first element of the vector).

The current interface let the user put a socket in non blocking mode, 
but there is not much that can be done with that, because no reactor is 
exported. The various reactors/proactor implementations should be 
removed from the detail namespace and be promoted to public interfaces, 
albeit in their own namespace (i.e. win32::iocp_proactor, 
posix::select_reactor, linux::epoll_reactor etc...).  This change will 
make the library a lot more useful.

The buffered stream is almost useless. Any operation requires two 
copies, one from kernel to user space, and one from the internal buffer 
to the user buffer.  The internal buffer should be unlimited in length 
(using some kind of deque) and accessible to eliminate copies. An 
interface for i/o that does not require copying would be generally 
usefull and not limited to buffered streams.

- What is your evaluation of the implementation?

Good. The code is very clean and readable. Often is more useful than the 
documentation.  There are inefficiencies caused by too much dynamic 
allocations, mutexes and unnecessary system calls, but they are not 
intrinsic in the desing and I see from other posting that they are 
currently being addresed by the author.

Probably there should be more code reuse. For example the 
asio::ssl::stream duplicates much of the code of asio::stream_base.

I think that the buffer function does not support vectors with custom 
allocators, but i might be missing something.

On some systems there is already an asynch resolver call (glibc provides 
  getaddrinfo_a).  Asio should use them instead of its own thread based 
emulation.  The glibc api still use threads, but a strictly single 
threaded implementation is theoretically possible.

The implemenation is header only. It makes the library easier to use, 
but network system headers are known to pollute the enviroment :)

- What is your evaluation of the documentation?

Insufficient. One of the best thing of boost is the quality of its 
documentation. I don't think that asio is on par with boost standards. 
While the examples are good and well described, the API documentation is 
not very useful. It really shows its autogenerated-ness :)

- What is your evaluation of the potential usefulness of the library?

Extremely useful. There have been lots of requests and proposal for a 
boost.net library. Asio is certanly the most promising.

- Did you try to use the library?

While I didn't use all of it (No timers nor SSL), as an experiment I did 
write a simple continuation library using asio::demuxer as a scheduler 
and using asio callback pattern to restart coroutines waiting for i/o. 
Asio's demuxer cleaness and callback guarantees made the implementation 
very straightforward.  If some one is interested i may upload the code 
somewhere. Currently it is posix only (it uses the makecontext family of 
system calls), but it should be fairly easy to implement win32 fiber 
support.

- How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?

In depth. I've read most of the documentation and studied the 
implementation, or at leas part of it.

- Are you knowledgeable about the problem domain?

Yes, a bit. I've written my own network library, although synchronous 
only.  I've studied the asynch problem in depth, but I've only 
theorethical knowledge of it. I did not add async support to my library 
  because I couldn't find a good async interface.

----

At this point the review if finished.  I'will continue with a wish list 
of the things i would like to see in asio, but don't necessarily belong 
to a 1.0 version:

- Asynchornous completion Tokens. While callbacks are useful most of the 
time, sometimes ACT are a bettere solution.  It would be nice if asio 
did let the user specify an act instead of a callback to async 
functions. Then the demuxer would act as a container of ACTs that could 
be enumerated.

- Lazy allocation of buffers. Instead of passing a buffer or a list of 
buffers to an async io function, a special allocator is passed. When the 
buffer is needed (before the async syscall on real proactors and before 
the non blocking call, but after polling  on emulated proactors), the 
allocator is called and it will return a buffer to be used for the 
operation.  The buffer is then passed to the handler.  It will eliminate 
the need to commit buffer memory for inactive connections.  This mostly 
makes sense for reading, but it might be extended to writing too. This 
optimization make it possible to add...

- Multishot calls.  Instead of the regular "exactly one callback for 
async_call", a client might register a callback for multiple wakeups. 
For example a async_multi_read_some will be called with a buffer 
allocator and a callback as parameter and will return a registration 
token. Every time some data is available from the socket, a buffer is 
allocated, data is read and a callback is done.  Note that the same 
callback is reused, so there is no need to allocate and copy multiple 
callbacks.  This optimization, for example, will greatly reduce the 
number of syscalls done in reactive demuxer.  The notification of 
interest to the system demuxer is only done once, then as long as the 
read syscall does not block, no other syscalls need to be done.
This will probably be useful mostly with datagram servers.
The same optimization can be done to async_accept.

- Relinquising ownership of buffers or lending them.  While the normal 
BSD api requires the copying of buffers anyway, there are streams that 
can be implemented without copying.  For example a process-local shared 
memory stream can be implemented by simply moving pointer to buffers 
between threads.  A buffered stream might take ownership of a supplied 
buffer and add it to its own buffer list.  Giving and taking ownership 
is not the only action possible, but a immutable buffer could be loaned 
to a stream. This is useful for caching.   For example, when reading 
from a file a cached file stream can be used. The stream will check in 
the cache if the file page is present, and if so, it will return a 
shared, reference counted, pointer to the page as an immutable buffer. 
As long as the caller only reads from the supplied buffer (for example 
to send it to a remote host), no copying is required.

I think that's all ... for now :).  Sorry for the long review.

---

Giovanni P. Deretta

Giovanni P. Deretta

Christopher Kohlhoff

Giovanni P. Deretta

Christopher Kohlhoff

Giovanni P. Deretta

Christopher Kohlhoff

Giovanni P. Deretta

Jose

Darryl Green

Rene Rivera

Giovanni P. Deretta

tags

participants (5)