
Hi everybody, I hope I have not missed the review deadline... First of all, I vote *not* to accept asio into Boost yet. I think that the library is very promising, but there are some issues. If the library is not accepted, I strongly encourage the author to submit it again once the issues are resolved.

- What is your evaluation of the design?

Promising. I think that the most important innovation of asio is the dispatcher concept and the async call pattern. Everything else is just *extra*. It probably should be separated in some way from asio and made a separate Boost library (or maybe just rename asio to something like "async lib" or whatever).

The interface provided by asio is very low level, but I think this is a strength of asio, not a weakness. Higher level interfaces (iostreams, buffer management, etc.) can be added later as external components or extensions to the library itself.

The {datagram|stream}_socket interface is nice, but I have some issues with it. First of all, the name "socket" is too closely tied to the BSD socket API. I think that asio should go well beyond it and not limit itself to the system socket types (not that it does, but the names might seem to imply that). This is just a personal preference though.

Also, I think there should be *no* asio::stream_socket. This is my major objection. The current stream socket should be in the namespace asio::ipv4, and should be a different type from, for example, an eventual asio::posix_pipe::stream_socket or asio::unix::stream_socket. Especially the last one could currently be implemented trivially by defining the appropriate protocol. But this means that a stream_socket initialized with an ipv4::tcp protocol will interoperate with a stream_socket initialized with a unix::stream protocol. For example, currently I can use my hypothetical unix::stream with ipv4::resolver. Asio should not be type-unsafe only because the BSD API is. Of course both should share the same code, but that should be an implementation detail.

I like the sync and async read/write interface, and I have no problem with them being in the same object (i.e. no separation of async socket and sync socket). I agree that the proactor pattern is more portable than the reactor one, especially if asio is extended to filesystem I/O. I think the interface could be extended for better efficiency (I will try to describe later how), but this can be done at a later stage.

I don't really like the service concept much. While I agree that the service repository is useful to hold the various proactor implementations (iocp_service, reactive_service<kqueue>, reactive_service<epoll>, etc.), I don't think that the streams should care about them. As long as the impl_types are interoperable, any stream should be bindable to any service. This also means that the binding with the demuxer should *not* be done at creation time, but at open time. This will make it possible to have different openers: connector, acceptor, unix::socket_pair, posix::pipe, etc. The openers are instead created with a demuxer reference and will pass it to the socket when opened. By the way, the implementation of socket functions (accept, connect, read, write, etc.) should not be members of the demuxer service but free functions or, better, static members of a policy class.

The buffer concept does not really add much; simply using void pointers and size_t length parameters might require less conceptual overhead.
They don't add much security (that should be the responsibility of higher layers, along with buffer management), and in practice they don't help a lot with scatter/gather I/O: often you can't reuse an existing vector of iovecs, because you need to do an operation from/to the middle of it and thus you need to build a temporary one. As the iovector is usually small (16 elements, maybe?) and can be stack allocated, I think that having a global buffer type is a premature optimization. Instead there should be one *per stream type*, because a custom stream might use a different iovector implementation than the system one (io_vec or whatever). It should have at least push_back(void*, size_t) and size() members. Scatter/gather operations should accept an iterator to the first element of the operation (usually the first element of the vector).

The current interface lets the user put a socket in non-blocking mode, but there is not much that can be done with that, because no reactor is exported. The various reactor/proactor implementations should be removed from the detail namespace and promoted to public interfaces, albeit in their own namespaces (i.e. win32::iocp_proactor, posix::select_reactor, linux::epoll_reactor, etc.). This change would make the library a lot more useful.

The buffered stream is almost useless. Any operation requires two copies, one from kernel to user space, and one from the internal buffer to the user buffer. The internal buffer should be unlimited in length (using some kind of deque) and accessible, to eliminate copies. An interface for I/O that does not require copying would be generally useful and not limited to buffered streams.

- What is your evaluation of the implementation?

Good. The code is very clean and readable. Often it is more useful than the documentation. There are inefficiencies caused by too many dynamic allocations, mutexes and unnecessary system calls, but they are not intrinsic to the design, and I see from other postings that they are currently being addressed by the author. Probably there should be more code reuse. For example, asio::ssl::stream duplicates much of the code of asio::stream_base. I think that the buffer function does not support vectors with custom allocators, but I might be missing something. On some systems there is already an async resolver call (glibc provides getaddrinfo_a). Asio should use these instead of its own thread based emulation. The glibc API still uses threads, but a strictly single threaded implementation is theoretically possible. The implementation is header only. That makes the library easier to use, but network system headers are known to pollute the environment :)

- What is your evaluation of the documentation?

Insufficient. One of the best things about Boost is the quality of its documentation. I don't think that asio is on par with Boost standards. While the examples are good and well described, the API documentation is not very useful. It really shows its autogenerated-ness :)

- What is your evaluation of the potential usefulness of the library?

Extremely useful. There have been lots of requests and proposals for a boost.net library. Asio is certainly the most promising.

- Did you try to use the library?

While I didn't use all of it (no timers nor SSL), as an experiment I did write a simple continuation library using asio::demuxer as a scheduler and using the asio callback pattern to restart coroutines waiting for I/O. Asio's demuxer cleanness and callback guarantees made the implementation very straightforward.
If someone is interested I may upload the code somewhere. Currently it is POSIX only (it uses the makecontext family of system calls), but it should be fairly easy to add win32 fiber support.

- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

In depth. I've read most of the documentation and studied the implementation, or at least part of it.

- Are you knowledgeable about the problem domain?

Yes, a bit. I've written my own network library, although synchronous only. I've studied the async problem in depth, but I have only theoretical knowledge of it. I did not add async support to my library because I couldn't find a good async interface.

----

At this point the review is finished. I will continue with a wish list of things I would like to see in asio, but that don't necessarily belong in a 1.0 version:

- Asynchronous Completion Tokens. While callbacks are useful most of the time, sometimes ACTs are a better solution. It would be nice if asio let the user specify an ACT instead of a callback to async functions. Then the demuxer would act as a container of ACTs that could be enumerated.

- Lazy allocation of buffers. Instead of passing a buffer or a list of buffers to an async I/O function, a special allocator is passed. When the buffer is needed (before the async syscall on real proactors, and before the non-blocking call but after polling on emulated proactors), the allocator is called and it will return a buffer to be used for the operation. The buffer is then passed to the handler. This would eliminate the need to commit buffer memory for inactive connections. It mostly makes sense for reading, but it might be extended to writing too. This optimization makes it possible to add...

- Multishot calls. Instead of the regular "exactly one callback per async call", a client might register a callback for multiple wakeups. For example, an async_multi_read_some would be called with a buffer allocator and a callback as parameters and would return a registration token. Every time some data is available from the socket, a buffer is allocated, data is read and a callback is made. Note that the same callback is reused, so there is no need to allocate and copy multiple callbacks. This optimization, for example, would greatly reduce the number of syscalls made in a reactive demuxer. The notification of interest to the system demuxer is only done once; then, as long as the read syscall does not block, no other syscalls need to be made. This will probably be useful mostly for datagram servers. The same optimization can be applied to async_accept.

- Relinquishing ownership of buffers, or lending them. While the normal BSD API requires copying buffers anyway, there are streams that can be implemented without copying. For example, a process-local shared memory stream can be implemented by simply moving pointers to buffers between threads. A buffered stream might take ownership of a supplied buffer and add it to its own buffer list. Giving and taking ownership are not the only possible actions; an immutable buffer could also be loaned to a stream. This is useful for caching. For example, when reading from a file, a cached file stream can be used. The stream will check in the cache whether the file page is present, and if so, it will return a shared, reference counted pointer to the page as an immutable buffer. As long as the caller only reads from the supplied buffer (for example to send it to a remote host), no copying is required.

I think that's all ... for now :).
Sorry for the long review. --- Giovanni P. Deretta

Hi Giovanni, --- "Giovanni P. Deretta" <gpderetta@gmail.com> wrote: <snip>
Promising. I think that the most important innovation of asio is the dispatcher concept and the async call pattern. Everything else is just *extra*. It probably should be separated in some way from asio and made a separate Boost library (or maybe just rename asio to something like "async lib" or whatever).
The dispatcher concept could certainly be reimplemented in a separate library without being coupled to I/O. <snip>
First of all, the name "socket" is too closely tied to the BSD socket API. I think that asio should go well beyond it and not limit itself to the system socket types (not that it does, but the names might seem to imply that). This is just a personal preference though.
I have deliberately retained the feel of the BSD socket API because it is widely supported by literature, even to the extent that it is followed in other programming languages' networking libraries. I'm going to reorder the next bits because I think there is a valuable idea here...
Also, I think there should be *no* asio::stream_socket. This is my major objection. The current stream socket should be in the namespace asio::ipv4, and should be a different type from, for example, an eventual asio::posix_pipe::stream_socket or asio::unix::stream_socket. Especially the last one could currently be implemented trivially by defining the appropriate protocol. But this means that a stream_socket initialized with an ipv4::tcp protocol will interoperate with a stream_socket initialized with a unix::stream protocol. For example, currently I can use my hypothetical unix::stream with ipv4::resolver. Asio should not be type-unsafe only because the BSD API is. Of course both should share the same code, but that should be an implementation detail. ... This also means that the binding with the demuxer should *not* be done at creation time, but at open time.
I don't know why I didn't think of this before! It's actually a small change to the interface overall, but I do believe it gives a net gain in usability.

Basically the protocol class can become a template parameter of basic_stream_socket. Then, for example, the asio::ipv4::tcp class would be changed to include a socket typedef:

  class tcp
  {
  public:
    ...
    class endpoint;
    typedef basic_stream_socket<tcp> socket;
  };

Then in user code you would write:

  asio::ipv4::tcp::socket sock;

Now any constructor that takes an io_service (the new name for demuxer) is an opening constructor. So in basic_stream_socket you would have:

  template <typename Protocol, ...>
  class basic_stream_socket
  {
    ...
    // Non-opening constructor.
    basic_stream_socket();

    // Opening constructor.
    explicit basic_stream_socket(io_service_type& io,
        const Protocol& protocol = Protocol());

    // Explicit open.
    void open(io_service_type& io, const Protocol& protocol = Protocol());
    ...
  };

This basic_stream_socket template would in fact be an implementation of a Socket concept. Why is this important? Because it improves portability by not assuming that a Protocol::socket type will actually be implemented using the platform's sockets API. Take the Bluetooth RFCOMM protocol for example:

  namespace bluetooth {
    class rfcomm
    {
      ...
      class endpoint;
      typedef implementation_defined socket;
    };
  } // namespace bluetooth

Here the socket type is implementation defined, because although on some platforms it can be implemented using BSD sockets (e.g. Windows XP SP2), on others it requires a different API or a third party stack.

The only wrinkle is in something like accepting a socket. At the moment the socket being accepted must not be open. That's fine, except that I think it is important to allow a different io_service (demuxer) to be specified for the new socket than the acceptor's, to allow partitioning of work across io_service objects. I suspect the best way to do this is to have overloads of the accept function:

  // Use same io_service as acceptor.
  acceptor.accept(new_socket);

  // Use separate io_service for new socket.
  acceptor.accept(new_socket, other_io_service);

An alternative is to require the socket to be opened before calling accept (this is what Windows does with AcceptEx), but I think that makes the common case less convenient. <snip>
By the way, the implementation of socket functions (accept, connect, read, write, etc) should not be members of the demuxer service but free functions or, better, static members of a policy class.
I do think these functions at the lowest level should be member functions rather than free functions, particularly because in the case of async functions it makes it clearer that the result will be delivered through the associated io_service (demuxer).
The buffer concept does not really add much; simply using void pointers and size_t length parameters might require less conceptual overhead. They don't add much security (that should be the responsibility of higher layers, along with buffer management), and in practice they don't help a lot with scatter/gather I/O: often you can't reuse an existing vector of iovecs, because you need to do an operation from/to the middle of it and thus you need to build a temporary one. As the iovector is usually small (16 elements, maybe?) and can be stack allocated, I think that having a global buffer type is a premature optimization.
I'm not sure I understand you here. Since these operations use the Mutable_Buffers and Const_Buffers concepts, you don't have to use std::vector, but can use boost::array instead to build the list of temporary buffers on the stack.
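For example, something along these lines (untested sketch, assuming a connected asio::stream_socket named sock and its read_some member) builds the buffer list entirely on the stack, with no heap allocation:

  // needs <boost/array.hpp>
  char header[16];
  char body[512];
  boost::array<asio::mutable_buffer, 2> bufs = { {
    asio::buffer(header, sizeof(header)),
    asio::buffer(body, sizeof(body))
  } };
  std::size_t n = sock.read_some(bufs); // scatter read: header first, then body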
Instead there should be one *per stream type*, because a custom stream might use a different iovector implementation than the system one (io_vec or whatever). It should have at least push_back(void*, size_t) and size() members.
I don't see that this buys anything. You still want to be able to specify the memory regions to be operated on in terms of void*/size_t. The buffer classes are equivalent to this, but with the sharp edges removed.
Scatter gather operations should accept an iterator to first element of the operation (usually the first element of the vector).
The Mutable_Buffers and Const_Buffers concepts are already a pair of iterators for this reason. I'll explain below, but in fact I think the buffers as used in asio already allow an implementation of lazy buffer allocation as you request.
The current interface lets the user put a socket in non-blocking mode, but there is not much that can be done with that, because no reactor is exported.
I think non-blocking mode can still be useful in asynchronous and synchronous designs, since it allows you to issue an operation opportunistically.
The various reactors/proactor implementations should be removed from the detail namespace and be promoted to public interfaces, albeit in their own namespace (i.e. win32::iocp_proactor, posix::select_reactor, linux::epoll_reactor etc...). This change will make the library a lot more useful.
Over time perhaps, but these are already undergoing changes as part of performance changes (without affecting the public interface). They are also secondary to the portable interface, so there are many costs associated with exposing them.
The buffered stream is almost useless. Any operation requires two copies, one from kernel to user space, and one from the internal buffer to the user buffer.
Well yes, but you're trading off the cost of extra copies for fewer system calls.
The internal buffer should be unlimited in length (using some kind of deque)
I don't think it is helpful to allow unlimited growth of buffers, especially with the possibility of denial of service attacks.
and accessible, to eliminate copies. An interface for I/O that does not require copying would be generally useful and not limited to buffered streams.
In the past I exposed the internal buffer, but removed it as I was unhappy with the way it was presented in the interface. I'm willing to put it back if there is a clean, safe way of exposing it. <snip>
I think that the buffer function does not support vectors with custom allocators, but I might be missing something.
Good point, I'll fix that. <snip>
While I didn't use all of it (no timers nor SSL), as an experiment I did write a simple continuation library using asio::demuxer as a scheduler and using the asio callback pattern to restart coroutines waiting for I/O. Asio's demuxer cleanness and callback guarantees made the implementation very straightforward. If someone is interested I may upload the code somewhere. Currently it is POSIX only (it uses the makecontext family of system calls), but it should be fairly easy to add win32 fiber support.
This sounds very interesting, especially if it was integrated with socket functions somehow so that it automatically yielded to another coroutine when an asynchronous operation was started. <snip>
- Lazy allocation of buffers. Instead of passing a buffer or a list of buffers to an async I/O function, a special allocator is passed. When the buffer is needed (before the async syscall on real proactors, and before the non-blocking call but after polling on emulated proactors), the allocator is called and it will return a buffer to be used for the operation. The buffer is then passed to the handler. This would eliminate the need to commit buffer memory for inactive connections. It mostly makes sense for reading, but it might be extended to writing too. This optimization makes it possible to add...
With a slight tightening of the use of the Mutable_Buffers concept, I think this is already possible :) The Mutable_Buffers concept is an iterator range, where the value_type is required to be a "mutable_buffer or be convertible to an instance of mutable_buffer". The key word here is "convertible". As far as I can see, all you need to do is write an implementation of the Mutable_Buffers concept using a container of some hypothetical lazy_mutable_buffer class. It only needs to provide a real buffer at the time when the value_type (lazy_mutable_buffer) is converted to mutable_buffer. Thanks very much for your comments! Cheers, Chris

It is quite late here, so I'll just write a quick reply on some points...
Basically the protocol class can become a template parameter of basic_stream_socket. Then, for example, the asio::ipv4::tcp class would be changed to include a socket typedef:
  class tcp
  {
  public:
    ...
    class endpoint;
    typedef basic_stream_socket<tcp> socket;
  };
Then in user code you would write:
asio::ipv4::tcp::socket sock;
Now any constructor that takes an io_service (the new name for demuxer) is an opening constructor. So in basic_stream_socket you would have:
  template <typename Protocol, ...>
  class basic_stream_socket
  {
    ...
    // Non-opening constructor.
    basic_stream_socket();

    // Opening constructor.
    explicit basic_stream_socket(io_service_type& io,
        const Protocol& protocol = Protocol());

    // Explicit open.
    void open(io_service_type& io, const Protocol& protocol = Protocol());
    ...
  };
This basic_stream_socket template would in fact be an implementation of a Socket concept. Why is this important? Because it improves portability by not assuming that a Protocol::socket type will actually be implemented using the platform's sockets API. Take the Bluetooth RFCOMM protocol for example :
[...] This change is fundamental I think: basic_stream_socket should be a template that can be used generically to implement all types of protocols, simply by defining the protocol policy. It is more or less what I do in my network lib.
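For example, with your proposed typedefs the kind of code I would like to write looks like this (everything except ipv4::tcp is a name I just made up for illustration):

  asio::io_service io;

  // from your sketch: the protocol provides its own socket type
  asio::ipv4::tcp::socket tcp_sock(io);

  // a hypothetical unix-domain protocol would get its own distinct type
  asio::local_stream::socket unix_sock(io);   // name invented for illustration

  // an ipv4::tcp acceptor or resolver could then only accept tcp_sock;
  // passing unix_sock to it would fail to compile instead of compiling
  // and misbehaving at runtime.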
The only wrinkle is in something like accepting a socket. At the moment the socket being accepted must not be open. That's fine, except that I think it is important to allow a different io_service (demuxer) to be specified for the new socket than the acceptor's, to allow partitioning of work across io_service objects. I suspect the best way to do this is to have overloads of the accept function:
// Use same io_service as acceptor. acceptor.accept(new_socket);
// Use separate io_service for new socket. acceptor.accept(new_socket, other_io_service);
An alternative is to require the socket to be opened before calling accept (this is what Windows does with AcceptEx) but I think that makes the common case less convenient.
Just an idea: in the Windows case the socket_impl to be accepted might not actually be stored in the socket object, but kept inside an internal cache in the acceptor or in the io_service. When accept is called, an already opened socket_impl is assigned to the socket object. When the socket object is closed, its socket_impl is returned to the cache instead. I don't really know much about Winsock, so I can't really say how feasible this is. [...]
By the way, the implementation of socket functions (accept, connect, read, write, etc) should not be members of the demuxer service but free functions or, better, static members of a policy class.
I do think these functions at the lowest level should be member functions rather than free functions, particularly because in the case of async functions it makes it clearer that the result will be delivered through the associated io_service (demuxer).
Maybe I wasn't very clear. I'm not saying that the write_some/read_some functions should be free functions (although that wouldn't be that bad); I'm saying that these functions, instead of forwarding to non-static member functions of the underlying service, should forward to static member functions of the protocol class. This would decouple the demuxer from the I/O functions. [...]
The current interface lets the user put a socket in non-blocking mode, but there is not much that can be done with that, because no reactor is exported.
I think non-blocking mode can still be useful in asynchronous and synchronous designs, since it allows you to issue an operation opportunistically.
Of course non-blocking operations are useful, but as there is no public readiness notification interface (i.e. a reactor), its use is somewhat complicated.
The various reactors/proactor implementations should be removed from the detail namespace and be promoted to public interfaces, albeit in their own namespace (i.e. win32::iocp_proactor, posix::select_reactor, linux::epoll_reactor etc...). This change will make the library a lot more useful.
Over time perhaps, but these are already undergoing changes as part of performance changes (without affecting the public interface). They are also secondary to the portable interface, so there are many costs associated with exposing them.
Sure, I expect this change to be made in future versions of the library, once both the interface and the implementation stabilize; it is not immediately needed.
The buffered stream is almost useless. Any operation requires two copies, one from kernel to user space, and one from the internal buffer to the user buffer.
Well yes, but you're trading off the cost of extra copies for fewer system calls.
The internal buffer should be unlimited in length (using some kind of deque)
I don't think it is helpful to allow unlimited growth of buffers, especially with the possibility of denial of service attacks.
Well, unlimited growth should be possible in principle, but the buffered stream might have some way to set the maximum length.
and accessible, to eliminate copies. An interface for I/O that does not require copying would be generally useful and not limited to buffered streams.
In the past I exposed the internal buffer, but removed it as I was unhappy with the way it was presented in the interface. I'm willing to put it back if there is a clean, safe way of exposing it.
What about parametrizing the buffered_stream with a container type, and providing an accessor to this container? The container buffer can then be swap()ed, splice()ed, reset()ed, fed to algorithms, and much more without any copying, while still preserving the stream interface. Instead of a buffered stream you can think of it as a stream adaptor for containers. I happily used one in my library, and it really simplifies code, along with a deque that provides segmented iterators over the contiguous buffers.
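Roughly what I have in mind, interface-wise (hypothetical names, untested):

  // buffered stream parametrized on the container type
  buffered_stream<asio::stream_socket, std::deque<char> > in(sock);
  in.fill();                      // underflow: append whatever has arrived
  std::deque<char> data;
  data.swap(in.rdbuf());          // take the buffered bytes, no copying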
While I didn't use all of it (no timers nor SSL), as an experiment I did write a simple continuation library using asio::demuxer as a scheduler and using the asio callback pattern to restart coroutines waiting for I/O. Asio's demuxer cleanness and callback guarantees made the implementation very straightforward. If someone is interested I may upload the code somewhere. Currently it is POSIX only (it uses the makecontext family of system calls), but it should be fairly easy to add win32 fiber support.
This sounds very interesting, especially if it was integrated with socket functions somehow so that it automatically yielded to another coroutine when an asynchronous operation was started.
Actually you don't want to integrate it with the socket functions. Just because you have a stack does not mean you want to limit yourself to sync functions. For example, here is the code of a forwarding continuation; it reads from one stream and writes to another:

  void forwarder(int counter, asio::stream_socket& sink,
                 asio::stream_socket& source)
  {
    ....
    condition_node main_loop(scheduler, continuation::wait_any);
    const size_t token_size = /* some size */;
    char token[token_size];
    boost::optional<asio::error> read_error;
    boost::optional<asio::error> write_error;
    std::size_t write_size = 0;
    std::size_t read_size = 0;
    while(counter) {
      if(write_error) {
        if(*write_error) { break; }
        write_error = error_type();
        boost::asio::async_write(sink,
            boost::asio::buffer(token, token_size),
            scheduler.current_continuation(main_loop.leaf(),
                                           write_error, write_size));
        counter--;
      }
      if(read_error) {
        if(*read_error) { break; }
        read_error = error_type();
        boost::asio::async_read(source,
            boost::asio::buffer(token, token_size),
            scheduler.current_continuation(main_loop.leaf(),
                                           read_error, read_size));
      }
      main_loop.wait();
    }
    sink.close();
    main_loop.join_all();
    continuation::exit();
  }

I'm using a variant to hold the error code, so I can see whether an async call has finished (btw, asio works very well with error variants and this pattern looks promising). Scheduler is a global object (but it need not be so, it is just for simplicity) that is simply a list of ready coroutines. I could just use the demuxer as scheduler, but then I would need some extra context switches. Using two schedulers is better. current_continuation returns a functor that, when called, will signal a condition object. The condition object is created by a condition node and is linked to it. When a condition object is signaled, it signals its parent condition node. What the node does depends on its current mode. If it is in wait_any mode, it will immediately queue the current continuation at the end of the scheduler queue. If it is in wait_all mode, it will queue the continuation only if all children are signaled. Multiple nodes can be nested and a complex tree can be built. Calling wait() on any node (or even on a leaf) will remove the coroutine from the ready list and run the next ready coroutine. By themselves condition objects do not hold any state; it is the combination of a condition and a variant that makes them very powerful. You might think of them as futures. It might look somewhat complicated, but the implementation is straightforward, about a thousand lines of code.

[follows some rant of mine about lazy allocation of buffers...]
With a slight tightening of the use of the Mutable_Buffers concept, I think this is already possible :)
The Mutable_Buffers concept is an iterator range, where the value_type is required to be a "mutable_buffer or be convertible to an instance of mutable_buffer". The key word here is "convertible". As far as I can see, all you need to do is write an implementation of the Mutable_Buffers concept using a container of some hypothetical lazy_mutable_buffer class. It only needs to provide a real buffer at the time when the value_type (lazy_mutable_buffer) is converted to mutable_buffer.
Hmm, you are almost certainly right; I will experiment with it one of these days, perhaps with something along the lines of the sketch below. Btw, my issues with the mutable buffers were because I misunderstood them as wrappers around an io_vec (I should have read the implementation). I missed the term *concept*. Now I have changed my mind and I think it is very powerful. In my library I've used iterators to delimit I/O ranges, but as two different parameters. Wrapping them in a pair or a range makes them much more useful and extensible... I'm already thinking of possible extensions... shared_buffers, gift_buffers (I need a better name for the last one) and more. Btw, did you consider my proposal for multishot calls?
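Something like this, maybe (completely untested, all names are mine; the storage is committed only when the implementation converts the value to a mutable_buffer):

  #include <cstddef>
  #include <boost/shared_array.hpp>

  class lazy_mutable_buffer
  {
  public:
    explicit lazy_mutable_buffer(std::size_t size) : size_(size) {}

    // Mutable_Buffers only requires the value_type to be convertible to
    // mutable_buffer, so the allocation can be deferred until here.
    operator asio::mutable_buffer() const
    {
      if (!storage_)
        storage_.reset(new char[size_]);
      return asio::mutable_buffer(storage_.get(), size_);
    }

  private:
    std::size_t size_;
    mutable boost::shared_array<char> storage_;
  };

  // a boost::array<lazy_mutable_buffer, 1> (or a std::vector of them)
  // would then model the Mutable_Buffers concept.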
Thanks very much for your comments!
You are very welcome, I hope I've been helpful. --- Giovanni P. Deretta

Hi Giovanni, --- "Giovanni P. Deretta" <gpderetta@gmail.com> wrote:
This change is fundamental I think: basic_stream_socket should be a template that can be used generically to implement all types of protocols, simply by defining the protocol policy. It is more or less what I do in my network lib.
I believe there might be more similarity there already than you think, so I'm going to do a bit of experimentation and see what comes out of it. However the dual support for IPv4 and IPv6 raised by Jeff might be a bit of a problem -- is it something you address in your network lib? <snip>
Maybe I wasn't very clear. I'm not saying that the write_some/read_some functions should be free functions (although that wouldn't be that bad); I'm saying that these functions, instead of forwarding to non-static member functions of the underlying service, should forward to static member functions of the protocol class. This would decouple the demuxer from the I/O functions.
I see what you mean now, however I don't think they can be portably decoupled. Some platforms will require access to shared state in order to perform the operation. The acceptor socket caching you mentioned is also a case for having access to this. I suspect that, if I adopt a type-per-protocol model, the service will also be associated with the protocol in some way (i.e. the protocol might be a template parameter of the service class). <snip>
Of course non-blocking operations are useful, but as there is no public readiness notification interface (i.e. a reactor), its use is somewhat complicated.
What I mean is that the readiness notification isn't required, since one way of interpreting an asynchronous operation is "perform this operation when the socket is ready". That is, it corresponds to the non-blocking operation that you would have made when notified that a socket was ready. A non-blocking operation can then be used for an immediate follow-up operation, if desired. <snip>
What about parametrizing the buffered_stream with a container type, and providing an accessor to this container? The container buffer can then be swap()ed, splice()ed, reset()ed, fed to algorithms, and much more without any copying, while still preserving the stream interface. Instead of a buffered stream you can think of it as a stream adaptor for containers. I happily used one in my library, and it really simplifies code, along with a deque that provides segmented iterators over the contiguous buffers.
I think a separate stream adapter for containers sounds like a good plan. BTW, is there a safe way to read data directly into a deque? Or do you mean that the deque contains multiple buffer objects?
Actually you don't want to integrate it with the socket functions. Just because you have a stack does not mean you want to limit yourself to sync functions. For example, here is the code of a forwarding continuation; it reads from one stream and writes to another: <snip> It might look somewhat complicated, but the implementation is straightforward, about a thousand lines of code.
Wow, neat. <snip>
I'm already thinking of possible extensions... shared_buffers, gift_buffers (I need a better name for the last one) and more.
I take it that by "shared_buffers" you mean reference-counted buffers? If so, one change I'm considering is to add a guarantee that a copy of the Mutable_Buffers or Const_Buffers object will be made and kept until an asynchronous operation completes. At the moment a copy is only kept until it is used (which for Win32 is when the overlapped I/O operation is started, not when it ends). However, this may make using a std::vector<> or std::list<> of buffers too inefficient, since a copy must be made of the entire vector or list object. I will have to do some measurements before making a decision, but it may be that supporting reference-counted buffers is a compelling enough reason.
Btw, did you consider my proposal for multishot calls?
I have now :) I think that what amounts to the same thing can be implemented as an adapter on top of the existing classes. It would work in conjunction with the new custom memory allocation interface to reuse the same memory. In a way it would be like a simplified interface to custom memory allocation, specifically for recurring operations. I'll add it to my list of things to investigate. Cheers, Chris

Christopher Kohlhoff wrote:
I believe there might be more similarity there already than you think, so I'm going to do a bit of experimentation and see what comes out of it. However the dual support for IPv4 and IPv6 raised by Jeff might be a bit of a problem -- is it something you address in your network lib?
(looks at old code......) Yes, more or less. That is, the internal implementation has support for IPv6, but it is not exported in the public interface; it is just a matter of instantiating a stream_template<inet::ipv6>. Unfortunately I didn't test it (I have no IPv6 experience), but reading the Stevens book and SUSv3, it seems that IPv6 sockets are backward compatible with IPv4 (i.e. IPv6 address resolvers can take IPv4 addresses, and IPv6 sockets can connect to and accept IPv4 streams). I think that the cleaner interface would be to have an ip::stream that is instantiated as an ipv6::stream if there is system support, or as an ipv4::stream if there is none. ipv4::stream and ipv6::stream should still be available if the user explicitly needs them (i.e. no compatibility), but the default should be to use ip::stream.
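Something along these lines, I mean (hypothetical names and feature macro, just to show the selection):

  namespace asio { namespace ip {
  #if defined(ASIO_HAS_IPV6)            // hypothetical feature-detection macro
    typedef asio::ipv6::stream stream;  // backward compatible, accepts v4 too
  #else
    typedef asio::ipv4::stream stream;
  #endif
  } } // namespace asio::ip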
I see what you mean now, however I don't think they can be portably decoupled. Some platforms will require access to shared state in order to perform the operation. The acceptor socket caching you mentioned is also a case for having access to this.
Sometimes you need to make some hard decisions. Are there any real-life protocols/platforms that need shared state? If yes, then member functions are fine. If not, or only some theoretical obscure protocol needs it, it should not impact the generic interface. For those obscure/rarely used protocols/platforms a hidden singleton could be used. Anyway, even if you keep the shared state solution, I think that the proactor should be decoupled from the protocol implementation. Using your current model, the protocol_service and the proactor_service should be two different objects. In fact, multiple protocols must be able to reuse the same proactor.
I suspect that, if I adopt a type-per-protocol model, the service will also be associated with the protocol in some way (i.e. the protocol might be a template parameter of the service class).
Seems good... it is probably the same thing I proposed in the previous paragraph.
<snip>
Of course non-blocking operations are useful, but as there is no public readiness notification interface (i.e. a reactor), its use is somewhat complicated.
What I mean is that the readiness notification isn't required, since one way of interpreting an asynchronous operation is "perform this operation when the socket is ready". That is, it corresponds to the non-blocking operation that you would have made when notified that a socket was ready. A non-blocking operation can then be used for an immediate follow-up operation, if desired.
<snip>
What about parametrizing the buffered_stream with a container type, and providing an accessor to this container? The container buffer can then be swap()ed, splice()ed, reset()ed, fed to algorithms, and much more without any copying, while still preserving the stream interface. Instead of a buffered stream you can think of it as a stream adaptor to for containers. I happily used one in my library, and it really simplifies code, along with a deque that provides segmented iterators to the contiguous buffers.
I think a separate stream adapter for containers sounds like a good plan.
I have to add that my buffered_adapter is actually more than a stream adapter for containers, because it has an associated stream, and can bypass the buffer if the read or write request is big enough. Also the adapter has underflow() (you call it fill in your buffered_stream) and flush().
BTW, is there a safe way to read data directly into a deque? Or do you mean that the deque contains multiple buffer objects?
No, there is no portable way to read data directly into a std::deque<char>. But I did not use std::deque; I've made my own deque with segmented iterators and no default construction of PODs. Actually it only works with PODs right now, but it is fairly complete; I've even added versions of some standard algorithms with support for segmented iterators. It should not be too hard to add non-POD support (not that a net lib really needs it...).
I'm already thinking of possible extensions... shared_buffers, gift_buffers (I need a better name for the last one) and more.
I take it that by "shared_buffers" you mean reference-counted buffers?
Yes, exactly.
If so, one change I'm considering is to add a guarantee that a copy of the Mutable_Buffers or Const_Buffers object will be made and kept until an asynchronous operation completes. At the moment a copy is only kept until it is used (which for Win32 is when the overlapped I/O operation is started, not when it ends).
Hmm, nice, but if you want to forward a buffer to another thread, for example, you want to forward the complete type (to preserve the counter and the acquire()/release() infrastructure). I think that it should be possible to implement streams that, in addition to generic buffer objects, accept specialized per-stream buffer types and guarantee special treatment for them. For example, a write() on an in-process shared memory stream would copy the buffer if a generic buffer is passed, but would give special treatment if a special shared buffer is passed. I have in mind only shared memory streams for now, but think about a network transport implemented completely in user space (with direct access to the network card): it is theoretically possible to DMA directly from user buffers to the card buffer, but it might only be possible from some specially aligned memory.

Case in point: the Linux aio implementation requires that a file is opened with O_DIRECT. In turn, O_DIRECT requires that the supplied buffer is aligned to 512 byte boundaries (or the filesystem block size on 2.4). This means that an asio based async disk I/O subsystem would require its buffers to be specially allocated (or fall back to doing an extra copy). This requirement can easily be met if a hypothetical asio::fs_stream has a direct_buffer_allocator typedef. The allocator would return objects of type direct_buffer, and fs_stream.async_{read|write}_some would be overloaded to explicitly support these buffers. If a direct_buffer is used, fs_stream will use native Linux aio. If a generic buffer is used, fs_stream should *not* use Linux aio, not even with an internal properly aligned bounce buffer, because O_DIRECT bypasses the system caches, so it should be used only if the user explicitly requests it by using direct_buffers. The fallback should probably use worker threads. Btw, future versions of Linux aio will almost certainly support non-direct async I/O. Still, the O_DIRECT mode will probably be fast-pathed.

In the end this boils down to passing the exact buffer type to the lower levels of the asio implementation.
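For illustration, the kind of allocation such a direct_buffer would need (hypothetical helper, error handling omitted; only the alignment requirement matters here):

  #include <stdlib.h> /* posix_memalign */

  /* O_DIRECT on Linux wants the user buffer (and the transfer size)
     aligned to 512 bytes, or to the filesystem block size on 2.4 */
  void* allocate_direct_buffer(size_t size)
  {
    void* p = 0;
    if (posix_memalign(&p, 512, size) != 0)
      return 0;
    return p;
  }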
However, this may make using a std::vector<> or std::list<> of buffers too inefficient, since a copy must be made of the entire vector or list object. I will have to do some measurements before making a decision, but it may be that supporting reference-counted buffers is a compelling enough reason.
Usually the vector is small, and probably boost::array is a better fit. In the latter case, the buffer list is cached (as it is on the stack), is very small (less than 20 elements) and takes very little time to copy. In the case of vectors, if move semantics are available (because they are emulated by the standard library, as the next libstdc++ does, or because of future language developments), no copy is needed.
Btw, did you consider my proposal for multishot calls?
I have now :)
I think that what amounts to the same thing can be implemented as an adapter on top of the existing classes. It would work in conjunction with the new custom memory allocation interface to reuse the same memory. In a way it would be like a simplified interface to custom memory allocation, specifically for recurring operations. I'll add it to my list of things to investigate.
I don't think it is worth doing at higher levels. Multishot calls are inconvenient because you lose the one call -> one callback guarantee. I proposed adding them because they can open many optimization opportunities at lower levels (fewer calls to allocate the callback, maybe better cache locality of the callback data, and fewer syscalls to register readiness notification interest). Ah, btw, happy new year :) --- Giovanni P. Deretta ---

Hi Giovanni, --- "Giovanni P. Deretta" <gpderetta@gmail.com> wrote:
Sometimes you need to make some hard decisions. Are there any real-life protocols/platforms that need shared state? If yes, then member functions are fine.
An example might be an operating system where the I/O is implemented by a separate process, and communication with this process takes place over some sort of IPC channel. Symbian is an example of an operating system that uses this approach. I want asio's public interface to support an implementation on systems that use this mechanism.
If not, or only some theoretical obscure protocol needs it, it should not impact the generic interface. For those obscure/rarely used protocol/platforms a hidden singleton could be used.
Symbian, as it happens, doesn't support static or global data, so singletons are out :) <snip>
Hmm, nice, but if you want to forward a buffer to another thread, for example, you want to forward the complete type (to preserve the counter and the acquire()/release() infrastructure).
Yep, this is already possible since the caller's Mutable_Buffers (or Const_Buffers) type is preserved down into the implementation.
I think that it should be possible to implement streams that, in addition to generic buffer objects, accept specialized per-stream buffer types and guarantee special treatment for them. For example, a write() on an in-process shared memory stream would copy the buffer if a generic buffer is passed, but would give special treatment if a special shared buffer is passed. <snip> In the end this boils down to passing the exact buffer type to the lower levels of the asio implementation.
Which I do already :) I think special cases (such as aligned or shared-memory buffers) could be handled by implementation overloads (note: not function overloads in the public interface) based on the type of Mutable_Buffers::value_type or Const_Buffers::value_type. That way individual streams can be optimised for specific buffer types without propagating additional overloads back through to the interface. Cheers, Chris

Christopher Kohlhoff wrote: [about passing the buffer type to the lower layers]
Which I do already :)
Very good, I should have read the implementation more carefully :)
I think special cases (such as aligned or shared-memory buffers) could be handled by implementation overloads (note: not function overloads in the public interface) based on the type of Mutable_Buffers::value_type or Const_Buffers::value_type.
That way individual streams can be optimised for specific buffer types without propagating additional overloads back through to the interface.
Well, I certainly didn't intend that the public interface be specialized for all kinds of buffers. I think that the way asio currently works is fine. --- Giovanni P. Deretta

On 12/30/05, Giovanni P. Deretta <gpderetta@gmail.com> wrote:
exported. The various reactors/proactor implementations should be removed from the detail namespace and be promoted to public interfaces, albeit in their own namespace (i.e. win32::iocp_proactor, posix::select_reactor, linux::epoll_reactor etc...). This change will make the library a lot more useful.
I agree. There was at least one other reviewer asking for this. I think Chris has stated he won't do it, but I haven't seen the reasoning.
On some systems there is already an asynch resolver call (glibc provides getaddrinfo_a). Asio should use them instead of its own thread based emulation. The glibc api still use threads, but a strictly single threaded implementation is theoretically possible.
I've used getaddrinfo_a in a single user thread. Note you have to use polling because there is a bug with the signal interface: http://sources.redhat.com/ml/libc-alpha/2005-01/msg00049.html
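Roughly like this, from memory (untested sketch; needs _GNU_SOURCE and linking with -lanl):

  #define _GNU_SOURCE
  #include <netdb.h>
  #include <string.h>
  #include <unistd.h>

  int main()
  {
    struct gaicb req;
    struct gaicb* list[1] = { &req };
    memset(&req, 0, sizeof(req));
    req.ar_name = "www.boost.org";

    getaddrinfo_a(GAI_NOWAIT, list, 1, 0); /* no sigevent, we poll instead */

    while (gai_error(&req) == EAI_INPROGRESS)
      usleep(10000); /* do other useful work here */

    if (gai_error(&req) == 0) {
      struct addrinfo* res = req.ar_result;
      /* ... use res, then freeaddrinfo(res) ... */
    }
    return 0;
  }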
While I didn't use all of it (no timers nor SSL), as an experiment I did write a simple continuation library using asio::demuxer as a scheduler and using the asio callback pattern to restart coroutines waiting for I/O. Asio's demuxer cleanness and callback guarantees made the implementation very straightforward. If someone is interested I may upload the code somewhere.
Please upload it. I found your code very useful for learning advanced C++.
Yes, a bit. I've written my own network library, although synchronous only. I've studied the async problem in depth, but I have only theoretical knowledge of it. I did not add async support to my library because I couldn't find a good async interface.
Just for educational purposes, what do you mean by "I couldn't find a good async interface", and what are the key ideas asio has that provide a good async interface? I think your advanced knowledge of the topic provides key insights for making asio truly useful. I enjoyed reading your review! Thanks

Giovanni P. Deretta wrote:
I think that the most important innovation of asio is the dispatcher concept and the async call pattern. Everything else is just *extra*.

Yes.
The {datagram|stream}_socket interface is nice, but I have some issues with it.
Also I think there should be *no* asio::stream_socket. This is my major objection. The current stream socket should be in the namespace asio::ipv4, and should be a different type from, for example, an eventual asio::posix_pipe::stream_socket or asio::unix::stream_socket.

Yes.

I think the interface could be extended for better efficiency (I will try to describe later how), but this can be done at a later stage.

Yes.

I don't really like the service concept much. While I agree that the service repository is useful to hold the various proactor implementations (iocp_service, reactive_service<kqueue>, reactive_service<epoll>, etc.), I don't think that the streams should care about them. As long as the impl_types are interoperable, any stream should be bindable to any service. This also means that the binding with the demuxer should *not* be done at creation time, but at open time. This will make it possible to have different openers: connector, acceptor, unix::socket_pair, posix::pipe, etc. The openers are instead created with a demuxer reference and will pass it to the socket when opened. By the way, the implementation of socket functions (accept, connect, read, write, etc.) should not be members of the demuxer service but free functions or, better, static members of a policy class.

Yes - I was trying to figure out how to fit that into the current design; sounds like a good approach.

The current interface lets the user put a socket in non-blocking mode, but there is not much that can be done with that, because no reactor is exported. The various reactor/proactor implementations should be removed from the detail namespace and promoted to public interfaces, albeit in their own namespaces (i.e. win32::iocp_proactor, posix::select_reactor, linux::epoll_reactor, etc.). This change would make the library a lot more useful.

Yes.
I just wanted to endorse this in the hope that your slant on it makes more sense than mine. I think (haven't had a proper look at it to be sure) your suggestions address my major concerns with the library. Regards, Darryl Green.

Darryl Green wrote:
Giovanni P. Deretta wrote:
exported. The various reactors/proactor implementations should be removed from the detail namespace and be promoted to public interfaces, albeit in their own namespace (i.e. win32::iocp_proactor, posix::select_reactor, linux::epoll_reactor etc...). This change will make the library a lot more useful.
Yes.
I just wanted to endorse this in the hope that your slant on it makes more sense than mine. I think (haven't had a proper look at it to be sure) your suggestions address my major concerns with the library.
The changes Giovanni mentions are what I was expecting in the first place :-) So I would also like to see such an arrangement. Perhaps I should go look at Giovanni's library even if it is only sync. -- -- Grafik - Don't Assume Anything -- Redshift Software, Inc. - http://redshift-software.com -- rrivera/acm.org - grafik/redshift-software.com -- 102708583/icq - grafikrobot/aim - Grafik/jabber.org

Rene Rivera wrote:
The changes Giovanni mentions are what I was expecting in the first place :-) So I would also like to see such an arrangement. Perhaps I should go look at Giovanni's library even if it is only sync.
It is available at http://sourceforge.net/projects/libstream (the homepage is libstream.sf.net). The available download is old and will not compile on recent versions of gcc; you should download the CVS version. On the homepage you will find a usage example and a link to a paper describing the library. Unfortunately there is no other documentation available (I've never gotten around to writing it). Some documentation can be generated with doxygen, but it is highly incomplete. If you want to take a look at some examples, look inside lib/stream/test. I've not updated the library recently, but in the next two months I will probably make some changes (because of a project I need to finish that uses the lib). I will probably end up using boost::asio for the low level stuff. The biggest difference from asio is that I do not encourage developers to use the {read|write}_some functions (even though they are available); instead I provide input and output iterators and buffered streams that can be used with my versions of the standard algorithms. Many optimizations are not yet implemented though (like input and output iterators with multiple-objects-per-assignment support). Hope it is useful. PS: happy new year to everybody! --- Giovanni P. Deretta
participants (5):
- Christopher Kohlhoff
- Darryl Green
- Giovanni P. Deretta
- Jose
- Rene Rivera