
Christopher Kohlhoff wrote:
I believe there might be more similarity there already than you think, so I'm going to do a bit of experimentation and see what comes out of it. However the dual support for IPv4 and IPv6 raised by Jeff might be a bit of a problem -- is it something you address in your network lib?
(Looks at old code...) Yes, more or less. The internal implementation has support for IPv6, but it is not exported in the public interface; it is just a matter of instantiating a stream_template<inet::ipv6>. Unfortunately I didn't test it (I have no IPv6 experience), but from the Stevens book and from SUSv3 it seems that IPv6 sockets are backward compatible with IPv4 (i.e. IPv6 address resolvers can take IPv4 addresses, and IPv6 sockets can connect to and accept IPv4 streams). I think the cleaner interface would be an ip::stream that is instantiated as an ipv6::stream if there is system support, or as an ipv4::stream if there is none. ipv4::stream and ipv6::stream should still be available if the user explicitly needs them (i.e. no compatibility), but the default should be to use ip::stream.
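Something along these lines (plain BSD sockets only, to illustrate the backward compatibility; this is not part of either library):

    // Sketch only: a dual-stack listener using plain BSD sockets. An
    // AF_INET6 socket with IPV6_V6ONLY cleared also accepts IPv4
    // connections, which show up as IPv4-mapped addresses (::ffff:a.b.c.d).
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <string.h>

    int make_dual_stack_listener(unsigned short port)
    {
        int fd = socket(AF_INET6, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        // Allow IPv4 clients on the same socket.
        int off = 0;
        setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off));

        sockaddr_in6 addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin6_family = AF_INET6;
        addr.sin6_addr = in6addr_any;
        addr.sin6_port = htons(port);

        if (bind(fd, (sockaddr*)&addr, sizeof(addr)) < 0 ||
            listen(fd, SOMAXCONN) < 0)
            return -1;

        return fd;
    }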
I see what you mean now; however, I don't think they can be portably decoupled. Some platforms will require access to shared state in order to perform the operation. The acceptor socket caching you mentioned is also a case for having access to this.
Sometimes you need to make some hard decisions. Are there any real-life protocols/platforms that need shared state? If so, then member functions are fine. If not, or if only some theoretical, obscure protocol needs it, it should not impact the generic interface; for those obscure or rarely used protocols/platforms a hidden singleton could be used. In any case, even if you keep the shared-state solution, I think that the proactor should be decoupled from the protocol implementation. Using your current model, the protocol_service and the proactor_service should be two different objects. In fact, multiple protocols must be able to reuse the same proactor.
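Roughly what I have in mind (the names are made up; they only show the intended structure, not any existing interface):

    // Hypothetical sketch: one proactor, shared by any number of
    // protocol services. Protocol-specific shared state lives in the
    // protocol service, completion dispatching in the proactor.
    class proactor_service
    {
    public:
        void run() { /* dispatch queued completions */ }
    };

    template <typename Protocol>
    class protocol_service
    {
    public:
        explicit protocol_service(proactor_service& p) : proactor_(p) {}
        // e.g. an acceptor socket cache would live here
    private:
        proactor_service& proactor_;
    };

    // Several protocols reusing the same proactor:
    //   proactor_service proactor;
    //   protocol_service<ipv4_tcp>     tcp(proactor);
    //   protocol_service<local_stream> local(proactor);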
I suspect that, if I adopt a type-per-protocol model, the service will also be associated with the protocol in some way (i.e. the protocol might be a template parameter of the service class).
Seems good... it is probably the same thing I proposed in the previous paragraph.
<snip>
Of course non-blocking operations are useful, but as there is no public readiness-notification interface (i.e. a reactor), their use is somewhat complicated.
What I mean is that the readiness notification isn't required, since one way of interpreting an asynchronous operation is "perform this operation when the socket is ready". That is, it corresponds to the non-blocking operation that you would have made when notified that a socket was ready. A non-blocking operation can then be used for an immediate follow-up operation, if desired.
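For example (written with present-day Boost.Asio names, which postdate this discussion, so treat it only as a sketch of the idea):

    // The async operation acts as the readiness notification; non-blocking
    // reads then drain whatever extra data is immediately available.
    #include <boost/asio.hpp>
    #include <array>
    #include <cstddef>

    std::array<char, 4096> data;

    void consume(const char*, std::size_t) { /* process the bytes */ }

    void on_read(boost::asio::ip::tcp::socket& sock,
                 const boost::system::error_code& ec, std::size_t n)
    {
        if (ec) return;
        consume(data.data(), n);

        // Immediate non-blocking follow-up: keep reading until the
        // socket would block, i.e. until already-buffered data is gone.
        boost::system::error_code follow_ec;
        sock.non_blocking(true);
        for (;;)
        {
            std::size_t m = sock.read_some(boost::asio::buffer(data), follow_ec);
            if (follow_ec == boost::asio::error::would_block)
                break;                  // nothing more available right now
            if (follow_ec)
                return;                 // real error or EOF
            consume(data.data(), m);
        }

        // Re-issue the asynchronous read: "perform this when ready again".
        sock.async_read_some(boost::asio::buffer(data),
            [&sock](const boost::system::error_code& e, std::size_t k)
            { on_read(sock, e, k); });
    }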
<snip>
What about parametrizing the buffered_stream with a container type, and providing an accessor to this container? The container buffer can then be swap()ed, splice()d, reset(), fed to algorithms, and much more without any copying, while still preserving the stream interface. Instead of a buffered stream you can think of it as a stream adaptor for containers. I happily used one in my library, and it really simplifies code, along with a deque that provides segmented iterators over the contiguous buffers.
I think a separate stream adapter for containers sounds like a good plan.
I have to add that my buffered_adapter is actually more than a stream adapter for containers, because it has an associated stream and can bypass the buffer if the read or write request is big enough. The adapter also has underflow() (you call it fill() in your buffered_stream) and flush().
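A rough sketch of the shape of such an adapter (hypothetical names, not the actual interface of my library or yours, and written with present-day Boost.Asio buffer helpers):

    // A stream adaptor parametrised on a container type. Small reads go
    // through the container buffer; big reads bypass it. The container
    // itself is exposed so it can be swapped or fed to algorithms
    // without copying.
    #include <boost/asio.hpp>
    #include <vector>
    #include <cstddef>

    template <typename Stream, typename Container = std::vector<char> >
    class container_stream
    {
    public:
        explicit container_stream(Stream& next, std::size_t bypass = 4096)
            : next_(next), bypass_(bypass) {}

        Container& rdbuf() { return buffer_; }   // direct access, e.g. for swap()

        std::size_t underflow()                  // cf. buffered_stream::fill()
        {
            char tmp[4096];
            std::size_t n = next_.read_some(boost::asio::buffer(tmp));
            buffer_.insert(buffer_.end(), tmp, tmp + n);
            return n;
        }

        std::size_t read_some(boost::asio::mutable_buffer out)
        {
            if (buffer_.empty() && boost::asio::buffer_size(out) >= bypass_)
                return next_.read_some(out);     // big request: bypass the buffer
            if (buffer_.empty())
                underflow();
            std::size_t n = boost::asio::buffer_copy(out,
                boost::asio::buffer(buffer_.data(), buffer_.size()));
            buffer_.erase(buffer_.begin(), buffer_.begin() + n);
            return n;
        }

    private:
        Stream& next_;
        Container buffer_;
        std::size_t bypass_;
    };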
BTW, is there a safe way to read data directly into a deque? Or do you mean that the deque contains multiple buffer objects?
No, there is no portable way to read data directly into a std::deque<char>. But I did not use std::deque; I made my own deque with segmented iterators and no default construction of PODs. Actually it only works with PODs right now, but it is fairly complete; I have even added versions of some standard algorithms with support for segmented iterators. It should not be too hard to add non-POD support (not that a net lib really needs it...).
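To show the idea (a hypothetical chunk_buffer, not my actual class): the point is that the contiguous blocks can be exposed directly as a scatter/gather sequence, which std::deque does not allow because it hides its internal blocks:

    #include <boost/asio.hpp>
    #include <list>
    #include <vector>
    #include <cstddef>

    class chunk_buffer
    {
    public:
        static const std::size_t chunk_size = 4096;

        // Append one fixed-size chunk and expose it as a mutable buffer.
        boost::asio::mutable_buffer prepare_chunk()
        {
            chunks_.push_back(std::vector<char>(chunk_size));
            return boost::asio::buffer(chunks_.back());
        }

        // Expose enough chunks for n bytes as one scatter/gather sequence.
        std::vector<boost::asio::mutable_buffer> prepare(std::size_t n)
        {
            std::vector<boost::asio::mutable_buffer> seq;
            while (n > 0)
            {
                seq.push_back(prepare_chunk());
                n = (n > chunk_size) ? n - chunk_size : 0;
            }
            return seq;
        }

    private:
        std::list<std::vector<char> > chunks_;   // each chunk has a stable address
    };

    // usage (single gather read filling several chunks):
    //   chunk_buffer buf;
    //   std::size_t n = sock.read_some(buf.prepare(16384));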
I'm already thinking of possible extensions... shared_buffers, gift_buffers (I need a better name for the last one) and more.
I take it that by "shared_buffers" you mean reference-counted buffers?
Yes, exactly.
If so, one change I'm considering is to add a guarantee that a copy of the Mutable_Buffers or Const_Buffers object will be made and kept until an asynchronous operation completes. At the moment a copy is only kept until it is used (which for Win32 is when the overlapped I/O operation is started, not when it ends).
Hmm, nice, but if you want to forward a buffer to another thread, for example, you want to forward the complete type (to preserve the counter and the acquire()/release() infrastructure). I think it should be possible to implement streams that, in addition to generic buffer objects, accept specialized per-stream buffer types and guarantee special treatment for them. For example, an in-process shared memory stream would copy the buffer when write() is called with a generic buffer, but would give special treatment to a dedicated shared buffer type.

I have only shared memory streams in mind for now, but think about a network transport implemented completely in user space (with direct access to the network card): it is theoretically possible to DMA directly from user buffers to the card buffers, but it might only be possible from specially aligned memory. Case in point: the Linux AIO implementation requires that a file is opened in O_DIRECT mode. In turn, O_DIRECT requires that the supplied buffer is aligned to 512-byte boundaries (or the filesystem block size on 2.4). This means that an asio-based asynchronous disk I/O subsystem would require its buffers to be specially allocated (or fall back to an extra copy).

This requirement can easily be met if a hypothetical asio::fs_stream has a direct_buffer_allocator typedef. The allocator would return objects of type direct_buffer, and fs_stream.async_{read|write}_some would be overloaded to explicitly support these buffers. If a direct_buffer is used, fs_stream will use native Linux AIO. If a generic buffer is used, fs_stream should *not* use Linux AIO, not even with an internal, properly aligned bounce buffer, because O_DIRECT bypasses the system caches, so it should only be done when the user explicitly requests it by using direct_buffers. The fallback should probably use worker threads. Btw, future versions of Linux AIO will almost certainly support non-direct asynchronous I/O, but the O_DIRECT mode will probably still be fast-pathed.

In the end this boils down to passing the exact buffer type to the lower levels of the asio implementation.
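To make the fs_stream example a bit more concrete (fs_stream and direct_buffer are hypothetical; only the 512-byte alignment requirement comes from O_DIRECT itself):

    // Hypothetical sketch of overloading on the buffer type.
    #include <stdlib.h>
    #include <cstddef>

    class direct_buffer
    {
    public:
        explicit direct_buffer(std::size_t size)
        {
            // O_DIRECT requires 512-byte (or block-size) aligned buffers.
            if (posix_memalign(&data_, 512, size) != 0)
                data_ = 0;
            size_ = size;
        }
        ~direct_buffer() { free(data_); }
        void* data() const { return data_; }
        std::size_t size() const { return size_; }
    private:
        void* data_;
        std::size_t size_;
    };

    class fs_stream
    {
    public:
        typedef direct_buffer direct_buffer_type;

        // Specially allocated buffer: fast path through native Linux AIO.
        template <typename Handler>
        void async_write_some(const direct_buffer&, Handler)
        { /* submit via the native AIO interface on an O_DIRECT descriptor */ }

        // Generic buffer: fall back to worker threads, never O_DIRECT, so
        // system caches are only bypassed when the user asked for it.
        template <typename ConstBuffers, typename Handler>
        void async_write_some(const ConstBuffers&, Handler)
        { /* hand off to a blocking write on a thread pool */ }
    };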
However, this may make using a std::vector<> or std::list<> of buffers too inefficient, since a copy must be made of the entire vector or list object. I will have to do some measurements before making a decision, but it may be that supporting reference-counted buffers is a compelling enough reason.
Usually the vector is small, and boost::array is probably a better fit. In the latter case the buffer is cached (as it is on the stack), is very small (fewer than 20 elements), and takes very little time to copy. In the case of vectors, if move semantics are available (because they are emulated by the standard library, as the next libstdc++ does, or because of future language developments), no copy is needed.
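For example, with present-day Boost.Asio (only an illustration): the buffer sequence lives in a boost::array on the stack, and copying it copies two pointer/length pairs, not the data they refer to.

    #include <boost/array.hpp>
    #include <boost/asio.hpp>
    #include <string>

    void send_message(boost::asio::ip::tcp::socket& sock,
                      const std::string& header, const std::string& body)
    {
        boost::array<boost::asio::const_buffer, 2> bufs = {{
            boost::asio::buffer(header),
            boost::asio::buffer(body)
        }};
        boost::asio::write(sock, bufs);   // gather-write of both pieces
    }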
Btw, did you consider my proposal for multishot calls?
I have now :)
I think that what amounts to the same thing can be implemented as an adapter on top of the existing classes. It would work in conjunction with the new custom memory allocation interface to reuse the same memory. In a way it would be like a simplified interface to custom memory allocation, specifically for recurring operations. I'll add it to my list of things to investigate.
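Something like the following, sketched with present-day names purely to illustrate the shape of such an adapter: one call to start() produces a stream of callbacks, re-arming itself from its own completion handler and reusing the same buffer.

    #include <boost/asio.hpp>
    #include <array>
    #include <cstddef>
    #include <functional>

    class recurring_read
    {
    public:
        recurring_read(boost::asio::ip::tcp::socket& sock,
                       std::function<void(const char*, std::size_t)> on_data)
            : sock_(sock), on_data_(on_data) {}

        void start() { arm(); }

    private:
        void arm()
        {
            sock_.async_read_some(boost::asio::buffer(buf_),
                [this](const boost::system::error_code& ec, std::size_t n)
                {
                    if (ec) return;          // stop on error or close
                    on_data_(buf_.data(), n);
                    arm();                   // re-arm: same buffer, same logic
                });
        }

        boost::asio::ip::tcp::socket& sock_;
        std::function<void(const char*, std::size_t)> on_data_;
        std::array<char, 4096> buf_;
    };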
I don't think it is worth doing at higher levels. Multishot calls are inconvenient because you lose the one call -> one callback guarantee. I proposed adding them because they can open many optimization opportunities at lower levels (fewer system calls to allocate the callback, possibly better cache locality of callback data, and fewer syscalls to register readiness-notification interest). Ah, btw, happy new year :) --- Giovanni P. Deretta ---