Aaron's Informal Proposal for a Universal Demultiplexor for Boost

Well here it is. This is a very sketchy "brain dump" of the general form of demultiplexor I have been referring to. I think that it might be a basis for something eventually acceptable as a general demultiplexor for Boost. The general concept comes from a demultiplexor library that I have implemented privately (perhaps it could now be considered a prototype). I've tried to introduce the library design from a standpoint of rationale. This description is quite vague, but hopefully it will be a sufficient starting point. As always, comments of all kinds are appreciated. Aaron's Informal Proposal for a Universal Demultiplexor for Boost <http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Multiplexing/AaronsMultiplexingIdeas> Aaron W. LaFramboise

Hi Aaron, I've been following the threads spawned by you and Carlos with great interest! And this proposal is interesting indeed! As far as I can tell, I like your approach. I'm not sure how, but I have a hunch that the demultiplexor can be used in conjunction with other components, such as threads, to realize higher-level functionality such as a socket library - but I really would like to see some diagrams and more code and examples, please! You asked us to forget all about event handling, but... I'm not sure whether the event demultiplexing model is synchronous or asynchronous? (Compare the patterns Reactor (sync) and Proactor (async) as described in POSA2. Btw, do you consider ACE's implementation of these patterns "big, clunky, monolithic demultiplexors"?) I strongly agree with you that the demultiplexor library should not use threads, as those are part of the concurrency model - separation of concerns. However, if needed, it should be possible to combine a synchronous event demultiplexor with a thread pool to implement the Leader/Followers pattern via a thread-pool reactor. /Tommy PS. I'm confused regarding the use of the terms multiplexor and demultiplexor. I hope you mean demultiplexor when you say multiplexor - right?

On Tue, Sep 14, 2004 at 09:12:57AM +0000, Tommy wrote:
I'm confused regarding the use of the terms multiplexor and demultiplexor. I hope you mean demultiplexor when you say multiplexor - right?
This is what I assumed, but he has a point here. You (Aaron) propose a class basic_multiplexor, shouldn't that be basic_demultiplexor? Carlo Wood <carlo@alinoe.com> (Carlo, not Carlos).

Carlo Wood wrote:
On Tue, Sep 14, 2004 at 09:12:57AM +0000, Tommy wrote:
I'm confused regarding the use of the terms multiplexor and demultiplexor. I hope you mean demultiplexor when you say multiplexor - right?
This is what I assumed, but he has a point here. You (Aaron) propose a class basic_multiplexor, shouldn't that be basic_demultiplexor?
Yes. However, the difference between a demultiplexor and multiplexor is not entirely obvious to me in this context. (We could invent one, if it were valuable, but I don't think it is). I just like short names. :) Aaron W. LaFramboise

On Tue, Sep 14, 2004 at 03:32:06PM -0500, Aaron W. LaFramboise wrote:
This is what I assumed, but he has a point here. You (Aaron) propose a class basic_multiplexor, shouldn't that be basic_demultiplexor?
Yes. However, the difference between a demultiplexor and multiplexor is not entirely obvious to me in this context. (We could invent one, if it were valuable, but I don't think it is). I just like short names. :)
A multiplexor is something with many inputs and one output. A demultiplexor is something with one input and many outputs. I think demultiplexor is the most obvious because this thing has many outputs: all the different event handlers. The single input might be considered to be the single system call in which the thread is sleeping, or just the single 'demultiplexor' object itself. -- Carlo Wood <carlo@alinoe.com>

Aaron, this is more 'point 14' and we were still discussing point 2 :p, but ok. The demultiplexor library that we are trying to design indeed consists of two more or less separate parts: 1) The actual event demultiplexor. 2) The user interface, with concepts like 'class socket', 'class file', etc. These two can probably not be designed entirely separately from each other, because one has to be aware of the limitations and benefits of one while designing the other. The observation that I make reading your page is that you have concentrated on point 2) above. Correct me if I am wrong, but I suppose I am right, as you yourself say that the 'multiplexor' part is missing ;). Also note that in the thread(s) I started I have not yet made an attempt to start with a design related to point 2. I can however compare your proposal with the user interface (point 2) of Christopher Kohlhoff's asio library, and I like your interface better. We can summarize it with "a template policy based interface", correct? :) If that includes the concept of using functors as 'callbacks' too, that is. However, I think we should delay discussions of THIS type of detail (related to point 2 above) a little longer. That does not mean I want to try and ignore your design, not at all! I like it, and if it's up to me we will use it exactly as you proposed... Like I said, the two parts can probably not be designed entirely separately from each other - but before we can continue with designing the demultiplexor class interface in the light of how it could work together with the rest of your proposal, we should first think about the internals of the demultiplexor and approach the problem from the other side: not from the API side, but from the system resources and efficiency side. So far, my efforts have been aimed largely at windows, as you might have noticed.
The reason for that is that I lack full understanding of what is possible with windows - what system-level APIs exist - and what the best API to use on windows is; while I know everything about the UNIX API. I am not neglecting UNIX; there simply is no need for me to talk about it. If I am to be of any help with this design then I will need help from the windows gurus here to understand The Best Way To Implement Event Demultiplexing On Windows, before I can start thinking about the design of a basic_demultiplexor class. My questions might seem very one-sided (windows) and overly detailed (one aspect/detail of the windows API), but please bear with me; I am depending on you, and others, to help me understand this in the end! :) -- Carlo Wood <carlo@alinoe.com>

On Tue, Sep 14, 2004 at 01:06:29PM +0200, Carlo Wood wrote: [...snip...]

That being said, I do have a remark on the windows API level that hooks into your page ;). You write:

  class socket {
  public:
    // ...
    template<typename functor_type>
    watch on_close(functor_type functor) {
      return global_multiplexor.on_file_readable(socket_fd,
          if_then(bind(is_eof, this), functor));
    }
    // ...
  private:
    static bool is_eof(socket &);
    int socket_fd;
    // ...
  };

One of the problems that I can't seem to put my finger on is the fact that windows uses different types for file handles and for sockets. How can I relate HANDLE and SOCKET? And are there any other handle types that I am not aware of? Can a SOCKET be cast to a HANDLE just like that? Or even converted without a cast? On http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base... I read the following:

  BOOL ReadFileEx(
    HANDLE hFile,
    LPVOID lpBuffer,
    DWORD nNumberOfBytesToRead,
    LPOVERLAPPED lpOverlapped,
    LPOVERLAPPED_COMPLETION_ROUTINE lpCompletionRoutine
  );

  Parameters
  hFile [...] This parameter can be any handle opened with the FILE_FLAG_OVERLAPPED flag by the CreateFile function, or a socket handle returned by the socket() or accept() function.

Note that CreateFile indeed returns a HANDLE, but socket() and accept() return a SOCKET type?! Does this mean that SOCKET == HANDLE *and* that they made it impossible for themselves to ever change that in the future? And how about SOCKET types that are not returned by socket() or accept() (or are those the only windows functions to ever create a SOCKET type?). -- Carlo Wood <carlo@alinoe.com>

Carlo Wood wrote:
However, I think we should delay discussions of THIS type of details (related to point 2 above) a little longer. That does not mean I want to try and ignore your design, not at all! I like it and if its up to me we will use it exactly as you proposed...
I think it is a flaw to start with select() and wrap it, as I mentioned. Yes, I consider ACE big and clunky, and I think it is clunky for this reason. I omitted a detailed description of the 'core multiplexor' concept because it doesn't matter. It is trivial to implement for this interface, or most other reasonable interfaces. In my opinion, this is not a detail. This is the essence of the implementation. The value of a demultiplexor library for Boost will be measured in the efficacy and flexibility of its interface, not based on whether it has builtin support for kqueue, for example. Since you've been bringing it up, I feel quite strongly that the core demultiplexor should be based on MsgWaitForMultipleObjectsEx in the usual case (when available). I took Matthew Hurd's comments to heart, and I think another core policy should be based on polling. And another one for IOCP, certainly. For UNIX variants, a similar situation prevails, with select() probably being the default, despite its flaws. (New POSIX AIO is another possibility--the Boost community seems to be suspiciously unconcerned with supporting nonstandard legacy UNIX variants.) (I am very interested in whether it is possible to meaningfully separate the style of core--polling, select-esque--from the actual OS APIs--UNIX, Win32. So far, I think not, but I would like to.) Carlo Wood wrote:
One of the problems that I can't seem to put my finger on is the fact that windows uses different types for file handles and for sockets.
How can I relate HANDLE and SOCKET? And are there any other handle types that I am not aware of? Can a SOCKET be cast to a HANDLE just like that? Or even converted without a cast?
Well, this is pretty much a trivial implementation detail. I would like to express my frustration at the silliness of the situation involving Win32 names, as in practice they tend to work against encapsulation, not for it. Boost.Thread avoids all of this silliness by simply using void * as storage for handles, which as far as I know will work on any version of Windows in use or foreseeable in the future. In my initial implementation

On Tue, Sep 14, 2004 at 03:56:17PM -0500, Aaron W. LaFramboise wrote:
I think it is a flaw to start with select() and wrap it, as I mentioned. Yes I consider ACE big and clunky, and I think it is clunky for this reason.
I definitely do not propose to start with select() and build a demultiplexor around it! I think we have to approach this from two sides however - not ONLY from the side of the API either. What I am trying to say is that it is possible to discuss things in the core that are so essential that they are NOT related to the final API in the sense that we have a choice depending on what the API is, but instead will influence the API. As an example: suppose that the core ONLY supported the Proactor pattern; then we have no choice but to choose the proactor pattern, and that will directly influence the final API. Whereas if you'd start with the API and chose the Reactor pattern, then you'd have to write an inefficient kludge in order to get that API supported with the available core. The things that I try to discuss are of this kind. I think that the discussion about whether or not we can write the library without threads is essential and not influenced by whatever API you'd pick. It CAN however influence the API.
I omitted detailed description of the 'core multiplexor' concept because it doesn't matter.
I thus disagree.
It is trivial to implement for this interface, or most other reasonable interfaces. In my opinion, this is not a detail. This is the essence of the implementation. The value of a demultiplexor library for Boost will be measured in the efficacy and flexibility of its interface, not based on whether it has builtin support for kqueue, for example.
Agreed. But still - we need to have at least an idea of what kind of core system calls this has to be built around. The use or avoidance of threads is an example of great influence. It would be too easy to say "no threads" and design an API that doesn't know about threads, when in the end it turns out that this results in not being able to support multi-processor machines in a correct way and would, for example, lead to a limit of at most 64 sockets on the windows OS. As a result of this discussion (about threads) so far, it has become clear that we will HAVE to use IO completion routines (completion ports seem to demand threads, and we try to avoid an API with explicit thread awareness that would all but force the user to write MT code) - however, I have now read about IO completion routines and it turns out to be a proactor pattern interface... We need to see IF we can reasonably combine IO completion routines (i.e., WSARecv()) with your API, before going into the details of that API.
Since you've been bringing it up, I feel quite strongly that the core demultiplexor should be based on MsgWaitForMultipleObjectsEx in the usual case (when available). I took Matthew Hurd's comments to heart, and I think another core policy should be based on polling. And another one for IOCP, certainly.
It would be nice if you explained why :)
For UNIX variants, a similar situation prevails, with select() probably being the default, despite its flaws.
Only when there is nothing else. select() is the oldest interface and does not scale at all. If poll is available then that is what should be used over select, and when epoll or kqueue is available then obviously those should be used! (Have a look at the benchmark given on http://monkey.org/~provos/libevent/libevent-benchmark2.jpg) Other famous articles about how bad select(2) is are http://wwwatnf.atnf.csiro.au/people/rgooch/linux/docs/io-events.html and http://www.kegel.com/c10k.html
(New POSIX AIO is another possibility--the Boost community seems to be suspiciously unconcerned with supporting nonstandard legacy UNIX variants.)
(I am very interested in whether it is possible to meaningfully separate the style of core--polling, select-esque--from the actual OS APIs--UNIX, Win32. So far, I think not, but I would like to.)
I am convinced that any API you can come up with can be supported with the available UNIX system calls, with the exception that select() sets a limit on the number of filedescriptors. But well, that's the funny thing - I could code virtually anything on UNIX... If we had a windows guru here who could claim the same for windows then I guess we could indeed start with the API and forget about the rest for now ;).
Carlo Wood wrote:
One of the problems that I can't seem to put my finger on is the fact that windows uses different types for file handles and for sockets.
How can I relate HANDLE and SOCKET? And are there any other handle types that I am not aware of? Can a SOCKET be cast to a HANDLE just like that? Or even converted without a cast?
Well, this is pretty much a trivial implementation detail. I would like to express my frustration at the silliness of the situation involving Win32 names, as in practice they tend to work against encapsulation, not for it.
Boost.Thread avoids all of this silliness by simply using void * as storage for handles, which as far as I know will work on any version of Windows in use or foreseeable in the future.
This doesn't answer my question. Can you please explain the difference between a HANDLE and a SOCKET? What are they? Are they exchangeable at all times? Are they pointers? To what? Details please! :) -- Carlo Wood <carlo@alinoe.com>

Carlo Wood wrote:
On Tue, Sep 14, 2004 at 03:56:17PM -0500, Aaron W. LaFramboise wrote:
I think it is a flaw to start with select() and wrap it, as I mentioned. Yes I consider ACE big and clunky, and I think it is clunky for this reason.
I definitely do not propose to start with select() and build a demultiplexor around it!
I think we have to approach this from two sides however - not ONLY from the side of the API either. What I am trying to say is that it is possible to discuss things of the core that are so essential that they are NOT related to the final API in the sense that we have a choice depending on what the API is, but instead will influence the API.
As an example: suppose that the core ONLY supported the Proactor pattern; then we have no choice but to choose the proactor pattern, and that will directly influence the final API. Whereas if you'd start with the API and chose the Reactor pattern, then you'd have to write an inefficient kludge in order to get that API supported with the available core.
The things that I try to discuss are of this kind. I think that the discussion about whether or not we can write the library without threads is essential and not influenced by whatever API you'd pick. It CAN however influence the API.
I omitted detailed description of the 'core multiplexor' concept because it doesn't matter.
I thus disagree.
It is trivial to implement for this interface, or most other reasonable interfaces. In my opinion, this is not a detail. This is the essence of the implementation. The value of a demultiplexor library for Boost will be measured in the efficacy and flexibility of its interface, not based on whether it has builtin support for kqueue, for example.
Agreed. But still - we need to have at least an idea of what kind of core system calls this has to be built around. The use or avoidance of threads is an example of great influence. It would be too easy to say "no threads" and design an API that doesn't know about threads, when in the end it turns out that this results in not being able to support multi-processor machines in a correct way and would, for example, lead to a limit of at most 64 sockets on the windows OS.
As a result of this discussion (about threads) so far, it has become clear that we will HAVE to use IO completion routines (completion ports seem to demand threads, and we try to avoid an API with explicit thread awareness that would all but force the user to write MT code) - however, I have now read about IO completion routines and it turns out to be a proactor pattern interface... We need to see IF we can reasonably combine IO completion routines (i.e., WSARecv()) with your API, before going into the details of that API.
Since you've been bringing it up, I feel quite strongly that the core demultiplexor should be based on MsgWaitForMultipleObjectsEx in the usual case (when available). I took Matthew Hurd's comments to heart, and I think another core policy should be based on polling. And another one for IOCP, certainly.
It would be nice if you explained why :)
For UNIX variants, a similar situation prevails, with select() probably being the default, despite its flaws.
Only when there is nothing else. select() is the oldest interface and does not scale at all. If poll is available then that is what should be used over select, and when epoll or kqueue is available then obviously those should be used! (Have a look at the benchmark given on http://monkey.org/~provos/libevent/libevent-benchmark2.jpg)
Other famous articles about how bad select(2) is are http://wwwatnf.atnf.csiro.au/people/rgooch/linux/docs/io-events.html and http://www.kegel.com/c10k.html
(New POSIX AIO is another possibility--the Boost community seems to be suspiciously unconcerned with supporting nonstandard legacy UNIX variants.)
(I am very interested in whether it is possible to meaningfully separate the style of core--polling, select-esque--from the actual OS APIs--UNIX, Win32. So far, I think not, but I would like to.)
I am convinced that any API you can come up with can be supported with the available UNIX system calls, with the exception that select() sets a limit on the number of filedescriptors.
But well, that's the funny thing - I could code virtually anything on UNIX... If we had a windows guru here who could claim the same for windows then I guess we could indeed start with the API and forget about the rest for now ;).
Carlo Wood wrote:
One of the problems that I can't seem to put my finger on is the fact that windows uses different types for file handles and for sockets.
How can I relate HANDLE and SOCKET? And are there any other handle types that I am not aware of? Can a SOCKET be cast to a HANDLE just like that? Or even converted without a cast?
Well, this is pretty much a trivial implementation detail. I would like to express my frustration at the silliness of the situation involving Win32 names, as in practice they tend to work against encapsulation, not for it.
Boost.Thread avoids all of this silliness by simply using void * as storage for handles, which as far as I know will work on any version of Windows in use or foreseeable in the future.
This doesn't answer my question. Can you please explain the difference between a HANDLE and a SOCKET? What are they? Are they exchangeable at all times? Are they pointers? To what? Details please! :)
I can't do the issue justice, as there is no simple answer, and I don't know all of the answer anyway. In many cases, we can use a SOCKET where we'd use a HANDLE. Of course, this is subject to what the documentation actually says: if the documentation doesn't say that a particular function works with sockets, then it probably doesn't. The authoritative documentation here is the Platform SDK from Microsoft, which does, in fact, describe all of this. It's downloadable for free, or available on CD for the cost of S&H, or available immediately online from MSDN. The Winsock2 section has a bit on what a SOCKET is and how it can be used, and other parts of the PSDK will mention sockets when they are usable in a particular context. Aaron W. LaFramboise

Oops! I did not mean to send that reply yet! Carlo Wood wrote:
As a result of this discussion (about threads) so far, it has become clear that we will HAVE to use IO completion routines (completion ports seem to demand threads, and we try to avoid an API with explicit thread awareness that would all but force the user to write MT code) - however, I have now read about IO completion routines and it turns out to be a proactor pattern interface... We need to see IF we can reasonably combine IO completion routines (i.e., WSARecv()) with your API, before going into the details of that API.
OK. However, it is my feeling that, regardless of whatever the ultimate limitations of the implementation are, this should have little effect on my general proposed demultiplexor. The demultiplexor I proposed is only a system-level demultiplexor, not a framework for forcing implemented libraries into submission to some arbitrary paradigm. It is up to dependent components to do the actual submission forcing. :-) Perhaps further details on the sort of library I was proposing may show how this is possible. See below. It seems to me you want to design more than just a demultiplexor. It seems you might want a system that is fully capable of managing the main loop of a sophisticated application, something that you could simply plug a socket class into and suddenly have a high-performance web server. It might manage thread pools for you, and implement a framework for the Apache I/O filtering you mentioned, and a ton of other fun things.
Another important source of experience is libACE. Although libACE provides everything we might need, it doesn't 'connect' well to boost; it has a lot of interfaces that are already done in boost but in another way. Also, I think that libACE is doing more than we need and not everything that we might want. If it were
Let us hope that they will not be saying this about whatever demultiplexor Boost comes up with a few years from now. This is excellent, but I think this is more than a demultiplexor, and I think that these extra things can be separated from the demultiplexor itself, and that this separation is a good thing. So I ask again: what is the scope of "Beyond IOStreams"? Exactly what are the goals and problems to be solved in this discussion? I think there might be some value in presenting a more comprehensive example than the sketchy notes I made available in the OP. I think I will take some time to make available some code that demonstrates what I have in mind, perhaps as a prototype to be given consideration along with ACE and asio. Unfortunately I can't simply release code from my prior design, as it is based primarily on runtime polymorphism, not templates. In the meantime, I'd like to keep up with the discussion and see where all of this goes. :-) Aaron W. LaFramboise

On Tue, Sep 14, 2004 at 07:20:02PM -0500, Aaron W. LaFramboise wrote:
It seems to me you want to design more than just a demultiplexor. It seems you might want a system that is fully capable of managing the main loop of a sophisticated application, something that you could simply plug a socket class into and suddenly have a high-performance web server. It might manage thread pools for you, and implement a framework for the Apache I/O filtering you mentioned, and a ton of other fun things.
I hope that was *meant* to be funny, but no; as I said before, I want to write the *minimal* interface that you can still call a demultiplexor. This demultiplexor should however not impose any restrictions on what you can write with it. It should also be possible to use it for a high-performance web server (after the user adds a lot of extra code). So, the "high-performance" remains a demand, yes. -- Carlo Wood <carlo@alinoe.com>

Hi Aaron, In your wiki article you ask us to forget about the event handling patterns, however I think the choice is fundamental to the design and use of such a demultiplexing API. In your discussion you seem to imply a reactive model (i.e. "tell me when I can take action without blocking") when you talk about things like an on_file_readable event. In developing asio I chose to use the proactive model (i.e. "start an operation and tell me when it's done"), aka asynchronous I/O, because I saw it as providing numerous advantages.

One thing I am able to do with a proactive model is provide a simple and consistent way of encapsulating and abstracting complex asynchronous operations. For example, the fundamental mechanism asio provides for asynchronously receiving data on a socket is the async_recv() member function:

  class stream_socket {
    //...
    template <typename Handler>
    void async_recv(void* buf, size_t max_len, Handler h);
    //...
  };

Here the handler is a functor called to indicate completion of the receive operation, when some data has been received on the socket and placed in the buffer. As you know, a socket receive will complete as soon as any data is available, and this will often be less than the maximum length of the buffer. Therefore within asio I compose this fundamental building block to provide a free function async_recv_n():

  template <typename Stream, typename Handler>
  void async_recv_n(Stream& s, void* buf, size_t len, Handler h);

which will not call the handler until exactly the specified number of bytes is received, or until an error occurs. A similar pattern could be followed for receiving and decoding user-defined message structures in a single asynchronous operation:

  template <typename Stream, typename Handler>
  void async_recv_MyMsg(Stream& s, MyMsg& msg, Handler h);

i.e. don't call the handler until an entire message has been received and decoded into the supplied msg object.
Another reason I had for choosing the proactive model was that it means asio can use native asynchronous I/O when it is available, as this is typically the most efficient networking API on a given platform. However, asio can use a reactive implementation internally (currently select) if native async I/O is not available, but still present the same async I/O interface to users. Regards, Chris

On Tue, Sep 14, 2004 at 09:42:15PM +1000, Christopher Kohlhoff wrote:
Hi Aaron,
In your wiki article you ask us to forget about the event handling patterns, however I think the choice is fundamental to the design and use of such a demultiplexing API. In your discussion you seem to imply a reactive model (i.e. "tell me when i can take action without blocking") when you talk about things like an on_file_readable event.
In developing asio I chose to use the proactive model (i.e. "start an operation and tell me when it's done"), aka asynchronous I/O, because I saw it as providing numerous advantages.
One thing I am able to do with a proactive model is provide a simple and consistent way of encapsulating and abstracting complex asynchronous operations. For example, the fundamental mechanism asio provides for asynchronously receiving data on a socket is the async_recv() member function:
class stream_socket { //... template <typename Handler> void async_recv(void* buf, size_t max_len, Handler h); //... };
But this approach is not as flexible as the Reactor model. The reactor model will just read as much as possible from the socket every time (unless there is no more room in the buffer), allowing a decoder routine to decide how much data to consume when decoding it. No problems here. The proactive pattern however needs to specify upfront how much data it wants to read, otherwise there is no "I am done" event. In certain protocols that is simply not possible. Assume a TCP socket over a link that cuts the data into pieces of 200 bytes (small, I know, but it's just an example).

Protocol A: data comes in compressed chunks of 4 kb at a time.
Reactive pattern: read(2) is called with a request as big as the buffer (usually big enough), so it will read 200 bytes per call until one has 4 kb in total, at which point decoding can take place.
Proactive pattern: async_recv is called with a request for 4 kb. Internally read(2) will be called in mostly the same way as above, with the exception of the last packet (4 kb = 4 * 1024 = 4096 = 20 * 200 + 96). But I won't bitch about that unnecessarily cutting a natural packet into two :p.

Protocol B: data comes in binary messages that start with a 4-byte field containing the total length of the message. Let's assume the buffer is empty and the next message turns out to be 300 bytes.
Reactive pattern: read(2) is called with a request as big as the buffer; it will return 200 bytes at first, of which a higher layer will decode the first 4 bytes. The next call to read(2) will again read 200 bytes, at which point there is enough to decode the first message.
Proactive pattern: async_recv is called with a request for 4 bytes. It is not possible to request more because we have no idea how large the message is going to be, and if we requested more than the size of the next message, and there wouldn't be more messages for a while, then we'd stall while there *is* something to decode.
It calls read(2) with a size of 4 because it doesn't know if there is more room in the buffer. A higher layer will decode these 4 bytes and then call async_recv with a size of 300. Now read(2) is called with a size of 300, only returning 196 bytes though (there were only 200 bytes available and we already ate 4 bytes of that). Internally it will call read(2) again and, when the next packet comes in, consume only 104 bytes of that 200-byte packet, leaving the rest in the socket buffer. At this point the message can be decoded. Slightly more inefficient than with the reactor pattern, but I am still not bitching.

Protocol C: a text protocol; the size of the messages is completely unknown - we only know that they end on \r\n, at which point we can start to think about decoding.
Reactive pattern: read(2) is called with a request as big as the buffer; it will return chunks of 200 bytes until we find the first EOL sequence, at which point there is enough to decode the message.
Proactive pattern: Huh. How much are we going to read? Two bytes at a time? Or is it possible to tell this pattern: read at MOST 4096 bytes (the size of the buffer) but return when read(2) would block and we have at least 1 byte? If that is the way this pattern works, then what is the benefit over the reactor pattern? Because in most cases above it would just have returned chunks of 200 bytes and the same decoding techniques would have been needed.

The only "advantage" that one can possibly think of is that the user can provide a buffer with a given size that DOES meet the need of the current protocol, but that assumes one knows the size of the messages in the protocol. But ok: the user controls the buffer (size). But is that really an advantage? It is NOT when we want to use a stream buffer (boost.IOStreams) that never copies data (libcw and its dbstreambuf). Consider this case (you need a fixed font to view this):

               .----- contiguous message block in stream buffer
              /
  [ <-*-> MESSAGE SO FAR|          ]
                        ^          ^__ end of allocated memory
                        |__ end of read data so far
                        <-------------->
                                       \__ expected size of total (decodable) message

Note that the expected size is necessary or else the proactive pattern loses all its benefits. The expected size goes over the edge of the allocated buffer, and we don't want to reallocate it because that means copying the whole buffer. Therefore we (need to) call read(2) with a requested size such that it precisely fills up this block. The expected size thus becomes meaningless: in this case we (the user!) would have to call async_recv with a buffer pointer just after the 'R' of "..FAR" and a size that cuts the message into two non-contiguous parts anyway. Note that above, under 'Protocol C', we concluded that it is possible that async_recv returns after it has only read until 'FAR', even though we requested more, because that is necessary or it won't even be *possible* to decode certain protocols. -- Carlo Wood <carlo@alinoe.com>

Christopher Kohlhoff wrote:
In your wiki article you ask us to forget about the event handling patterns, however I think the choice is fundamental to the design and use of such a demultiplexing API. In your discussion you seem to imply a reactive model (i.e. "tell me when I can take action without blocking") when you talk about things like an on_file_readable event.
In developing asio I chose to use the proactive model (i.e. "start an operation and tell me when it's done"), aka asynchronous I/O, because I saw it as providing numerous advantages.
I agree completely with your points about the merits of the proactive pattern. It is my intention that, unlike most frameworks that lock you into a monolithic demultiplexor, both proactive and reactive patterns will be usable. It's my general feeling that the choice of event handling pattern has a lot less to do with personal preference, and more to do with what is imposed by the conventions of the problem domain, design constraints, managerial decisions, and third-party libraries. If a demultiplexor library espoused one particular paradigm, it would become useless to other, incompatible paradigms. In particular, it is up to objects themselves to implement the 'hooks' that their users may use, which might be proactive, reactive, or something else. The basic_demultiplexor itself is constrained to some degree by its implementation and by its minimalism. In the case of on_file_readable, this is necessary rather than a proactive scheme because it doesn't actually know anything about read(), etc. That doesn't stop a socket class from implementing a proactive interface. Aaron W. LaFramboise

Hi Aaron, --- "Aaron W. LaFramboise" <aaronrabiddog51@aaronwl.com> wrote:
It is my intention that, unlike most frameworks that lock you into a monolithic demultiplexor, both proactive and reactive patterns will be usable.
It's my general feeling that the choice of event handling pattern has a lot less to do with personal preference, and more to do with what is imposed by the conventions of the problem domain, design constraints, managerial decisions, and third-party libraries. If a demultiplexor library espoused one particular paradigm, it would become useless to other, incompatible paradigms.
Perhaps, but my experience in using asio (as opposed to writing it) has been that:
- A proactive interface is all that I have needed for sockets.
- Other reactive or even blocking interfaces can generally be efficiently wrapped by proactive ones.
Therefore, I have attempted to use an acceptable set of trade-offs that suits most problems, and to my mind a proactive interface fits the bill. But should that not be sufficient, I should point out that the demuxer in asio is extensible using a similar idea to std::locale's facets. So it is possible to override the implementation of a socket, or even to include an entirely new reactive-based resource.
In particular, it is up to objects themselves to implement the 'hooks' that their users may use, which might be proactive, reactive, or something else.
The basic_demultiplexor itself is constrained to some degree by its implementation and by its minimalism. In the case of on_file_readable, this is necessary rather than a proactive scheme because it doesn't actually know anything about read(), etc. That doesn't stop a socket class from implementing a proactive interface.
It is my feeling that attempting to define a demultiplexing interface in such a way is drawing the line of abstraction too low. However, perhaps we are talking about different things :) My focus in asio has been on defining a C++ interface for using sockets or other OS objects asynchronously, while leaving maximum freedom for the implementor (i.e. me until now) to use the best mechanism for a particular platform by default, whether that be native async I/O or a reactor-based solution. The demultiplexor you talk about sounds to me more like an interface I would use to implement a socket (for example) on some platforms, but not necessarily on all. Regards, Chris

Christopher Kohlhoff <chris <at> kohlhoff.com> writes:
--- "Aaron W. LaFramboise" <aaronrabiddog51 <at> aaronwl.com> wrote:
If a demultiplexor library espoused one particular paradigm, it would become useless to other, incompatible paradigms.
Perhaps, but my experience in using asio (as opposed to writing it) has been that:
- A proactive interface is all that I have needed for sockets.
- Other reactive or even blocking interfaces can generally be efficiently wrapped by proactive ones.
That's fine as far as it goes. But I also think Aaron's points re the many factors affecting the choice of paradigm are equally valid.
But should that not be sufficient, I should point out that the demuxer in asio is extensible using a similar idea to std::locale's facets. So it is possible to override the implementation of a socket, or even to include an entirely new reactive-based resource.
It isn't clear to me how this is better than Aaron's separation of the (de)multiplexor and the socket (or whatever) object, with the binding to a demultiplexor being relatively late. A proactive socket could easily enough be written to work with a proactive demultiplexor. The end effect, if I understand Aaron's proposal correctly, is a very flexible and easily extended design that doesn't in any way preclude anything your library implements (in fact, I would think most of the implementation could be retained?).
In particular, it is up to objects themselves to implement the 'hooks' that their users may use, which might be proactive, reactive, or something else.
It is my feeling that attempting to define a demultiplexing interface in such a way is drawing the line of abstraction too low. However perhaps we are talking about different things :)
I think the demultiplexor (please can we use a better name for this when it is really so generic - maybe just "notifier") must have no idea what it is notifying the handler of if it is to be possible to make this truly generic. The fact that a notifier for proactor/aio may be almost invisible (built into the system) doesn't make all of this irrelevant. The binding of the handler to the notifier is still relevant - if only for cancellation.

To implement a proactor, there needs to be some way to convey to a handler what it was that just completed. But I don't see that as being any different from the other higher-level notifications Aaron suggests can be implemented by layering. That is, at the lowest layer, the notifier itself simply knows that some wrapped system object has notified. The handler for the system object (eg a signal handler) is then responsible for mapping this to the higher-level interface. Because something like an async read request only makes sense for certain objects, it is a member function in Aaron's model (if I'm understanding correctly). Something like (very sketchy, generics left out):

  class aio_rd_req; // encapsulates messy aio request context stuff

  socket::async_read(char *buf, size_t len, handler h)
  {
      aio_read(aio_rd_req::make(this, buf, len, h).iocbp());
  }

  aio_rd_req::handle_completion()
  {
      m_handler(m_iocb.aio_buf, aio_return(&m_iocb));
      delete this;
  }

Is that so bad or so vastly different from how asio does it?

Side note on names: here asio stands for Australian Security Intelligence Organisation - sort of like the CIA :-)
My focus in asio has been on defining a C++ interface for using sockets or other OS objects asynchronously, while leaving maximum freedom for the implementor (i.e. me until now) to use the best mechanism for a particular platform by default, whether that be native async I/O or a reactor-based solution.
Do you assume one or the other must be the best on a platform, or do you allow a "mix and match" approach? Platform quirks that result in aio working for some types of objects, and select/poll for others, plus the issue of whether a proactive or reactive style makes sense in a given app, may mean that the goals of efficiency and portability fight each other. An approach where the code will fail to compile because there is no specialisation for a particular event source (eg socket) using notifier type X (eg aio) is not a bad thing. The ability to choose between notifiers that are portable but possibly inefficient on some platforms, and those that will simply refuse to work if they are not efficiently implementable, is a nice feature. Extending the model to support a wider variety of event sources is bound to introduce further portability issues in terms of which event sources even exist on some platforms, as well as which notifiers work with them. Clearly some uber-notifier allowing a wide mix of event types (likely to be quite inefficient even without considering portability) should be provided for in the design, but it shouldn't be the only choice. Regards Darryl.

Hi Darryl, First, I think we are talking at cross purposes and agreeing at the same time :) As far as I understand it, Aaron and I have different goals or approaches:
- Aaron wants to define a universal demultiplexor and implement things such as sockets and files in terms of it.
- With asio the goal has been to define an interface for sockets (and timers, and in future files or other useful things) that allows you to develop efficient and high performance network apps, and then to provide an efficient high performance implementation behind it.
I do not see these as mutually incompatible goals. Indeed, as you say, asio could probably be implemented using a universal demuxer. Where opinions differ, I think, is in what is important at this time. I believe the larger share of the potential target audience for something like asio just wants a "standard" socket interface, and isn't so concerned about having the ability to choose between the demultiplexing patterns. If a portable universal demultiplexor comes out of that process, all the better, but I don't see it as the primary goal. --- Darryl Green <darryl.green@unitab.com.au> wrote:
That's fine as far as it goes. But I also think Aaron's points re the many factors affecting the choice of paradigm are equally valid.
I have been mainly concerned with network programming concepts such as sockets, timers and synchronisation so far. For these I believe an async IO (proactive) interface is all you need. In fact, in some environments (e.g. Symbian) an asynchronous interface seems to be all you get. I and others have been following this idea (with asio and also with ACE) and have not been disappointed yet. For other things which might be integrated into an application, then yes you may be forced to adopt a reactive style, but asio doesn't preclude that.
I think the demultiplexor (please can we use a better name for this when it is really so generic - maybe just "notifier") must have no idea what it is notifying the handler of if it is to be possible to make this truly generic.
This is just a naming issue, but I do see demultiplexor as being the most appropriate name, since its major responsibility is to extract events from a limited number of sources and dispatch them to individual handlers.
To implement proactor, there needs to be some way to convey to a handler what it was that just completed.
This information is conveyed by virtue of a particular handler being called. I.e. you supply a different handler for different operations.
That is, at the lowest layer, the notifier itself simply knows that some wrapped system object has notified. The handler for the system object (eg a signal handler) is then responsible for mapping this to the higher-level interface. Because something like an async read request only makes sense for certain objects, it is a member function in Aaron's model (if I'm understanding correctly). Something like (very sketchy, generics left out):
  class aio_rd_req; // encapsulates messy aio request context stuff

  socket::async_read(char *buf, size_t len, handler h)
  {
      aio_read(aio_rd_req::make(this, buf, len, h).iocbp());
  }

  aio_rd_req::handle_completion()
  {
      m_handler(m_iocb.aio_buf, aio_return(&m_iocb));
      delete this;
  }
Is that so bad or so vastly different to how asio does it?
Yes, this is somewhat similar to how asio works. However, as I said above, it is my opinion that this tends to be too low a place to start defining the abstractions for a socket library.
Side note on names: Here asio stands for Australian Security Intelligence Organisation - sort of like the CIA :-)
I'm an Australian too, and this is in fact one of the reasons why I chose that name ;)
Do you assume one or the other must be the best on a platform, or do you allow a "mix and match" approach? Platform quirks that result in aio working for some types of objects, and select/poll for others, plus the issue of whether a proactive or reactive style makes sense in a given app, may mean that the goals of efficiency and portability fight each other. An approach where the code will fail to compile because there is no specialisation for a particular event source (eg socket) using notifier type X (eg aio) is not a bad thing. The ability to choose between notifiers that are portable but possibly inefficient on some platforms, and those that will simply refuse to work if they are not efficiently implementable, is a nice feature. Extending the model to support a wider variety of event sources is bound to introduce further portability issues in terms of which event sources even exist on some platforms, as well as which notifiers work with them. Clearly some uber-notifier allowing a wide mix of event types (likely to be quite inefficient even without considering portability) should be provided for in the design, but it shouldn't be the only choice.
My goal with asio is that you should not be required to choose between the underlying demultiplexing types in order to be portable. I simply provide a consistent async IO interface on all supported platforms. This is what I mean by drawing a higher line of abstraction -- I am not trying to constrain how this async IO interface is implemented, only stipulating that it is asynchronous. But if you need to customise the demultiplexing type, asio provides a way to do that. As I stated in my previous email, the asio::demuxer uses an extensible facets-like mechanism - I have called them services. So the basic_stream_socket template looks like:

  template <typename Service> class basic_stream_socket;

On construction, the socket object obtains a reference to the corresponding service from the demuxer. So, for the typical use case, I provide a typedef:

  typedef basic_stream_socket<impl_defined> stream_socket;

so that most users need not be aware they are even using a template (as with std::string or iostreams). The impl_defined template argument is currently what I consider the best (or only) choice that I have implemented thus far for a particular platform. The service can be changed by using a different template argument, e.g. on Win32 you can currently choose between a select implementation and IO completion ports. At the moment I "hide" these implementations in the detail namespace, but this is where I would see something like Aaron's universal demultiplexor fitting in. But as I have said, I have so far not seen exposing these service implementations as a priority. Regards, Chris

On Tue, 14 Sep 2004 01:21:13 -0500, Aaron W. LaFramboise <aaronrabiddog51@aaronwl.com> wrote:
Well here it is. This is a very sketchy "brain dump" of the general form of demultiplexor I have been referring to. I think that it might be a basis for something eventually acceptable as a general demultiplexor for Boost.
This looks quite similar to what is already in giallo. It adds some things, like a dispatch queue, that are not currently in giallo. Giallo also places the callbacks and the demultiplexer interface into connection/connector/acceptor classes, providing (IMHO) a clean separation of responsibilities: the socket class wraps the native socket functions into a common interface, providing platform-independent error messages; the native select and completion ports are also wrapped as objects, which the proactor and reactor classes use (via a bridge class specified as a policy) to provide a consistent demultiplexer interface; and the connection/connector/acceptor classes manage the state (and lifetime) of the underlying socket, use a demultiplexer (proactor or reactor, interchangeably), and provide the user with completion callbacks.
participants (6)
- Aaron W. LaFramboise
- Carlo Wood
- Christopher Kohlhoff
- Darryl Green
- Hugo Duncan
- Tommy