
Sean Kelly <sean@ffwd.cx> writes:
Jeremy Maitin-Shepard wrote:
It seems that network sockets are the most common use case for nonblocking I/O multiplexing (i.e. the reactor pattern). If a library restricts itself to those alone, an efficient portable implementation may be possible.
I would say restrict it to socket and file i/o and expect that it would be used almost exclusively for socket i/o.
As far as I know, it is not possible to poll file descriptors (HANDLEs) using the Windows API, either through something like WSAAsyncSelect (i.e. sending messages to a Windows message queue) or through WSAEventSelect (waiting on an event, which is not a great solution since it requires use of WaitForMultipleObjects (WFMO), which is limited to 64 events per call). This can't even be solved by using multiple threads; I believe that polling of file handles is simply not supported on Windows. Asynchronous completion of socket read operations and file read operations can indeed be unified, but I would not suggest that a Boost library leave out polling.
On the other hand, I believe it would be possible to create a portable library which utilized either WSAAsyncSelect or select on Windows platforms, kqueue on BSD platforms, epoll on Linux kernels that support it, and poll on other POSIX platforms. (On Solaris, I believe it would be possible to utilize /dev/poll.)
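To illustrate the kind of backend selection described above, here is a minimal sketch in Python rather than C++, purely because its standard library already does this. `selectors.DefaultSelector` resolves to epoll, kqueue, /dev/poll, poll, or select depending on what the platform offers. The helper name `wait_readable` is made up for this example and is not part of any proposed Boost interface.

```python
import selectors
import socket

def wait_readable(socks, timeout=1.0):
    """Return the sockets in `socks` that have data available to read."""
    # DefaultSelector picks the most efficient readiness mechanism
    # the current platform supports (select on Windows).
    sel = selectors.DefaultSelector()
    try:
        for s in socks:
            sel.register(s, selectors.EVENT_READ)
        return [key.fileobj for key, _ in sel.select(timeout)]
    finally:
        sel.close()

# Two connected pairs; only the first one has pending data.
a, b = socket.socketpair()
c, d = socket.socketpair()
b.sendall(b"ping")

ready = wait_readable([a, c])
payload = a.recv(16) if a in ready else b""
backend = type(selectors.DefaultSelector()).__name__
print(backend, len(ready), payload)

for s in (a, b, c, d):
    s.close()
```

The caller never names a backend, which is roughly the property a portable polling library would want to offer.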
I'd like to focus on completion routines or methods that could mimic them. IOCP in Windows, /dev/poll in Solaris and either /dev/poll or epoll in Linux, perhaps kqueue in BSD? Would all of those serve? My experience is mostly with IOCP but there's a Dr. Dobb's article this month that uses /dev/poll in the same model.
AFAIK, neither /dev/poll nor epoll provides functionality similar to completion ports. /dev/poll, epoll and kqueue poll a file descriptor for _available_ data to read, _available_ buffer space to write, _available_ out-of-band data to read, in the case of a connecting socket, the fact that it has connected, and in the case of a listening socket, an _available_ new connection to accept. Windows completion ports, in contrast, allow you to determine when a particular write operation or read operation has been completed. Thus, this is a fundamentally different model. Windows completion ports are similar to using the aio_* family of functions on POSIX platforms, but on POSIX platforms, it is generally more convenient and efficient to use polling. The nature of the aio_* functions is that they do not scale efficiently to waiting on a large number of concurrent operations. Specifically, it is often necessary to check the status of each operation after the aio_suspend function returns. Systems like epoll, /dev/poll, and kqueue scale very well to a large number of file descriptors. On Windows platforms, it seems that systems like WSAAsyncSelect, WSAEventSelect, select, and WFMO do not scale very well to a large number of socket descriptors/handles. The fact that Windows only scales efficiently using one model, while UNIX platforms only scale efficiently using the other model, is an added and particularly problematic issue in writing a portable library. Avoiding this would require a very substantial amount of abstraction, and I would argue that mandating that layer of abstraction on all users would make the library less flexible for certain tasks.
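To make the difference between the two models concrete, here is a rough Python sketch of the abstraction layer mentioned above: completion-style reads emulated on top of a readiness poller. The class `CompletionEmulator` and its methods `async_read` and `run_once` are hypothetical names invented for this example. With a real completion port the operating system performs the read into the caller's buffer; here the emulator has to perform the read itself once the poller reports that data is available.

```python
import selectors
import socket

class CompletionEmulator:
    """Hypothetical helper: one-shot completion-style reads over a readiness poller."""

    def __init__(self):
        self._sel = selectors.DefaultSelector()

    def async_read(self, sock, nbytes, on_complete):
        """Start a read; on_complete(data) is called once it has been performed."""
        sock.setblocking(False)
        self._sel.register(sock, selectors.EVENT_READ, (nbytes, on_complete))

    def run_once(self, timeout=1.0):
        """Wait for readiness, perform the pending reads, report them as completions."""
        completed = 0
        for key, _ in self._sel.select(timeout):
            nbytes, on_complete = key.data
            # Readiness model: the poller only says data is available,
            # so the read itself still has to be issued here.
            data = key.fileobj.recv(nbytes)
            # One-shot, like a single outstanding asynchronous read.
            self._sel.unregister(key.fileobj)
            # Completion model: the caller only ever sees the finished result.
            on_complete(data)
            completed += 1
        return completed

    def close(self):
        self._sel.close()

a, b = socket.socketpair()
results = []
emu = CompletionEmulator()
emu.async_read(a, 1024, results.append)   # request first ...
b.sendall(b"hello")                       # ... data arrives later
n_completed = emu.run_once()
print(n_completed, results)

emu.close()
a.close()
b.close()
```

Even this toy version forces every user through a callback and an extra registration step, which hints at the flexibility cost of making such a layer mandatory.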
[snip]
Windows has asynchronous listen but it's kind of annoying to use. I've always used one or more threads for the task; overhead is minimal since they're basically always blocking. Beyond that, I think the i/o layer should be a thread pool that does i/o processing exclusively. Since it's a thread pool, some synchronization is already required, and extending execution into the rest of the application code seems like an invitation to trouble.
I believe it would be possible to create a portable library which used either asynchronous operations or polling, and automatically accepted all incoming connections, read all available data, and provided a send buffer which would get sent automatically. This could all be done using only a single thread. The user could specify certain limits on the number of connections to accept, and the size of the read and write buffers. I do not think it is possible to create a significantly lower level interface which is also portable and efficient.
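As a rough illustration of that higher-level interface, here is a single-threaded Python sketch that accepts connections, reads available data, and flushes a send buffer automatically. Everything named here (`BufferedReactor`, `poll`, `send`, `inbox`, `outbox`, and the two limit parameters) is hypothetical. It uses polling only, and the read-buffer limit is left out to keep the example short.

```python
import selectors
import socket

class BufferedReactor:
    """Hypothetical reactor: auto-accept, auto-read, auto-flush, one thread."""

    def __init__(self, max_connections=64, max_send_buffer=65536):
        self.max_connections = max_connections
        self.max_send_buffer = max_send_buffer
        self.inbox = {}    # connection -> bytearray of data received so far
        self.outbox = {}   # connection -> bytearray still waiting to be sent
        self.sel = selectors.DefaultSelector()
        self.listener = socket.socket()
        self.listener.bind(("127.0.0.1", 0))   # port 0: the OS picks a free port
        self.listener.listen()
        self.listener.setblocking(False)
        self.sel.register(self.listener, selectors.EVENT_READ)

    def send(self, conn, data):
        """Queue data; it is written whenever the socket has buffer space."""
        if len(self.outbox[conn]) + len(data) > self.max_send_buffer:
            raise BufferError("send buffer limit exceeded")
        self.outbox[conn] += data
        self.sel.modify(conn, selectors.EVENT_READ | selectors.EVENT_WRITE)

    def poll(self, timeout=0.1):
        """Run one iteration of the event loop."""
        for key, events in self.sel.select(timeout):
            sock = key.fileobj
            if sock is self.listener:
                self._accept()
                continue
            if events & selectors.EVENT_READ:
                self._read(sock)
            # _read may have dropped the connection, so check again.
            if events & selectors.EVENT_WRITE and sock in self.outbox:
                self._flush(sock)

    def _accept(self):
        conn, _ = self.listener.accept()
        if len(self.inbox) >= self.max_connections:
            conn.close()   # over the user-specified connection limit
            return
        conn.setblocking(False)
        self.inbox[conn] = bytearray()
        self.outbox[conn] = bytearray()
        self.sel.register(conn, selectors.EVENT_READ)

    def _read(self, sock):
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            return
        except ConnectionError:
            data = b""
        if data:
            self.inbox[sock] += data
        else:              # peer closed or reset the connection
            self.sel.unregister(sock)
            sock.close()
            del self.inbox[sock], self.outbox[sock]

    def _flush(self, sock):
        buf = self.outbox[sock]
        try:
            sent = sock.send(buf)
        except BlockingIOError:
            return
        del buf[:sent]
        if not buf:        # nothing left: stop asking for writability
            self.sel.modify(sock, selectors.EVENT_READ)

reactor = BufferedReactor(max_connections=2)
client = socket.create_connection(reactor.listener.getsockname())
client.sendall(b"hello")
for _ in range(20):        # a few iterations: accept first, then read
    reactor.poll()
    if any(reactor.inbox.values()):
        break
(conn,) = reactor.inbox
received = bytes(reactor.inbox[conn])
reactor.send(conn, b"world")
reactor.poll()
client.settimeout(2.0)
reply = client.recv(16)
print(received, reply)
```

The application only touches `inbox`, `send`, and `poll`, so the same surface could in principle sit on top of completion ports on Windows, with the library issuing the reads and writes itself.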
[snip]
-- Jeremy Maitin-Shepard