[Iostreams] Buffering nonblocking I/O
I find myself wanting to extend Boost.Iostreams to support (at least a particular use case for) nonblocking I/O. I'd like to sketch a couple of concepts, hoping you will improve and generalize the notion. In particular, it's entirely possible that I don't yet understand Iostreams well enough: what I'm describing may well fit into existing Concepts better than I realize. Boost.Iostreams intends to provide more complete support for nonblocking I/O in future [0]. We can hope that a discussion around these ideas could help evolve the library in that direction.

First I want to acknowledge Alexander Nasonov's work from 2003 [1]. I haven't used or even looked at his "iostream-like pipes" library [2] because I'm dubious about registering with Yahoo! before being allowed to download it. (John Torjo seems to have had some trouble too. [3]) If it were posted somewhere more freely accessible, I'd be interested to examine it, because I think there's at least conceptual overlap.

Let's postulate a new Boost.Iostreams Concept to describe an object capable of buffering data between a producer and a consumer. It's not quite a Filter because, as I understand it, control is passed to a Filter at only one end. An InputFilter receives control at the consumer end; it forwards the call to its upstream Source. Conversely, an OutputFilter receives control at the producer end, forwarding the call to its downstream Sink. I'm talking about an object that receives control at both the producer and the consumer end, providing a buffer between them.

My first thought was to call this concept a Pipe because it bears a strong conceptual resemblance to OS pipes. But others in the aforementioned mail thread ([4], [5]) point out that using the word "pipe" in connection with Iostreams (or C++ streams in general) is too easily misinterpreted: one tends to assume a stream interface to a real OS pipe. Other plausible names are "buffer," "synchronization channel" [4] or "message queue." Talking about a "buffer stream" versus a "streambuf" could fairly quickly get confusing. I shy away from any form of "channel" simply because, in our own code base, the word "channel" is already overloaded to mean four or five vastly different things. Of course "message queue" already has another specific meaning too. [6] Until we can settle on a better name, as a placeholder, let's just call this a Queue. Naturally the producer end should model Sink, and the consumer end should model Source.

Why elevate this to a Concept? Why not just provide a class? Because different use cases suggest different implementations -- all of which could be plug-compatible. To the extent I understand it, Alexander's library [1] appears to be targeted at cross-thread message passing. It implicitly handles thread synchronization. He also mentions unlimited vs. limited capacity, that is, a bounded maximum size: when the buffer is full, a producer will block until a consumer has eaten some of the pending data. Unless I went off the rails somewhere, this would be a valid model of the Queue concept.

My use case targets an interactive program built around an event loop. I'm not passing data between different threads, so thread synchronization would be unnecessary overhead. Instead, on every iteration of the main loop I intend to poll a nonblocking source and a nonblocking sink. I want to use a couple of such Queue objects to buffer the data.
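To make the shape of this concrete, here is a rough, minimal sketch of the kind of thing I mean for the single-threaded, unbounded case. The names queue_buffer, queue_sink and queue_source are placeholders I just made up, not existing Boost.Iostreams components: the producer end models Sink, the consumer end models Source, and both share a plain character buffer.

    #include <boost/iostreams/categories.hpp>  // sink_tag, source_tag
    #include <algorithm>
    #include <deque>
    #include <ios>        // std::streamsize
    #include <memory>

    // Shared state: the pending characters plus an end-of-stream flag.
    struct queue_buffer
    {
        std::deque<char> data;
        bool closed = false;
    };

    // Producer end: a Device modeling Sink.
    class queue_sink
    {
    public:
        typedef char char_type;
        typedef boost::iostreams::sink_tag category;

        explicit queue_sink(std::shared_ptr<queue_buffer> q) : q_(q) {}

        std::streamsize write(const char* s, std::streamsize n)
        {
            q_->data.insert(q_->data.end(), s, s + n);  // unbounded: accept everything
            return n;
        }

    private:
        std::shared_ptr<queue_buffer> q_;
    };

    // Consumer end: a Device modeling Source.
    class queue_source
    {
    public:
        typedef char char_type;
        typedef boost::iostreams::source_tag category;

        explicit queue_source(std::shared_ptr<queue_buffer> q) : q_(q) {}

        std::streamsize read(char* s, std::streamsize n)
        {
            if (q_->data.empty())
                return q_->closed ? -1 : 0;  // -1: end of stream; 0: nothing yet
            std::streamsize take = std::min<std::streamsize>(n, q_->data.size());
            std::copy_n(q_->data.begin(), take, s);
            q_->data.erase(q_->data.begin(), q_->data.begin() + take);
            return take;
        }

    private:
        std::shared_ptr<queue_buffer> q_;
    };

Returning 0 from read is meant in the Iostreams nonblocking sense of "no data available just now"; a bounded or thread-synchronized Queue would change the buffer and write, not these two device interfaces.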
The application logic must be able to write data to a std::ostream. (If the producer end of a Queue models Sink, it should be straightforward to construct such an ostream based on the Queue.) Because the Sink underlying an ostream operation must model Blocking, I actually need an unbounded Queue for this. Then each iteration of the main loop will attempt to write all pending data to the nonblocking sink; data actually written can be consumed from the Queue. Unwritten data will be handled by subsequent iterations.

The input side is similar. Each iteration of the main loop will attempt to fill a temporary buffer from the nonblocking source; the data it obtains will be put into the other Queue. (This can be a bounded Queue: given a way to determine how much buffer space remains available, we can limit the size of the nonblocking read.) Eventually there will be "enough" data for the application logic to read from an istream attached to the consumer side of the Queue. That way we can guarantee that the application-level istream read operation can be completely satisfied without blocking. Determining "enough" is of course protocol-specific, as is the means of notifying the application logic. I would want such details to belong to a specific instance of the concept, perhaps a subclass of a provided base class, rather than to the Queue concept itself.

I've been speaking of a nonblocking "source" and "sink" (in lowercase) because, for me, these are actually APR I/O functions. But it occurs to me that we could provide adapters to existing Iostreams Source and Sink objects, using the Iostreams conventions for nonblocking I/O. One need only arrange to poll the adapters periodically. It would also be possible to implement a Queue based on an actual OS pipe.
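To show roughly how I imagine the output side fitting together, here is a hypothetical sketch of the main loop, building on the queue_sink/queue_source sketch above. nonblocking_write is a stand-in for the APR-level (or other OS-level) nonblocking write, not a real API; here it just dumps everything to stdout so the sketch runs.

    #include <boost/iostreams/stream.hpp>
    #include <cstdio>
    #include <memory>
    #include <ostream>
    #include <vector>

    // Stand-in for a nonblocking write: returns however many bytes the sink
    // accepted this time around.
    std::streamsize nonblocking_write(const char* s, std::streamsize n)
    {
        return static_cast<std::streamsize>(
            std::fwrite(s, 1, static_cast<std::size_t>(n), stdout));
    }

    int main()
    {
        std::shared_ptr<queue_buffer> out_buf = std::make_shared<queue_buffer>();

        // The application writes to a std::ostream backed by the Queue's producer end.
        queue_sink producer(out_buf);
        boost::iostreams::stream<queue_sink> app_out(producer);

        for (int i = 0; i < 3; ++i)   // stand-in for the real event loop
        {
            // ... application logic runs and writes whatever it likes ...
            app_out << "iteration " << i << '\n' << std::flush;

            // Drain as much pending data as the nonblocking sink will take;
            // whatever it refuses stays queued for later iterations.
            if (!out_buf->data.empty())
            {
                std::vector<char> chunk(out_buf->data.begin(), out_buf->data.end());
                std::streamsize written = nonblocking_write(chunk.data(), chunk.size());
                out_buf->data.erase(out_buf->data.begin(), out_buf->data.begin() + written);
            }

            // The input side would mirror this: read what the nonblocking source
            // offers into a second (bounded) Queue, and notify the application
            // once "enough" has accumulated to satisfy an istream read.
        }
    }

This is only meant to illustrate the shape of the thing; the real loop would poll with APR, handle short and failed writes, and run the mirrored input side as well.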
Does this avenue seem worth pursuing? I'm going to build something like this anyway, but (a) I hope you can help me improve the abstractions and (b) it may be worth trying to Boostify the code for possible contribution.

[0] http://www.boost.org/doc/libs/1_48_0/libs/iostreams/doc/guide/asynchronous.h...
[1] http://lists.boost.org/Archives/boost/2003/08/51289.php
[2] http://groups.yahoo.com/group/boost/files/pipes.zip
[3] http://lists.boost.org/Archives/boost/2003/08/51310.php
[4] http://lists.boost.org/Archives/boost/2003/08/51405.php
[5] http://lists.boost.org/Archives/boost/2003/08/51448.php
[6] http://www.amqp.org/
Nat Linden