Re: [boost] IOStreams Formal review -- Guide for Reviewers

First, let me apologize for not being able to review the actual code (yet). The interface and correctness/performance of implementation are all I really care about for now :)
There are some special cases where the copying is wasteful, though. For instance:
1. If filter1 is just a passive observer, simply counting the number of occurrences of '\n', or copying the data to a logging stream, then the end user should really be writing directly to buf2. Filter1 could process the data using the same buffer as filter2.
2. Same as above, but filter1 also modifies the data in-place, making character-by-character modifications. (E.g., a toupper filter). This can be handled the same way.
3. If 'resource' is a stream or stream buffer, it could be assumed to do its own buffering. In that case, filter2 should be writing directly to resource, instead of to streambuf3:
    filter2.write(resource, buf2, buf2 + n2)
These three cases can be handled easily by modifying the existing framework. I didn't add special treatment because it occurred to me rather late in development.
I'd certainly feel a bit more proud of the library if it handled these cases (1 and 3 seem most important). It seems well worth a few days' delay.
There is another class of cases in which the current setup is wasteful, but I think it is rather domain-specific:
4. Most filters in the chain modify only small parts of character sequences, leaving big chunks unchanged.
Well, basic_newline_filter would do this - at least when replacing CRLF with a single '\n' character.
To optimize case 4 would require a major library extension. My feeling is that it is not necessary at this point, but I'd like to know what others think.
It should certainly wait if you don't have a design already in mind. The library is good enough without this.
-----------------------------------------------------
Part III: Interface questions:
1. How to handle read and write requests which return fewer characters than requested, though there has been no error, and EOF has not been reached. I think some answer to this question is necessary to allow the library to be extended later to handle models other than ordinary blocking i/o. I mention three possibilities here, http://tinyurl.com/6r8p2, but only two are realistic. I'm interested to know how important people think this issue is, and what is the best way to resolve it.
I think #2 would be most in line with what people are used to under Posix (-1/EAGAIN). Blocking (option 1) actually doesn't make any sense at all except at the ends (source/sink), unless you put each filter in a chain into its own thread and use something like semaphores. I suppose the idea is: if you're in the middle of a filter chain, and somebody gives you some input which would overflow your buffer, you attempt to empty out your buffer to the next guy; but if he can't take enough of it to allow you to accept the whole input (or even any of it), you have to tell the guy who sent it to you that you can't take it, and he has to hold onto it in his buffer. That seems reasonable. I might in fact prefer that my source or sink resource act like #1 (block until at least one character or EOF), and I assume this would be the default behavior if I open it in the default, blocking mode, but it isn't possible to have filters act that way; they need to pass the "can't take your data" feedback all the way back through the stack to the end user, who then needs to hold onto it and select/spin/whatever on the underlying sink resource ...
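(To make that backpressure scheme concrete, here is a minimal sketch; the names toy_sink, buffering_filter and write_some are invented for illustration and are not the library's interface. A write returns how many characters were accepted, and the caller keeps the rest:)

    // Sketch of the "can't take your data" feedback idea; names invented.
    #include <algorithm>
    #include <cstddef>
    #include <deque>
    #include <vector>

    // Toy non-blocking sink: accepts at most 'room' characters per call.
    struct toy_sink {
        std::size_t room;
        std::deque<char> data;
        explicit toy_sink(std::size_t r) : room(r) {}
        std::size_t write_some(const char* s, std::size_t n)
        {
            std::size_t k = std::min(n, room);
            data.insert(data.end(), s, s + k);
            return k;                  // a return of 0 plays the role of EAGAIN
        }
    };

    // A filter holding a backlog of output it produced earlier (e.g. by
    // expansion) that its downstream refused. It reports to its own caller
    // how much of the new input it could take; the caller keeps the rest.
    struct buffering_filter {
        std::vector<char> pending;     // output the downstream couldn't take yet

        std::size_t write_some(toy_sink& snk, const char* s, std::size_t n)
        {
            if (!pending.empty()) {    // drain the backlog first
                std::size_t k = snk.write_some(&pending[0], pending.size());
                pending.erase(pending.begin(), pending.begin() + k);
                if (!pending.empty())
                    return 0;          // still blocked: "can't take your data"
            }
            return snk.write_some(s, n); // may be < n; caller retries from s + result
        }
    };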
2. The stack interface. Is the interface to the underlying filter chains rich enough? Originally it was similar to std::list, so that you could disconnect chains at arbitrary points, store them, and reattach them later. I decided there wasn't much use for this, so I simplified the interface.
I'm sure someone, somewhere, sometime will want to perform splices/appends on filter chains, but I can't imagine why, either. At least, keep the simple interface and put the more complicated, flexible one in the appendix.
3. Exceptions. James Kanze has argued repeatedly that protected stream buffer functions should not throw exceptions (http://tinyurl.com/5o34x). I try to make the case for exceptions here: http://tinyurl.com/6r8p2. What do people think?
I sympathize with both arguments; either way seems fine to me. There is no real performance penalty for an exception that is thrown at most once per stream (EOF), but he's right that the existing interface (which end users never see) seems to specify a return value in that case. But, as you say, if you want to support async IO, it's moot - you do have to return the number of characters successfully read/written, so you need the std::streamsize return type, so ... you may as well return EOF instead of throwing it. That is, I see no reason to throw an EOF value if you support async IO (which I think would be lovely).
So the question is: Should an open() function be added to the closable interface, to eliminate the need for first-time switches? Alternatively, should there be a separate Openable concept?
Without a doubt, an Openable concept. If you add open() to Closeable you'd really want to change the concept name ;) For example, if I implement a first-time flag (no real hardship), I'll have to remember to add the first-time test not only when data is processed, but also when the stream is closed (need to handle empty input properly). I suspect people will forget this at least once. There's also a minor performance gain: the interface would be called when the stream is initialized, I assume, and not require a first-time flag check with each use. Admittedly, inconsequential. -Jonathan Graehl

"Jonathan Graehl" <jonathan@graehl.org> wrote in message news:413E1101.40404@graehl.org...
First, let me apologize for not being able to review the actual code (yet). The interface and correctness/performance of implementation are all I really care about for now :)
No apology necessary.
There are some special cases where the copying is wasteful, though. For instance:
1. If filter1 is just a passive observer, simply counting the number of occurrences of '\n', or copying the data to a logging stream, then the end user should really be writing directly to buf2. Filter1 could process the data using the same buffer as filter2.
2. Same as above, but filter1 also modifies the data in-place, making character-by-character modifications. (E.g., a toupper filter). This can be handled the same way.
3. If 'resource' is a stream or stream buffer, it could be assumed to do its own buffering. In that case, filter2 should be writing directly to resource, instead of to streambuf3:
I'd certainly feel a bit more proud of the library if it handled these cases (1 and 3 seem most important). It seems well worth a few days' delay.
I plan to add this functionality, if the library is accepted. (3) is easy. I'm going to change the name of the 'Buffered' concept to 'MultiCharacter', and use 'Buffered' to indicate that a component has its own buffer. Streams and stream buffers will be models of Buffered by default. I believe this will involve just a few lines of code. Even if (1) is more important than (2), I think (2) subsumes (1) and involves about the same amount of work. Let's call such filters 'in-place' filters. If in-place filters are added to a chain one at a time, their static type is lost, so making them work will require a certain (small) amount of runtime indirection. Using the van Winkel/van Krieken pipe notation mentioned by Dietmar Kuehl:

    filtering_ostream out(tee(cout) | line_counter() | to_upper() | file("log.txt"));

the in-place filters tee, line_counter and to_upper can be fused together at compile-time. (Another proof that this notation is not just syntactic sugar.)
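(For illustration only, an 'in-place' filter as discussed above might expose something like the following; the names to_upper_inplace, line_counter_inplace and transform are invented, not the library's:)

    // Sketch: filters that work within a single shared buffer, so a chain of
    // them needs no copy into a second buffer.
    #include <cctype>

    struct to_upper_inplace {
        // Modify [first, last) in place, character by character (case 2).
        void transform(char* first, char* last)
        {
            for (; first != last; ++first)
                *first = (char) std::toupper((unsigned char) *first);
        }
    };

    struct line_counter_inplace {
        long lines;
        line_counter_inplace() : lines(0) {}
        // Purely observing (case 1): reads the buffer, changes nothing.
        void transform(char* first, char* last)
        {
            for (; first != last; ++first)
                if (*first == '\n') ++lines;
        }
    };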
There is another class of cases in which the current setup is wasteful, but I think it is rather domain-specific:
4. Most filters in the chain modify only small parts of character sequences, leaving big chunks unchanged.
Well, basic_newline_filter would do this - at least when replacing CRLF with a single '\n' character.
I think the real optimizations are possible only when a filter can tell that it doesn't need to modify a block just by reading the header information. A newline_filter has to scan the whole text to determine if changes need to be made.
To optimize case 4 would require a major library extension. My feeling is that it is not necessary at this point, but I'd like to know what others think.
It should certainly wait if you don't have a design already in mind. The library is good enough without this.
Thanks.
Part III: Interface questions:
1. How to handle read and write requests which return fewer characters than requested, though there has been no error, and EOF has not been reached. I think some answer to this question is necessary to allow the library to be extended later to handle models other than ordinary blocking i/o. I mention three possibilities here, http://tinyurl.com/6r8p2, but only two are realistic. I'm interested to know how important people think this issue is, and what is the best way to resolve it.
I think #2 would be most in line with what people are used to under Posix (-1/EAGAIN). Blocking (option 1) actually doesn't make any sense at all except at the ends (source/sink), unless you put each filter in a chain into its own thread and use something like semaphores. I suppose the idea is: if you're in the middle of a filter chain, and somebody gives you some input which would overflow your buffer, you attempt to empty out your buffer to the next guy; but if he can't take enough of it to allow you to accept the whole input (or even any of it), you have to tell the guy who sent it to you that you can't take it, and he has to hold onto it in his buffer. That seems reasonable.
Okay. BTW, I noticed an error in proposal #2: having both an implicit conversion to char and a safe-bool conversion to test for eof and unavail is unworkable. To test whether a character is valid, I'll probably have to add an ordinary member function. Perhaps:

    template<typename Ch>
    struct basic_character {
        ...
        operator Ch() const;
        bool good() const;
        bool eof() const;
        bool fail() const;
    };

(Here I'm using 'fail' instead of 'unavail' or 'EAGAIN', but the main point is the addition of the member 'good()'.) Now, looking at the alphabetic_input_filter from the tutorial, instead of

    struct alphabetic_input_filter : public input_filter {
        template<typename Source>
        int get(Source& src)
        {
            int c;
            while ((c = boost::io::get(src)) != EOF && !isalpha(c))
                ;
            return c;
        }
    };

you'd write:

    struct alphabetic_input_filter : public input_filter {
        template<typename Source>
        character get(Source& src)
        {
            character c;
            while ((c = boost::io::get(src)).good() && !isalpha(c))
                ;
            return c;
        }
    };

Here, eof and fail values are passed on to the caller unchanged. If you want to send an eof or fail notification explicitly, you'd write return eof() or return fail(). Now the big question: is the above formulation too convoluted to teach to an average user who is interested only in plain, blocking i/o?
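(An illustrative sketch, not from the proposal: a filter that forwards the two 'not good' states untouched and transforms good characters. It assumes character is constructible from a char, which the proposal above doesn't spell out:)

    #include <cctype>

    struct toupper_input_filter : public input_filter {
        template<typename Source>
        character get(Source& src)
        {
            character c = boost::io::get(src);
            if (!c.good())
                return c;   // forward eof or fail upstream unchanged
            char up = (char) std::toupper((unsigned char)(char) c);
            return character(up);   // assumed converting constructor
        }
    };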
I might in fact prefer that my source or sink resource act like #1 (block until at least one character or EOF), and I assume this would be the default behavior if I open it in the default, blocking mode, but it isn't possible to have filters act that way; they need to pass the "can't take your data" feedback all the way back through the stack to the end user, who then needs to hold onto it and select/spin/whatever on the underlying sink resource ...
My idea was that the initial filter concepts need to be designed so that they can be used unchanged when the library is extended to handle other i/o models. For resources, it's easy enough simply to introduce new concepts: Non-Blocking Sink, Asynchronous Sink, etc.
Right. For now, I'm not worrying about what the proper abstraction will be to represent a chain of filters with, say, an Asynchronous Source at the end. It might be an 'async_istream', an ordinary filtering_istream which hides the asynchronous nature of the source, or some entirely new abstraction not related to the current standard i/o library. All I want to ensure is that filters written today will be usable in the future.
2. The stack interface. Is the interface to the underlying filter chains rich enough? Originally it was similar to std::list, so that you could disconnect chains at arbitrary points, store them, and reattach them later. I decided there wasn't much use for this, so I simplified the interface.
I'm sure someone, somewhere, sometime will want to perform splices/appends on filter chains, but I can't imagine why, either. At least, keep the simple interface and put the more complicated, flexible one in the appendix.
In that case, it's better to add it when someone actually needs it. It won't be hard since filter chains are still implemented as std::lists.
3. Exceptions. James Kanze has argued repeatedly that protected stream buffer functions should not throw exceptions (http://tinyurl.com/5o34x). I try to make the case for exceptions here: http://tinyurl.com/6r8p2. What do people think?
I sympathize with both arguments; either way seems fine to me. There is no real performance penalty for an exception that is thrown at most once per stream (EOF), but he's right that the existing interface (which end users never see) seems to specify a return value in that case.
Just to clarify, I agree with JK that exceptions should not be used to signal EOF. You need a return value to tell you how many characters were successfully read. (See http://tinyurl.com/3waf8, 'Exceptions'.)
But, as you say, if you want to support async IO, it's moot - you do have to return the number of characters successfully read/written, so you need the std::streamsize return type, so ... you may as well return EOF instead of throwing it. That is, I see no reason to throw an EOF value if you support async IO (which I think would be lovely).
So the question is: Should an open() function be added to the closable interface, to eliminate the need for first-time switches? Alternatively, should there be a separate Openable concept?
Without a doubt, an Openable concept. If you add open() to Closeable you'd really want to change the concept name ;)
Of course ;-) But I can't think of a good one.
For example, if I implement a first-time flag (no real hardship), I'll have to remember to add the first-time test not only when data is processed, but also when the stream is closed (need to handle empty input properly). I suspect people will forget this at least once.
Good point.
There's also a minor performance gain: the interface would be called when the stream is initialized, I assume, and not require a first-time flag check with each use. Admittedly, inconsequential.
Very minor. The filter/resource members are typically called by streambuf virtual functions.
-Jonathan Graehl
Thanks. Best Regards, Jonathan

    struct alphabetic_input_filter : public input_filter {
        template<typename Source>
        character get(Source& src)
        {
            character c;
            while ((c = boost::io::get(src)).good() && !isalpha(c))
                ;
            return c;
        }
    };
(boost::io::)character is just a type that wraps an int with a good() test?
Here, eof and fail values are passed on to the caller unchanged. If you want to send an eof or fail notification explicitly, you'd write return eof() or return fail().
Now the big question: is the above formulation too convoluted to teach to an average user who is interested only in plain, blocking i/o?
That seems fine. For such a user, this is just boilerplate code pasted from documentation examples.

If you really care to make filters written by naive users via a simple blocking interface applicable to more advanced nonblocking scenarios, you can design a generic adapter that turns a blocking filter into a nonblocking one (sketched below). That is, it would wrap both the upstream and downstream - I imagine with a dynamically growing buffer that accepts any single output the naive user wants to produce, flags when its downstream consumer doesn't accept the entire amount, and returns the EAGAIN equivalent to its upstream instead of calling the user's naive blocking method. This strategy would require a close notification, since the buffered unaccepted stuff would be left alone until the next write attempt (unless you want to spawn a thread to spin attempting to empty the buffer).

About the close() or open() methods for a filter that wants to write some prelude or coda (e.g. gzip): aren't these only necessary because you can't guarantee that the constructors and destructors for the filter stack are called in the proper order? It would be nice if source -> A -> B -> sink could guarantee that B is (finally) constructed after sink, and then A is constructed after having been linked to B. That is, aren't the filters passed by const reference and only copy constructed once into the filter stack? I guess the part about ensuring that A is linked to B before the user constructor code runs might be accomplished by inheritance (superclass constructors always complete before the subclass constructor executes, and the reverse for destructors?). I'm not sure if this is too clever, or can't be made to work portably, though.

I don't think a second, simpler interface would be that much of a win; the complexity of having two interfaces or types of filters would add as much confusion as it simplifies the blocking case.
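(Here is a rough sketch of that adapter idea on the output side; blocking_adapter, write_some and drained are invented names, not a real interface. The adapter never refuses the naive filter's output; it buffers whatever the real, non-blocking downstream won't take:)

    #include <cstddef>
    #include <vector>

    template<typename NonBlockingSink>  // must provide write_some(s, n) -> count
    class blocking_adapter {
        NonBlockingSink* down_;
        std::vector<char> backlog_;     // grows to hold any single output
    public:
        explicit blocking_adapter(NonBlockingSink& down) : down_(&down) {}

        // Called by the naive blocking filter; always accepts everything.
        void write(const char* s, std::size_t n)
        {
            backlog_.insert(backlog_.end(), s, s + n);
            drain();
        }

        // The close notification: true once everything has actually
        // reached the downstream sink.
        bool drained()
        {
            drain();
            return backlog_.empty();
        }

    private:
        void drain()   // opportunistically push the backlog downstream
        {
            if (backlog_.empty()) return;
            std::size_t k = down_->write_some(&backlog_[0], backlog_.size());
            backlog_.erase(backlog_.begin(), backlog_.begin() + k);
        }
    };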

"Jonathan Graehl" <jonathan@graehl.org> wrote in message news:413F7C5A.1030701@graehl.org...
    struct alphabetic_input_filter : public input_filter {
        template<typename Source>
        character get(Source& src)
        {
            character c;
            while ((c = boost::io::get(src)).good() && !isalpha(c))
                ;
            return c;
        }
    };
(boost::io::)character is just a type that wraps an int with a good() test?
int_type or optional<char> would be good enough for that. The important point is that there are two separate 'not good' states -- one means end-of-sequence and the other means no data is currently available.
Here, eof and fail values are passed on to the caller unchanged. If you want to send an eof or fail notification explicitly, you'd write return eof() or return fail().
Now the big question: is the above formulation too convoluted to teach to an average user who is interested only in plain, blocking i/o?
That seems fine. For such a user, this is just boilerplate code pasted from documentation examples.
What worries me is that users already know about char, int and EOF. To use character properly probably requires more than just copying from examples.
If you really care to make filters written by naive users via a simple blocking interface applicable to more advanced nonblocking scenarios, you can design a generic adapter that turns a blocking filter into a nonblocking one. That is, it would wrap both the upstream and downstream, I imagine with a dynamically growing buffer that will accept any single output the naive user wants to produce, but then flagging when its downstream consumer doesn't accept the entire amount, and returning the EAGAIN equivalent to its upstream instead of calling the user's naive blocking method. This strategy would require a close notification since the buffered unaccepted stuff would be left alone until the next write attempt (unless you want to spawn a thread to spin attempting to empty the buffer).
I hadn't thought of that. I'd much rather teach people how to write correct filters, so they can be used with maximum efficiency. There's also no good way to turn a blocking input filter into a non-blocking one -- unless you consider that every blocking input filter is sort of a degenerate non-blocking filter.
About the close() or open() methods for a filter that wants to write some prelude or coda (e.g. gzip): aren't these only necessary because you can't guarantee that the constructors and destructors for the filter stack are called in the proper order?
No -- filters should be reusable. Here's an example from a reply I wrote to Rob Stewart (it turned out not to be relevant to that discussion, but maybe it'll be relevant here ;-). "Jonathan Turkanis" <technews@kangaroologic.com> wrote in message news:chalsi$cmb$1@sea.gmane.org...
    struct zlib_ostream : filtering_ostream {
        zlib_ostream() { push(zlib_compressor()); }

        template<typename Source>
        zlib_ostream(const Source& src)
        {
            push(zlib_compressor());
            open(src);
        }

        template<typename Source>
        void open(const Source& src)
        {
            BOOST_STATIC_ASSERT(is_resource<Source>::value);
            push(src);
        }

        bool is_open() const { return is_complete(); }

        void close()
        {
            assert(is_open());
            pop();
        }
    };

    int main()
    {
        using namespace boost::io;
        zlib_ostream out;
        out.open(file_sink("hello_world"));
        out << "hello world!";
        out.close();
        out.open(file_sink("goodbye_world"));
        out << "goodbye world!";
    }

Only one zlib_compressor is constructed, but it is used several times.
It would be nice if source -> A -> B -> sink could guarantee that B is (finally) constructed after sink, and then A is constructed after having been linked to B. That is, aren't the filters passed by const reference and only copy constructed once into the filter stack? I guess the part about ensuring that A is linked to B before the user constructor code might be accomplished by inheritance (superclass constructors always complete before subclass constructor executes, and the reverse for destructors?) I'm not sure if this is too clever, or can't be made to work portably, though.
Actually, I can run the destructors in any order I want, since I'm using boost::optional<Filter> to avoid requiring that filters and resources be default constructible. So I can just do filter_ = none;
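(A minimal sketch of that boost::optional technique; chain_node and my_filter are invented for illustration:)

    #include <boost/none.hpp>
    #include <boost/optional.hpp>

    struct my_filter {
        explicit my_filter(int arg) {}   // deliberately not default constructible
    };

    struct chain_node {
        boost::optional<my_filter> filter_;              // empty until pushed
        void install(const my_filter& f) { filter_ = f; }
        void destroy_now() { filter_ = boost::none; }    // run the destructor
                                                         // at a chosen moment
    };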
I don't think a second, simpler interface would be that much of a win; the complexity of having two interfaces or types of filters would add as much confusion as it simplifies the blocking case.
I've lost you here. Which is the 'second, simpler interface' which you don't think is a good idea? Best Regards, Jonathan

I hadn't thought of that. I'd much rather teach people how to write correct filters, so they can be used with maximum efficiency. There's also no good way to turn a blocking input filter into a non-blocking one
The mechanism I was alluding to would work in general, requiring a dynamic buffer to handle the single largest output the dumb blocking filter wants to produce all at once. I agree that it would be best to make such a wrapper unnecessary.
About the close() or open() methods for a filter that wants to write some prelude or coda (e.g. gzip): aren't these only necessary because you can't guarantee that the constructors and destructors for the filter stack are called in the proper order?
No -- filters should be reusable. Here's an example from a reply I wrote to Rob Stewart (it turned out not to be relevant to that discussion, but maybe it'll be relevant here ;-).
    zlib_ostream out;
    out.open(file_sink("hello_world"));
    out << "hello world!";
    out.close();
    out.open(file_sink("goodbye_world"));
    out << "goodbye world!";
Only one zlib_compressor is constructed, but it is used several times.
OK. I understand your rationale - you think that constructing these filtered streams might be expensive, and that one might want to cache and reuse them for many files/network connections/etc. I guess you can repeatedly open() and close() fstreams that way, although I've never wanted to.
Actually, I can run the destructors in any order I want, since I'm using boost::optional<Filter> to avoid requiring that filters and resources be default constructible. So I can just do filter_ = none;
So what I suggested (having filters be use-once and able to emit prelude/postlude in their constructor/destructor) is technically possible, but you prefer to make them reusable and thus need Openable/Closeable concepts (or require all filters to implement some open() and close(), even an empty one).
I don't think a second, simpler interface would be that much of a win;
I've lost you here. Which is the 'second, simpler interface' which you don't think is a good idea?
I meant the simpler (current) "blocking-only" filter interface that needs a wrapper to handle sinks that consume less than they're given without actually failing/EOFing (only nonblocking sinks, really).

"Jonathan Graehl" <jonathan@graehl.org> wrote in message news:413FE305.3060003@graehl.org...
I hadn't thought of that. I'd much rather teach people how to write correct filters, so they can be used with maximum efficiency. There's also no good way to turn a blocking input filter into a non-blocking one
The mechanism I was alluding to would work in general, requiring a dynamic buffer to handle the single largest output the dumb blocking filter wants to produce all at once. I agree that it would be best to make such a wrapper unnecessary.
Suppose I have an input filter f which expects that each time it invokes get() on the component immediately downstream it will block until the next character is available or the end of the stream is reached. Suppose that g is immediately upstream from f and wants to call f.get(). Supposing that no input is going to be available for a long time, how can you turn that into a non-blocking call?
About the close() or open() methods for a filter that wants to write some prelude or coda (e.g. gzip): aren't these only necessary because you can't guarantee that the constructors and destructors for the filter stack are called in the proper order?
No -- filters should be reusable. Here's an example from a reply I wrote to Rob Stewart (it turned out not to be relevant to that discussion, but maybe it'll be relevant here ;-).
OK. I understand your rationale - you think that constructing these filtered streams might be expensive,
Yes
and that one might want to cache and reuse them for many files/network connections/etc. I guess you can repeatedly open() and close() fstreams that way, although I've never wanted to.
I want to provide the same functionality that the standard library provides. People would complain if you couldn't close and reopen an fstream, don't you think?
I don't think a second, simpler interface would be that much of a win;
I've lost you here. Which is the 'second, simpler interface' which you don't think is a good idea?
I meant the simpler (current) "blocking-only" filter interface that needs a wrapper to handle sinks that consume less than they're given without actually failing/EOFing (only nonblocking sinks, really).
At the moment, I don't like the idea of wrapping blocking filters to make them non-blocking, partly because it sacrifices efficiency and partly because I can't see how to do it for filtered input. So if there were a simpler blocking interface and a more complex non-blocking interface, my preference would be to say that filters which only support the simpler interface simply can't be used in non-blocking chains. I'm hoping that the non-blocking interface can be made sufficiently simple. I'm glad to hear that you think so.

(One cruel way to force programmers to write filters which can be used in non-blocking chains is to abolish the simple 'non-buffered' filter concepts, and make everyone use the interface

    struct my_filter : input_filter {
        template<typename Source>
        std::streamsize read(Source& src, char* s, std::streamsize n);
    };

Here the interpretation of the return value is clear: return the number of characters read, or -1 for EOF. The trouble is, there are some simple filters that are very hard to express with the above interface.)

Best Regards, Jonathan
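(For illustration, a complete filter against that buffered interface might look as follows. The body is invented; boost::io::read is assumed by analogy with the boost::io::get used in the tutorial examples, with 0 meaning "none available yet" and -1 meaning EOF:)

    #include <cctype>
    #include <ios>      // std::streamsize

    struct toupper_filter : input_filter {
        template<typename Source>
        std::streamsize read(Source& src, char* s, std::streamsize n)
        {
            std::streamsize amt = boost::io::read(src, s, n);
            if (amt <= 0)
                return amt;               // 0 = try again later, -1 = EOF
            for (std::streamsize i = 0; i < amt; ++i)
                s[i] = (char) std::toupper((unsigned char) s[i]);
            return amt;                   // may be fewer than requested
        }
    };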

I think we can agree to the following:
- Openable and Closeable interfaces. (Nasty spelling - like useable/usable, both closeable and closable are widely used; the same probably applies to localizeable/localizable, and probably localisable too in England.)
- Either all filters are nonblocking filters, or some filters are blocking and can't be attached to nonblocking sources/sinks. (Is there an easy way to make the boilerplate for the non-bufferable type simple enough that copy/paste isn't needed? Obviously the preprocessor can always be used, but it isn't self-documenting.)
Personally, I favor eliminating blocking filters; they aren't that much simpler IMO. I guess we're more or less talking about cooperative vs. preemptive multitasking (assuming you're forced to use a thread for each blocking source/sink if you want to use simple blocking filters, rather than nonblocking/multiplexed I/O with nonblocking filters).
Suppose I have an input filter f which expects that each time it invokes get() on the component immediately downstream it will block until the next character is available or the end of the stream is reached. Suppose that g is immediately upstream from f and wants to call f.get(). Supposing that no input is going to be available for a long time, how can you turn that into a non-blocking call?
This is tricky. I believe I already described the method for a blocking output filter, and the same strategy *almost* works for a blocking input filter as well. Consider source -> A -> B, where A is a blocking filter and source is a nonblocking source (it can return EAGAIN when you get/read). B is either the user's code, or another nonblocking filter. Now build source -> inWrapper -> A -> outWrapper -> B: inWrapper communicates secretly with outWrapper, bypassing A entirely and just returning EAGAIN to B whenever inWrapper wouldn't have anything to give A.

But we can't predict ahead of time how much input A may consume in order to generate a single output character in response to get(), so the method of outWrapper bypassing A when inWrapper doesn't have enough for A actually can't work. We could still create a thread somewhere and patch things up, but let's just say that it's too complicated for now - besides, you can just use blocking stuff and put the whole process that reads from source into its own thread. Threads+blocking is easier than asynchronous/cooperative anyway, so such a simulation would be of dubious value.

This discussion has caused me to seriously question the need for a simple InputFilter interface, or even a "pull" InputFilter at all. The Symmetric Filter idea seems most appropriate to me (and sufficiently general, if you allow arbitrary iterator types instead of just char*). Since we now allow boost::read to return fewer characters than requested (even 0) without meaning EOF/error, if you really only want to output one character and then return, you can conform to the MultiCharacter (Buffered) interface but ignore n, always returning at most one character. Additionally, except for the simplest filters, you still have to deal with read requests that are smaller than what you would prefer to output, and due to the lack of continuations/coroutines in the language, you'll have to reify your current state and resume the next time somebody tries to read you.

I suppose that an InputFilter really does allow you to write your parsing/recognition code in buffer-oblivious form (that is, you keep reading until you get a whole block of compressed data, or until you completely match your pattern), and you can reify your output state by simply buffering a whole chunk of output internally. But what's the problem with buffering input if you need to? You already have to potentially buffer output.

There is a pretty well known stream/message pipeline mechanism: linked lists of buffer chunks allocated by the producer, passed by reference/pointer to the consumer, and freed when totally consumed (you could allow a transfer-ownership operation, which would be nice for in-place transforms, but you don't really need one). Think message passing/queuing, but ignoring message boundaries and interpreting the data as a stream. Fixed-size buffer chunks (that may be partially full) are usually called "mbufs", and a list of them is an mbuf chain. The only complexity involved in processing chains is that you have to either write your filter so that it operates character by character (a convenience iterator could step through the chain), or reassemble the mbuf fragments into one contiguous buffer for reading. You can adopt more complex logic that works directly on a sequence of mbufs (for instance, Posix supports writing/reading files/sockets to/from a sequence of noncontiguous buffers - called "scatter/gather I/O"). You can imagine variations where you also allow circular buffering rather than one-pass use of mbufs.
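(A bare-bones sketch of such an mbuf chain, invented for illustration rather than taken from any library: fixed-size chunks, linked, filled by the producer and freed once fully consumed:)

    #include <cstddef>

    struct mbuf {
        enum { capacity = 2048 };
        char        data[capacity];
        std::size_t begin;   // offset of the first unconsumed byte
        std::size_t end;     // one past the last valid byte (may be < capacity)
        mbuf*       next;
        mbuf() : begin(0), end(0), next(0) {}
    };

    struct mbuf_chain {
        mbuf* head;
        mbuf* tail;
        mbuf_chain() : head(0), tail(0) {}

        // Consumer side: pop fully consumed chunks off the front and free them.
        void release_consumed()
        {
            while (head && head->begin == head->end) {
                mbuf* dead = head;
                head = head->next;
                if (!head) tail = 0;
                delete dead;
            }
        }
    };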
This reminds me: please consider the possibility of (some day in the future) automagically running pipelined filters in their own threads (properly synchronized, of course), and possibly some policy allowing the granularity of work done to be increased (generally, larger buffers would allow this) to minimize context switching overhead (which exists even for nonthreaded pipelines, because of cache coherency, state/continuation reification, and least important, function call overhead). mbuf chains might be easier to efficiently synchronize in a multithreaded environment than semaphores/circular buffers (wild speculation).
I guess you can repeatedly open() and close() fstreams that way, although I've never wanted to.
I want to provide the same functionality that the standard library provides. People would complain if you couldn't close and reopen an fstream, don't you think?
Suffice it to say that I was heretofore unaware such capability ever existed, and you're obviously right to want it for your library :)
    struct my_filter : input_filter {
        template<typename Source>
        std::streamsize read(Source& src, char* s, std::streamsize n);
    };
Here the interpretation of the return value is clear: return the number of characters read, or -1 for EOF. The trouble is, there are some simple filters that it's very hard to express with the above interface.
As I mentioned earlier, this wouldn't add any difficulties at all. You can simply return at most 1 character, no matter how many are requested, if that really simplifies your implementation. Naturally, there's some runtime overhead that wouldn't be necessary with your separate one-character interface, but doing things that way is generally less efficient anyway - you've already opted for programmer time over machine time.

Sorry this has gotten to be so involved - I am perfectly happy to just use whatever interfaces you have now; it's better than what I have (i.e. nothing). The binary I/O you propose (in "future directions") sounds cool also - especially if the library had both native and "network byte order" variants.

Obviously, anything involving passing mbuf chains instead of internal buffering would be a huge change to interface and implementation; I don't think it needs to happen, or if it does, there would be a place for it as a second library that also handles message-passing pipelines (where message boundaries are preserved and not just implementation details of streams). Yrs, Jonathan
participants (2)
- Jonathan Graehl
- Jonathan Turkanis