[boost] Re: Re: [iostreams] Major interace changes planned --commentsrequested

6 Mar 2005

      Rob Stewart wrote:
...
From: "Jonathan Turkanis" <technews@kangaroologic.com>
...
...
If the library is well-received, however, I expect eventually there
will be a large body of user-defined filters, and I wouldn't want
them to have to be rewritten.
Given the filters written with sync I/O in mind are quite
different from those written for async I/O, is that really an
issue?
Do you mean that different filtering operations will be desired, or that the
implementation will be different?

My hope was to make the implementation of non-blocking filters easy enough that
I could require all filters to be non-blocking. Maybe my original message was
unclear on this point.
...
...
(This is another topic that didn't get much attention during review.
There are some, notably James Kanze, who insist that well-designed
stream buffers should not throw exceptions. My defense of exceptions
is here: http://tinyurl.com/59xv6. At the time of the review, I was
prepared to switch to error codes, but it's a bit late now.)
I apparently missed that during the review.  On what basis does
James Kanze make his assertion?
See this message by James Kanze, my reply, his reply and my further reply:
http://tinyurl.com/4tkj3. I don't find his argument very convincing on this
point. But he is very knowledgable, so I'm reluctant to dismiss his view.
...
So long as EOF and EAGAIN do not
result in exceptions,
right
...
given how commonly they occur, I have no
problems with exceptions.  However, it sounds like you might be
throwing an exception to indicate EOF.  If so, I don't like it.
How did I give you that impression?
...
It is not entirely uncommon to read to EOF, do something to a
file, and continue reading.  For example, we have certain files
that we tail throughout the day in our production apps.  EOF is a
common occurence because we quickly process the new data appended
to the files and reach EOF again.  That is, the sink is faster
than the source.  This behavior would be complicated by EOF
throwing an exception.  An error indication of EOF, plus the
ability to clear that condition and try again, fits this usage
better.  (Granted, that use of a file is not widespread, but
since you offer no choice, such usage is not possible.)
You mentioned in the cited documentation section that you didn't
want to complicate things such that users would wonder which of
the various ways a given function indicates errors.  I
sympathize, but isn't that what you're introducing with your
basic_character idea?
I'm not treating EOF or EAGAIN as errors. What I'm trying to avoid is having a
single error (or any other type of condition, for that matter) represented in
more than one way.
...
How about putting a state variable in filters, sources, and sinks
to indicate EAGAIN, EOF, and GOOD.  Then, functions that only
read or write a single character can return a bool, and functions
that read or write multiple characters can return a count.  If
the bool is false or the count is zero, the code can query the
state to determine whether EAGAIN or EOF occurred.  That keeps
the error indication in the return value simple and even leaves
room for adding additional error states should that prove
necessary.  Hard errors can still be signaled via exception.
This was one of the options I seriously considered. I believe the very first
version of the library I posted had this feature. I tend to think it would make
life harder for people writing filters and devices to force them to implement an
additional function. As things stand, you can often get by with a implementing a
single function, and it works pretty well. Also, it makes it harder to fit
standard streams and stream bufferss into the framework: stream buffers don't
have a state indicator at all; streams do, but it doesn't map well to
good/eof/eagain.

I'm willing to be convinced otherwise, however.
...
...
...
...
V. Problems ----------------------------
1. Harder to learn. Currently the function get and the concept
InputFilter are very easy to explain. I'm afraid having to
understand the basic_character template before learning these
functions will discourage people from using the library.
The class template is hardly complicated.  I can't imagine it
would be a show stopper, though it does add some complexity.
I'm thinking of decribing it as a replacement for
traits_type::int_type. Compared to using int_type and its family of
helper functions (eq_int_type, eof., not_eof, ...) basic_character
is a snap. ;-)
That's a good way to make it palatable to those already familiar
with writing streambufs, but it hardly applies to the majority of
your library audience, does it.
The current interface has get() return an instance of int_type, so if I keep the
interface how it is, I have to say something in the docs about char_traits. But
you have a good point.
...
...
...
...
2. Harder to use. Having to check for eof and fail make writing
simple filters, like the above, slightly harder. I'm worried that
the effect on more complex filters may be even worse. This applies
not just to get, but to the other functions as well, since their
returns values will require more careful examination.
That's a real issue.
It's become even more clear in this thread, given my botched
uncommenting_filter implementation.
Keep running notes on the things one can forget or do wrong so
your documentation can warn about them.
Good idea. I'll definitely do this; the question is whether it will be in the
advanced section or will apply to all filters and devices.
...
...
...
(The member functions good(), eof(), and fail() on
basic_character could be made non-member functions and could be
implemented for char and wchar_t.
I forgot to say that I like the idea of making them non-members. All I am unsure
about is whether they should be implemented  for char and wchar_t (actually for
int and std::char_traits<wchar_t>::int_type)
...
...
...
That would help code deal with
synchronous code in the asynchronous style.)
I guess I could implement
bool eof(int n) { return n == EOF; }
     bool good(int n) { return n != EOF; }
bool weof(std::char_traits<wchar_t>::int_type n)
    { return n == WEOF; }
     bool wgood(std::char_traits<wchar_t>::int_type n)
    { return n != WEOF; }
I tend to think that n == EOF and n == WEOF are more readable,
however.
Yes, they are more readable, but if you don't document the
implementation of those functions, library users will expect to
use them in all situations.  Thus, switching to async from sync
I/O will be less of a stylistic change.  (That also leaves room
for you to change the implementation details if need be.)
Let me make sure we're on the same page: We're assuming I decide to provide
separate concepts for non-blocking i/o, and that some of the non-blocking
concepts use basic_character. Are you saying that someone who is used to writing
non-blocking filters and tries to write a blocking filter may get confused
because eof() doesn't work with int?

Maybe I could have get() return a basic_character even in the blocking case, and
explain that there's no need to test for eagain in blocking i/o. This would
eliminate one of the last traces of char_traits in the public interface of the
library, which would be good. The other reasonable way to eliminate char_traits
would be to have blocking get() return an optional<char> (see my reply to
Thorsten); but if I'm using basic_character for non-blocking i/o, it may make
sense to use it in both places.
...
Of course, if you adopt my state idea, then the return value
would simply indicate that things went well or something less
than ideal happened.  In the latter case, one would query the
state of the filter/device to learn what happened.  That would
obviate your basic_character class and these functions.
...
One problem with code that returns status information is that
folks can forget to check the status.
This is a real problem with eagain, since testing may not reveal the error
unless eagain happens to occur at the right place in a character sequence.
...
You could return a special
type in lieu of bool, and basic_character in lieu of char_type,
which require inspecting whether they indicate an error
condition.  If that query is not done, the destructor can
complain.
Could you elaborate? It sounds like it could lead to very poor performance.
...
OTOH, you could just say, "don't do that!"
Best Regards,
Jonathan