
Rob Stewart wrote:
From: "Jonathan Turkanis" <technews@kangaroologic.com>
If the library is well-received, however, I expect eventually there will be a large body of user-defined filters, and I wouldn't want them to have to be rewritten.
Given the filters written with sync I/O in mind are quite different from those written for async I/O, is that really an issue?
Do you mean that different filtering operations will be desired, or that the implementation will be different? My hope was to make the implementation of non-blocking filters easy enough that I could require all filters to be non-blocking. Maybe my original message was unclear on this point.
(This is another topic that didn't get much attention during review. There are some, notably James Kanze, who insist that well-designed stream buffers should not throw exceptions. My defense of exceptions is here: http://tinyurl.com/59xv6. At the time of the review, I was prepared to switch to error codes, but it's a bit late now.)
I apparently missed that during the review. On what basis does James Kanze make his assertion?
See this message by James Kanze, my reply, his reply and my further reply: http://tinyurl.com/4tkj3. I don't find his argument very convincing on this point. But he is very knowledgable, so I'm reluctant to dismiss his view.
So long as EOF and EAGAIN do not result in exceptions,
right
given how commonly they occur, I have no problems with exceptions. However, it sounds like you might be throwing an exception to indicate EOF. If so, I don't like it.
How did I give you that impression?
It is not entirely uncommon to read to EOF, do something to a file, and continue reading. For example, we have certain files that we tail throughout the day in our production apps. EOF is a common occurence because we quickly process the new data appended to the files and reach EOF again. That is, the sink is faster than the source. This behavior would be complicated by EOF throwing an exception. An error indication of EOF, plus the ability to clear that condition and try again, fits this usage better. (Granted, that use of a file is not widespread, but since you offer no choice, such usage is not possible.) You mentioned in the cited documentation section that you didn't want to complicate things such that users would wonder which of the various ways a given function indicates errors. I sympathize, but isn't that what you're introducing with your basic_character idea?
I'm not treating EOF or EAGAIN as errors. What I'm trying to avoid is having a single error (or any other type of condition, for that matter) represented in more than one way.
How about putting a state variable in filters, sources, and sinks to indicate EAGAIN, EOF, and GOOD. Then, functions that only read or write a single character can return a bool, and functions that read or write multiple characters can return a count. If the bool is false or the count is zero, the code can query the state to determine whether EAGAIN or EOF occurred. That keeps the error indication in the return value simple and even leaves room for adding additional error states should that prove necessary. Hard errors can still be signaled via exception.
This was one of the options I seriously considered. I believe the very first version of the library I posted had this feature. I tend to think it would make life harder for people writing filters and devices to force them to implement an additional function. As things stand, you can often get by with a implementing a single function, and it works pretty well. Also, it makes it harder to fit standard streams and stream bufferss into the framework: stream buffers don't have a state indicator at all; streams do, but it doesn't map well to good/eof/eagain. I'm willing to be convinced otherwise, however.
V. Problems ----------------------------
1. Harder to learn. Currently the function get and the concept InputFilter are very easy to explain. I'm afraid having to understand the basic_character template before learning these functions will discourage people from using the library.
The class template is hardly complicated. I can't imagine it would be a show stopper, though it does add some complexity.
I'm thinking of decribing it as a replacement for traits_type::int_type. Compared to using int_type and its family of helper functions (eq_int_type, eof., not_eof, ...) basic_character is a snap. ;-)
That's a good way to make it palatable to those already familiar with writing streambufs, but it hardly applies to the majority of your library audience, does it.
The current interface has get() return an instance of int_type, so if I keep the interface how it is, I have to say something in the docs about char_traits. But you have a good point.
2. Harder to use. Having to check for eof and fail make writing simple filters, like the above, slightly harder. I'm worried that the effect on more complex filters may be even worse. This applies not just to get, but to the other functions as well, since their returns values will require more careful examination.
That's a real issue.
It's become even more clear in this thread, given my botched uncommenting_filter implementation.
Keep running notes on the things one can forget or do wrong so your documentation can warn about them.
Good idea. I'll definitely do this; the question is whether it will be in the advanced section or will apply to all filters and devices.
(The member functions good(), eof(), and fail() on basic_character could be made non-member functions and could be implemented for char and wchar_t.
I forgot to say that I like the idea of making them non-members. All I am unsure about is whether they should be implemented for char and wchar_t (actually for int and std::char_traits<wchar_t>::int_type)
That would help code deal with synchronous code in the asynchronous style.)
I guess I could implement
bool eof(int n) { return n == EOF; } bool good(int n) { return n != EOF; }
bool weof(std::char_traits<wchar_t>::int_type n) { return n == WEOF; } bool wgood(std::char_traits<wchar_t>::int_type n) { return n != WEOF; }
I tend to think that n == EOF and n == WEOF are more readable, however.
Yes, they are more readable, but if you don't document the implementation of those functions, library users will expect to use them in all situations. Thus, switching to async from sync I/O will be less of a stylistic change. (That also leaves room for you to change the implementation details if need be.)
Let me make sure we're on the same page: We're assuming I decide to provide separate concepts for non-blocking i/o, and that some of the non-blocking concepts use basic_character. Are you saying that someone who is used to writing non-blocking filters and tries to write a blocking filter may get confused because eof() doesn't work with int? Maybe I could have get() return a basic_character even in the blocking case, and explain that there's no need to test for eagain in blocking i/o. This would eliminate one of the last traces of char_traits in the public interface of the library, which would be good. The other reasonable way to eliminate char_traits would be to have blocking get() return an optional<char> (see my reply to Thorsten); but if I'm using basic_character for non-blocking i/o, it may make sense to use it in both places.
Of course, if you adopt my state idea, then the return value would simply indicate that things went well or something less than ideal happened. In the latter case, one would query the state of the filter/device to learn what happened. That would obviate your basic_character class and these functions.
One problem with code that returns status information is that folks can forget to check the status.
This is a real problem with eagain, since testing may not reveal the error unless eagain happens to occur at the right place in a character sequence.
You could return a special type in lieu of bool, and basic_character in lieu of char_type, which require inspecting whether they indicate an error condition. If that query is not done, the destructor can complain.
Could you elaborate? It sounds like it could lead to very poor performance.
OTOH, you could just say, "don't do that!"
Best Regards, Jonathan