[asio][filesystem] system error handling

Hello all,

I have been thinking about how to reconcile Boost.Asio error handling with the system error classes defined in the TR2 filesystem proposal N1975. I have some concerns which I'll outline in the context of how I plan to use them.

The current plan is as follows:

- Synchronous functions would by default throw a system_error exception.
- An overload of each synchronous function would take an error_code& as the final argument. On failure these functions set the error_code to the appropriate error. On success, the error_code is set to represent the no error state. These functions do not throw exceptions.
- Callback handlers for asynchronous functions would take an error_code as the first parameter. For example:

    void handle_read(error_code ec) { ... }

- Since many error codes in socket programming are "well-known" and often tested for explicitly, they need to be defined somewhere. The current errors in the asio::error::code_type enum would be replaced by constants of type error_code:

    namespace boost { namespace asio { namespace error {
      const error_code access_denied = implementation_defined;
      const error_code message_size = implementation_defined;
      ...
    } // namespace error
    } // namespace asio
    } // namespace boost

  (Note that in practice boost::asio::error might actually be a class with static members rather than a namespace).

The issues I have run into so far are:

- The error_code class seems to assume that there is a single "namespace" for the system_error_type values. This is not necessarily the case for the errors in socket programming. For example on UNIX most socket functions use errno values, but the netdb functions (gethostbyname and friends) use a different set of error values starting from 1, and the getaddrinfo/getnameinfo functions use another set of error values starting from 1.

  Would it be possible for the error_code class to have a category value (which is probably an integer with implementation-defined values) to allow the creation of unique error_code values for each type of system error? A category value would also allow implementors and users to extend the system error classes for other sources of system error (e.g. SSL errors, application-specific errors, etc).
- A common idiom in code that uses asio, when you don't care about a specific error type, is to simply treat the error as a bool:

    void handle_read(asio::error e)
    {
      if (e) {
        // take action because of error
      } else {
        // success
      }
    }

  Can the error_code class be made convertible-to-bool and also have operator! to support this style?
- Are these classes available in CVS somewhere so they can be reused by other boost libraries?

Cheers,
Chris
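To make the category question concrete, here is a minimal sketch of two error spaces whose numeric values overlap. It is written against std::error_code from the C++11 standard library, which carries a category alongside the value, and not against the N1975 error_code discussed in this message. The names addrinfo_category, errno_error, addrinfo_error and failed are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Hypothetical category for getaddrinfo-style errors, whose numeric values
// overlap with errno values but mean something different.
class addrinfo_category_impl : public std::error_category {
public:
    const char* name() const noexcept override { return "addrinfo"; }
    std::string message(int value) const override {
        return "addrinfo error " + std::to_string(value);
    }
};

inline const std::error_category& addrinfo_category() {
    static addrinfo_category_impl instance;
    return instance;
}

// Same numeric value, different error space (assumed helper names).
inline std::error_code errno_error(int value) {
    return std::error_code(value, std::generic_category());
}
inline std::error_code addrinfo_error(int value) {
    return std::error_code(value, addrinfo_category());
}

// The bool idiom from the message above: a non-zero value means failure.
inline bool failed(const std::error_code& ec) {
    return static_cast<bool>(ec);
}
```

With this shape, errno_error(1) and addrinfo_error(1) compare unequal even though both hold the value 1, and a handler can still write if (ec) without caring which space the error came from.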

Chris,
I have been thinking about how to reconcile Boost.Asio error handling with the system error classes defined in the TR2 filesystem proposal N1975. I have some concerns which I'll outline in the context of how I plan to use them.
The current plan is as follows:
- Synchronous functions would by default throw a system_error exception.
- An overload of each synchronous function would take an error_code& as the final argument. On failure these functions set the error_code to the appropriate error. On success, the error_code is set to represent the no error state. These functions do not throw exceptions.
Here's a description of the approach I've taken when designing a portable system library. There were several problems that I was trying to solve:

a) I was trying to minimize the amount of boilerplate error handling code which gets written, or even worse, which doesn't get written, because it's so boring to write it!
b) Unify the error reporting infrastructure under Unix and Windows based systems.
c) Make it easy to handle different classes of errors.

An example of this would be the Unix read() call. How many times have you seen people write

    while (...) {
      ssize_t res = ::read(...);
      if (res < 0) {
        switch (errno) {
        case EINTR: continue;
        case ...
        }
      }
    }

So wrt errors, we would like some types of errors to (usually) automatically restart the operation that was attempted, while other "exceptional cases" (like EOF) are not critical and probably should not cause an exception, while others should just be handled as critical errors (and e.g. throw).

The other problem was that Windows and Unix have different models of handling errors. On Unix, errors are "syscall sensitive", i.e. there is a limited number of error codes and they sometimes mean slightly different things depending on which system call was invoked. On the other hand, on Windows the error space is "flat" and any system call can essentially return any of the (10000 or so) error values.

My design was as follows:

1) I also use an extra parameter, SysResult &, but also provided a second parameter called ErrorHandler. This was a stateless class which would define the behaviour in case of an error. In our programs, I've observed that in most places return codes would get handled in a very similar fashion, so we required only a few different variations of ErrorHandlers.

2) Upon invocation, the function call would get executed and do its thing. If there was an error, the error handler is to be assigned several values:

a) The "library" error code. This was a high level translation of the common error codes into a system independent enumeration (which, btw, was using a "smart enum" class, which would have an associated error string with each value).

b) The system error code. This was the error code of the last function that caused the problem (e.g. errno).

c) A string that would describe the error in more detail. IMHO it is very important for a system to provide accurate and legible errors which make it easy to diagnose the source of the problem even if it propagated deep from within some library (instead of just returning "EACCES". What am I supposed to do if the system just logs "EACCES"?). This would be a concatenation of the "library" error code string, the system error code string (e.g. as returned by strerror() or FormatMessage()) and any other contextual information (e.g. the name of the function called, what the function was trying to do, and a suggestion as to what could be the cause of the problem). This is because, for example, when mapping a file, one has to go through several system calls to return a memory mapped file object, and one would like to know exactly in what part of that function the call failed.

d) The error handler would then get invoked with the SysResult and it could decide, based either on the "system independent" or "system specific" error code, what it should do. The error handler would return one of the "CONTINUE", "RETRY", "FAIL" enumerations. The first and second are, I think, self-explanatory. The third would cause the surrounding code to throw an exception, similar to system_error. My default error handlers would do something sensible like retry on EINTR and throw on other error conditions. I also had a non-throwing version.

Assuming there was no exception, it was easy to check the result of an operation, just like you describe. This system, while far from perfect, seems to work quite well. It is not without fault though. Here are some problems (some of which, it seems, are shared by your proposal above):

i) Assuming that your "high-level" system operation is implemented in terms of several system calls, what is a good way of handling system-specific error codes? My error handlers had the possibility of handling not only the "high-level" library errors but also the low-level system errnos. However, as you can see, this breaks encapsulation by making assumptions about how the function is implemented in terms of the underlying syscalls.

ii) The same issue I described above applies equally well to other functions implemented in terms of several "high-level" (library) system calls. To be able to use the SysResult effectively, you would need to know which function generated it, again breaking encapsulation.

What do you think?

Tom
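The handler-driven design described in this message might be sketched roughly as follows. The message does not show the real classes, so every name here (Action, SysResult's fields, DefaultErrorHandler, NoThrowErrorHandler, run_with_handler, count_attempts) is an assumption, and std::runtime_error merely stands in for whatever exception the original library threw.

```cpp
#include <cassert>
#include <cerrno>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; the message does not show the real classes.
enum class Action { Continue, Retry, Fail };

struct SysResult {
    int library_code = 0;     // system-independent code (0 means success)
    int system_code = 0;      // e.g. errno from the failing call
    std::string description;  // human-readable context for diagnosis
};

// A stateless default policy: retry on EINTR, fail on anything else.
struct DefaultErrorHandler {
    Action operator()(const SysResult& r) const {
        return r.system_code == EINTR ? Action::Retry : Action::Fail;
    }
};

// A stateless non-throwing policy: retry on EINTR, otherwise let the
// caller inspect the SysResult.
struct NoThrowErrorHandler {
    Action operator()(const SysResult& r) const {
        return r.system_code == EINTR ? Action::Retry : Action::Continue;
    }
};

// Runs op(result) until it succeeds or the handler stops retrying.
// op returns true on success and fills in result on failure.
template <typename Operation, typename ErrorHandler>
bool run_with_handler(Operation op, SysResult& result, ErrorHandler handler) {
    for (;;) {
        result = SysResult();
        if (op(result)) return true;
        switch (handler(result)) {
            case Action::Retry:    continue;      // restart the operation
            case Action::Continue: return false;  // report via result
            case Action::Fail:     throw std::runtime_error(result.description);
        }
    }
}

// Example operation: fails with EINTR `interruptions` times, then succeeds.
// Returns the number of attempts that were made.
inline int count_attempts(int interruptions) {
    int attempts = 0;
    SysResult result;
    run_with_handler(
        [&](SysResult& r) {
            ++attempts;
            if (attempts <= interruptions) {
                r.system_code = EINTR;
                r.description = "read: interrupted by signal";
                return false;
            }
            return true;
        },
        result, DefaultErrorHandler());
    return attempts;
}
```

The policy objects carry no state, so a program needs only a handful of them, which matches the observation above that most call sites handle return codes in a very similar fashion.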

Hi Tomas, Thanks for your reply. One thing I forgot to mention is that this is also in the context of writing a proposal for TR2 based on Boost.Asio. Since the filesystem library has already been accepted for TR2 I want to keep the low-level error approach as consistent as possible with it. <snip>
The other problem was that Windows and Unix have different models of handling errors. On Unix, errors are "syscall sensitive", i.e. there is a limited number of error codes and they sometimes mean slightly different things depending on which system call was invoked. On the other hand, on Windows the error space is "flat" and any system call can essentially return any of the (10000 or so) error values.
Yep, this is exactly the main problem I'm facing. Even on Windows the error space may not be truly flat once you incorporate other "system" libraries, such as OpenSSL, which use their own error space. <snip>
d) The error handler would then get invoked with the SysResult and it could decide based either on the "System independent" or "system specific" error code, what it should do. The error handler would return one of "CONTINUE", "RETRY", "FAIL" enumerations. The first and second are, I think, self-explanatory. The third would cause the surrounding code to throw an exception, similar to system_error. My default error handlers would do something sensible like retry on EINTR and throw on other error conditions. I also had a non-throwing version.
This approach is similar to what's currently in asio:
- The synchronous functions have overloads that take an Error_Handler function object that can be used to customise what happens when an error occurs. However the Error_Handler does not allow the operation to be restarted.
- The higher level asio::read() and asio::write() functions also have a Completion_Condition function object which is passed the error code and the number of bytes transferred. The return value from this function object indicates whether the underlying operation should be restarted.
<snip>
i) Assuming that your "high-level" system operation is implemented in terms of several system calls, what is a good way handling system-specific error codes? My error handlers had the possibility of handling not only the "high-level" library errors but also use the low-level system errnos. However, as you can see, this breaks encapsulation by making assumptions about how the function is implemented in terms of the underlying syscalls.
For most functions in the current asio implementation I think this isn't too bad. They are often relatively thin wrappers around the existing system calls, and in many cases there is already a mapping to just one system call. In other cases I think it will have to be a best-effort translation of the error code to something sensible.
ii) The same issue I described above applies equally well to other functions implemented in terms of several "high-level" (library) system calls. To be able to use the SysResult effectively, you would need to know which function generated it, again breaking encapsulation.
I think the idea with the error_code/system_error approach is to leave the error_code object with very little other than the system error number. However in the case of an exception, the system_error's "what" string can contain more information about the context where the error was generated. Cheers, Chris
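As an illustration of the split described here (a bare error number in the error_code, extra context only in the exception's "what" string), the following sketch pairs a non-throwing overload with a throwing one. It uses std::error_code and std::system_error from the C++11 standard library as stand-ins for the N1975 classes. fake_open, open_resource and code_from_throwing_call are hypothetical names, and fake_open only pretends to be a system call.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <system_error>

// Hypothetical low-level operation standing in for a real system call:
// it "fails" with EACCES when asked to open an empty name.
inline int fake_open(const std::string& name) {
    if (name.empty()) { errno = EACCES; return -1; }
    return 3;  // pretend descriptor
}

// Non-throwing overload: reports through ec, which is cleared on success.
inline int open_resource(const std::string& name, std::error_code& ec) {
    int fd = fake_open(name);
    if (fd < 0) {
        ec.assign(errno, std::generic_category());
        return -1;
    }
    ec.clear();
    return fd;
}

// Throwing overload: same error code, plus context passed to the
// system_error constructor for use in its what() string.
inline int open_resource(const std::string& name) {
    std::error_code ec;
    int fd = open_resource(name, ec);
    if (ec) throw std::system_error(ec, "open_resource(\"" + name + "\")");
    return fd;
}

// Helper for illustration: returns the code carried by the thrown
// exception, or a default (no error) code when nothing is thrown.
inline std::error_code code_from_throwing_call(const std::string& name) {
    try {
        open_resource(name);
    } catch (const std::system_error& e) {
        return e.code();
    }
    return std::error_code();
}
```

Both overloads report the same error number; only the throwing one attaches the name of the failing operation, so the error_code itself stays small.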

"Christopher Kohlhoff" <chris@kohlhoff.com> wrote in message news:20060522134258.91630.qmail@web32609.mail.mud.yahoo.com...
Hello all,
I have been thinking about how to reconcile Boost.Asio error handling with the system error classes defined in the TR2 filesystem proposal N1975. I have some concerns which I'll outline in the context of how I plan to use them.
The current plan is as follows:
- Synchronous functions would by default throw a system_error exception.
- An overload of each synchronous function would take an error_code& as the final argument. On failure these functions set the error_code to the appropriate error. On success, the error_code is set to represent the no error state. These functions do not throw exceptions.
So far, so good.
- Callback handlers for asynchronous functions would take an error_code as the first parameter. For example:
void handle_read(error_code ec) { ... }
That makes sense to me. It wouldn't be useful to throw an exception because of the asynchronous nature of the control flow, so supplying an error_code to the callback gives it a chance to deal with the error, or ignore it if desired. Was that your analysis?
- Since many error codes in socket programming are "well-known" and often tested for explicitly, they need to be defined somewhere. The current errors in the asio::error::code_type enum would be replaced by constants of type error_code:
namespace boost { namespace asio { namespace error {
  const error_code access_denied = implementation_defined;
  const error_code message_size = implementation_defined;
  ...
} // namespace error
} // namespace asio
} // namespace boost
(Note that in practice boost::asio::error might actually be a class with static members rather than a namespace).
The issues I have run into so far are:
- The error_code class seems to assume that there is a single "namespace" for the system_error_type values. This is not necessarily the case for the errors in socket programming.
For example on UNIX most socket functions use errno values, but the netdb functions (gethostbyname and friends) use a different set of error values starting from 1, and the getaddrinfo/getnameinfo functions use another set of error values starting from 1.
Would it be possible for the error_code class to have a category value (which is probably an integer with implementation-defined values) to allow the creation of unique error_code values for each type of system error? A category value would also allow implementors and users to extend the system error classes for other sources of system error (e.g. SSL errors, application-specific errors, etc).
I haven't thought of the case of several "namespaces" or "errorspaces" before, so please take what follows as an initial idea, not something cast in concrete. My initial thought is to leave error_code alone, since it should be fine for most uses. For uses that need more information, such as asio, derive asio_error_code from error_code, adding appropriate members. For example, members to set/get the well known socket error codes. That way users who don't care about the domain specific codes can just use the error_code base member functions, while those that care about the specific codes can use the derived asio_error_code functions. Does that seem a bit cleaner than your approach of adding codes? What is your reaction?
- A common idiom in code that uses asio, when you don't care about a specific error type, is to simply treat the error as a bool:
void handle_read(asio::error e)
{
  if (e) {
    // take action because of error
  } else {
    // success
  }
}
Can the error_code class be made convertible-to-bool and also have operator! to support this style?
Yes, something like that would be very useful. I just made that mistake (forgetting the error() function and assuming convertibility to bool) this morning. The resulting error messages were classics of misdirection: they went on about shared_ptr, of all things, not instantiating correctly! Took me ten minutes to figure out the real problem. So, yes. IIRC, there has been prior list discussion of the best way to provide convertibility to bool. I'll google around and see if I can find it. If anyone else remembers the discussion, please feel free to provide a pointer or a summary.
- Are these classes available in CVS somewhere so they can be reused by other boost libraries?
No. I was waiting for 1.34 to ship, but that is taking a long time so I'll go ahead and update CVS head, hopefully tomorrow. Thanks for trying the error_code approach with asio! --Beman

Hi Beman,
- Callback handlers for asynchronous functions would take an error_code as the first parameter. For example:
void handle_read(error_code ec) { ... }
That makes sense to me. It wouldn't be useful to throw an exception because of the asynchronous nature of the control flow, so supplying an error_code to the callback gives it a chance to deal with the error, or ignore it if desired. Was that your analysis?
Yep, exactly. The current asio::error exception class currently fills the role of the error_code in async callbacks, as well as the system_error in throwing sync functions.
I haven't thought of the case of several "namespaces" or "errorspaces" before, so please take what follows as an initial idea, not something cast in concrete.
My initial thought is to leave error_code alone, since it should be fine for most uses. For uses that need more information, such as asio, derive asio_error_code from error_code, adding appropriate members. For example, members to set/get the well known socket error codes.
Couldn't that introduce a problem with slicing? The error_code is passed by value to functions like system_message(), the system_error constructor etc.

I suppose I'm not thinking of the asio errors as having more information as such. From the user's point of view they are system errors like any other. It's just from an implementation point of view that we have these multiple error spaces. This is mainly a problem for UNIX where some of the system APIs don't use errno values.

When I was thinking about it before, I wasn't sure whether it was better to make the category part of the interface or simply allow the implementation to add a category as part of the implementation-defined system_error_type.

Another aspect that appeals to me with having well-known errors as constants of type error_code, is that it allows most users to treat the error_code as an opaque type. They can write:

void error_handler(error_code ec)
{
  if (ec == asio::error::eof)
    ...
}

without needing to worry about the implementation-defined value of asio::error::eof, or indeed care what implementation-defined error space it lives in.
- Are these classes available in CVS somewhere so they can be reused by other boost libraries?
No. I was waiting for 1.34 to ship, but that is taking a long time so I'll go ahead and update CVS head, hopefully tomorrow.
Great, thanks! Cheers, Chris
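The slicing concern and the opaque-constant alternative raised in this reply can be illustrated with a small sketch. None of these types is the N1975 error_code or anything from asio: base_code, derived_code, categorized_code, the well_known namespace and the numeric values are all hypothetical, chosen only to make the behaviour observable.

```cpp
#include <cassert>

// Hypothetical stand-ins; neither class is the N1975 error_code.
struct base_code {
    int value;
    explicit base_code(int v = 0) : value(v) {}
    virtual ~base_code() {}
    virtual int space() const { return 0; }  // 0 = default error space
};

struct derived_code : base_code {
    int category;  // extra information added by the derived class
    derived_code(int v, int c) : base_code(v), category(c) {}
    int space() const override { return category; }
};

// Taking the base class by value copies only the base part, so the
// derived category is sliced away before space() is called.
inline int space_by_value(base_code ec) { return ec.space(); }

// Taking a reference keeps the dynamic type, and with it the category.
inline int space_by_reference(const base_code& ec) { return ec.space(); }

// Alternative: the category lives inside the one value type, so passing
// by value keeps it, and well-known errors are just named constants.
struct categorized_code {
    int category;
    int value;
};
inline bool operator==(categorized_code a, categorized_code b) {
    return a.category == b.category && a.value == b.value;
}

namespace well_known {
const categorized_code eof = {2, 1};             // arbitrary values
const categorized_code host_not_found = {3, 1};  // same number, other space
}

// User code compares against the constant without knowing its contents.
inline bool is_eof(categorized_code ec) { return ec == well_known::eof; }
```

A library function that takes the base type by value sees only error space 0 for a derived_code, whereas the single value type survives being copied and still distinguishes eof from host_not_found.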

That makes sense to me. It wouldn't be useful to throw an exception because of the asynchronous nature of the control flow, so supplying an error_code to the callback gives it a chance to deal with the error, or ignore it if desired. Was that your analysis?
Is it necessary to differentiate between error codes and exceptions? An exception *is* an error code for me. IOW, it might make sense to derive an error_code class from std::exception.

Assume we have two similar functions f1() and f2() with the same possible failure modes. We'd like f1() to throw and f2() not. The prototypes might look like:

void f1(arguments...) throw(std::exception);
std::auto_ptr<std::exception> f2(arguments...) throw();

Regards
-Gerhard
--
Gerhard Wesp
ZRH office voice: +41 (0)44 668 1878
ZRH office fax: +41 (0)44 200 1818
For the rest I claim that raw pointers must be abolished.

On 5/24/06, Gerhard Wesp <gwesp@google.com> wrote:
Is it necessary to differentiate between error codes and exceptions? An exception *is* an error code for me. IOW, it might make sense to derive an error_code class from std::exception.
Assume we have two similar functions f1() and f2() with the same possible failure modes. We'd like f1() to throw and f2() not.
The prototypes might look like:
void f1(arguments...) throw(std::exception);
std::auto_ptr<std::exception> f2(arguments...) throw();
This doesn't work in the context of asio in particular where there are async functions that return immediately and can only detect/indicate an error condition at a later time (e.g. by passing it to the Handler function). See for example: http://asio.sourceforge.net/boost-asio-proposal-0.3.6/libs/asio/doc/referenc... -- Caleb Epstein caleb dot epstein at gmail dot com
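The point about asynchronous functions can be seen in a toy sketch: the initiating call returns before the outcome is known, so neither a return value nor an exception from that call can carry the error, and it has to be delivered to the handler later. This is not the asio interface. toy_loop, async_read, run and count_failures are hypothetical names, and std::error_code from C++11 stands in for the error type.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>
#include <queue>
#include <system_error>

// Hypothetical toy "event loop"; not the asio interface.
class toy_loop {
public:
    using handler = std::function<void(std::error_code)>;

    // Starts a pretend asynchronous read and returns immediately.
    // The outcome is only delivered later, when run() is called.
    void async_read(bool will_fail, handler h) {
        std::error_code ec;
        if (will_fail) ec.assign(ECONNRESET, std::generic_category());
        pending_.push([h, ec] { h(ec); });
    }

    // Delivers the outcome of every pending operation to its handler.
    void run() {
        while (!pending_.empty()) {
            std::function<void()> next = pending_.front();
            pending_.pop();
            next();
        }
    }

private:
    std::queue<std::function<void()>> pending_;
};

// Example use: returns how many handlers observed an error.
inline int count_failures(int successes, int failures) {
    toy_loop loop;
    int seen = 0;
    auto on_read = [&seen](std::error_code ec) { if (ec) ++seen; };
    for (int i = 0; i < successes; ++i) loop.async_read(false, on_read);
    for (int i = 0; i < failures; ++i) loop.async_read(true, on_read);
    assert(seen == 0);  // nothing has been reported to a handler yet
    loop.run();
    return seen;
}
```

By the time a handler runs, the call that started the operation has long since returned, which is why the error travels as an argument to the handler.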

On Wed, May 24, 2006 at 08:59:22AM -0400, Caleb Epstein wrote:
This doesn't work in the context of asio in particular where there are async functions that return immediately and can only detect/indicate an error condition at a later time (e.g. by passing it to the Handler function).
I see. Well, in this case my point would be that the error argument to the handler should be an exception, i.e. a class in some exception hierarchy, preferably the standard one. Regards -Gerhard
participants (5):
- Beman Dawes
- Caleb Epstein
- Christopher Kohlhoff
- Gerhard Wesp
- Tomas Puverle