
Chris,
I have been thinking about how to reconcile Boost.Asio error handling with the system error classes defined in the TR2 filesystem proposal N1975. I have some concerns which I'll outline in the context of how I plan to use them.
The current plan is as follows:
- Synchronous functions would by default throw a system_error exception.
- An overload of each synchronous function would take an error_code& as the final argument. On failure these functions set the error_code to the appropriate error. On success, the error_code is set to represent the no error state. These functions do not throw exceptions.
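
The two-overload plan above might look roughly like this. This is only a sketch under assumptions: read_some is a hypothetical POSIX wrapper (not an actual Asio or N1975 function), and the C++11 std::error_code and std::system_error stand in for the classes the proposal describes.

```cpp
#include <cerrno>
#include <cstddef>
#include <system_error>
#include <unistd.h>

// Hypothetical wrapper used for illustration only.
// Non-throwing overload: the outcome is reported through ec.
std::size_t read_some(int fd, void* buf, std::size_t len, std::error_code& ec) {
    ssize_t n = ::read(fd, buf, len);
    if (n < 0) {
        ec.assign(errno, std::system_category());  // failure: record the error
        return 0;
    }
    ec.clear();  // success: ec represents the "no error" state
    return static_cast<std::size_t>(n);
}

// Throwing overload: forwards to the overload above and throws on failure.
std::size_t read_some(int fd, void* buf, std::size_t len) {
    std::error_code ec;
    std::size_t n = read_some(fd, buf, len, ec);
    if (ec) throw std::system_error(ec, "read_some");
    return n;
}
```

Implementing the throwing overload in terms of the non-throwing one keeps the error translation in a single place.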
Here's a description of the approach I took when designing a portable system library. There were several problems that I was trying to solve:

a) Minimize the amount of boilerplate error-handling code that gets written or, even worse, doesn't get written because it's so boring to write.

b) Unify the error-reporting infrastructure on Unix- and Windows-based systems.

c) Make it easy to handle different classes of errors.

An example of (c) is the Unix read() call. How many times have you seen people write:

    while (...) {
        ssize_t res = ::read(...);
        if (res < 0) {
            switch (errno) {
            case EINTR: continue;
            case ...
            }
        }
    }

So, with respect to errors, we would like some types of errors to (usually) restart the attempted operation automatically; other "exceptional cases" (like EOF) are not critical and probably should not cause an exception; the rest should be handled as critical errors (and, e.g., throw).

The other problem was that Windows and Unix have different models of handling errors. On Unix, errors are "syscall sensitive", i.e. there is a limited number of error codes and they sometimes mean slightly different things depending on which system call was invoked. On Windows, on the other hand, the error space is "flat" and any system call can essentially return any of the (10000 or so) error values.

My design was as follows:

1) I also use an extra parameter, SysResult &, but I also provided a second parameter called ErrorHandler. This was a stateless class that would define the behaviour in case of an error. In our programs, I've observed that in most places return codes get handled in a very similar fashion, so we required only a few different variations of ErrorHandlers.

2) Upon invocation, the function call would execute and do its thing. If there was an error, the error handler would be assigned several values:

a) The "library" error code.
This was a high-level translation of the common error codes into a system-independent enumeration (which, btw, used a "smart enum" class that associates an error string with each value).

b) The system error code. This was the error code of the last function that caused the problem (e.g. errno).

c) A string that would describe the error in more detail. IMHO it is very important for a system to provide accurate and legible errors that make it easy to diagnose the source of the problem even if it propagated from deep within some library (instead of just returning "EACCES". What am I supposed to do if the system just logs "EACCES"?). This would be a concatenation of the "library" error code string, the system error code string (e.g. as returned by strerror() or FormatMessage()) and any other contextual information (e.g. the name of the function called, what the function was trying to do, and a suggestion as to what could be the cause of the problem). This is because, for example, when mapping a file one has to go through several system calls to return a memory-mapped file object, and one would like to know exactly in which part of that function the call failed.

d) The error handler would then get invoked with the SysResult, and it could decide, based on either the "system-independent" or the "system-specific" error code, what it should do. The error handler would return one of the "CONTINUE", "RETRY", "FAIL" enumerators. The first and second are, I think, self-explanatory. The third would cause the surrounding code to throw an exception, similar to system_error.

My default error handlers would do something sensible, like retry on EINTR and throw on other error conditions. I also had a non-throwing version. Assuming there was no exception, it was easy to check the result of an operation, just like you describe.

This system, while far from perfect, seems to work quite well.
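
To make the design above more concrete, here is a rough sketch of how the pieces could fit together. Only the names SysResult, ErrorHandler, CONTINUE, RETRY and FAIL come from the description; everything else (Action, LibError, translate, DefaultHandler, NoThrowHandler, invoke, and the convention that the operation returns 0 or an errno value) is a hypothetical stand-in and not the original library's code.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <system_error>

enum class Action { CONTINUE, RETRY, FAIL };
enum class LibError { None, Interrupted, AccessDenied, Other };  // hypothetical

struct SysResult {
    LibError lib_code = LibError::None;  // (a) system-independent code
    int sys_code = 0;                    // (b) raw system code, e.g. errno
    std::string message;                 // (c) detailed description
    bool ok() const { return lib_code == LibError::None; }
};

// Hypothetical translation from errno values to the library enumeration.
inline LibError translate(int err) {
    switch (err) {
        case 0:      return LibError::None;
        case EINTR:  return LibError::Interrupted;
        case EACCES: return LibError::AccessDenied;
        default:     return LibError::Other;
    }
}

// Stateless default policy: retry on interruption, fail on anything else.
struct DefaultHandler {
    static Action handle(const SysResult& r) {
        return r.lib_code == LibError::Interrupted ? Action::RETRY : Action::FAIL;
    }
};

// Stateless non-throwing policy: retry on interruption, otherwise just report.
struct NoThrowHandler {
    static Action handle(const SysResult& r) {
        return r.lib_code == LibError::Interrupted ? Action::RETRY : Action::CONTINUE;
    }
};

// Runs op (assumed to return 0 on success or an errno value) under Handler.
template <class Handler, class Op>
void invoke(const char* what, Op op, SysResult& result) {
    for (;;) {
        int err = op();
        result.sys_code = err;
        result.lib_code = translate(err);
        result.message =
            err ? std::string(what) + ": " + std::strerror(err) : std::string();
        if (result.ok()) return;
        switch (Handler::handle(result)) {
            case Action::RETRY:    continue;  // restart the operation
            case Action::CONTINUE: return;    // caller inspects result
            case Action::FAIL:
                throw std::system_error(err, std::system_category(), what);
        }
    }
}
```

Because the handlers are stateless, they are passed as template arguments here; passing a handler object as a function parameter would work just as well.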
It is not without fault though. Here are some problems (some of which, it seems, are shared by your proposal above):

i) Assuming that your "high-level" system operation is implemented in terms of several system calls, what is a good way of handling system-specific error codes? My error handlers could handle not only the "high-level" library errors but also the low-level system errnos. However, as you can see, this breaks encapsulation by making assumptions about how the function is implemented in terms of the underlying syscalls.

ii) The same issue applies equally well to other functions implemented in terms of several "high-level" (library) system calls. To be able to use the SysResult effectively, you would need to know which function generated it, again breaking encapsulation.

What do you think?

Tom