Re: [boost] [asio][filesystem] system error handling

23 May 2006

Chris,

...
> I have been thinking about how to reconcile Boost.Asio error
> handling with the system error classes defined in the TR2
> filesystem proposal N1975. I have some concerns which I'll
> outline in the context of how I plan to use them.
>
> The current plan is as follows:
> - Synchronous functions would by default throw a system_error
>   exception.
> - An overload of each synchronous function would take an
>   error_code& as the final argument. On failure these functions
>   set the error_code to the appropriate error. On success, the
>   error_code is set to represent the no error state. These
>   functions do not throw exceptions.
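
To make the two-overload plan concrete, here is a minimal sketch. It uses std::error_code and std::system_error from the C++11 <system_error> header as stand-ins for the N1975 classes, which is an assumption on my part. The function size_of_file and the helper last_error are hypothetical names chosen only for illustration; they are not part of Asio or the filesystem proposal.

```cpp
#include <cerrno>
#include <cstdio>
#include <string>
#include <system_error>

// Hypothetical helper: wrap the current errno in an error_code.
// Falls back to EIO if the failing call did not set errno.
inline std::error_code last_error() noexcept
{
    return std::error_code(errno != 0 ? errno : EIO, std::generic_category());
}

// Non-throwing overload: the outcome is reported through ec.
// Hypothetical function, used only to show the calling convention.
long size_of_file(const std::string& path, std::error_code& ec) noexcept
{
    ec.clear();                       // success leaves ec in the "no error" state
    errno = 0;
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (!f) {
        ec = last_error();
        return -1;
    }
    long size = -1;
    if (std::fseek(f, 0, SEEK_END) != 0 || (size = std::ftell(f)) < 0) {
        ec = last_error();
        size = -1;
    }
    std::fclose(f);
    return size;
}

// Throwing overload: implemented on top of the non-throwing one.
long size_of_file(const std::string& path)
{
    std::error_code ec;
    const long size = size_of_file(path, ec);
    if (ec)
        throw std::system_error(ec, "size_of_file(" + path + ")");
    return size;
}
```

Writing the throwing overload in terms of the error_code& overload keeps the error detection in one place, so the two versions cannot drift apart.
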

Here's a description of the approach I've taken when designing a portable 
system library:

There were several problems that I was trying to solve:
a)  I was trying to minimize the amount of boilerplate error handling code 
which gets written, or even worse, which doesn't get written, because it's so 
boring to write it!

b)  Unify the error reporting infrastructure under Unix and Windows based 
systems.

c)  Make it easy to handle different classes of errors.

An example of this would be the Unix read() call:

How many times have you seen people write 

while (...)
{
  ssize_t res = ::read(...);

  if (res < 0)  // read() returns -1 on error; 0 means end of file
  {
    switch (errno)
    {
      case EINTR: continue;  // interrupted by a signal: restart the read
      case ...
    }
  }
}

So wrt errors, we would like some types of errors to (usually) automatically 
restart the operation that was attempted, while other "exceptional cases" 
(like EOF) are not critical and probably should not cause an exception, and 
still others should just be handled as critical errors (and e.g. throw).
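
The three classes of outcome can be sketched roughly as follows. This is only an illustration under my own assumptions: read_some and ReadStatus are hypothetical names, and the read operation is passed in as a callable with the shape of ::read (returning -1 and setting errno on failure) so that the sketch does not depend on any particular platform header.

```cpp
#include <cerrno>
#include <cstddef>
#include <system_error>

// Hypothetical result type: the two non-throwing outcomes of a read.
enum class ReadStatus { Data, EndOfFile };

// ReadFn is any callable long(void* buf, std::size_t len) that behaves like
// ::read bound to a descriptor: > 0 bytes read, 0 at EOF, -1 plus errno on error.
template <typename ReadFn>
ReadStatus read_some(ReadFn do_read, void* buf, std::size_t len, std::size_t& got)
{
    for (;;) {
        const long res = do_read(buf, len);
        if (res > 0) {                    // normal case
            got = static_cast<std::size_t>(res);
            return ReadStatus::Data;
        }
        if (res == 0) {                   // EOF: exceptional, but not an error
            got = 0;
            return ReadStatus::EndOfFile;
        }
        const int err = errno;            // capture before anything can change it
        if (err == EINTR)
            continue;                     // restartable: just try again
        throw std::system_error(err, std::generic_category(), "read");
    }
}
```

The caller then only has to distinguish data from end of file; the EINTR loop and the critical-error path are written once.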

The other problem was that Windows and Unix have different models of 
handling errors.  On Unix, errors are "syscall sensitive", i.e. there is a 
limited number of error codes and they sometimes mean slightly different 
things depending on which system call was invoked.  On the other hand, on 
Windows the error space is "flat" and any system call can essentially return 
any of the (10000 or so) error values.

My design was as follows:

1)  I also used an extra parameter, SysResult &, and additionally provided a second 
parameter called ErrorHandler.  This was a stateless class which would define 
the behaviour in case of an error.  In our programs, I've observed that in 
most places return codes would get handled in a very similar fashion, so we 
required only a few different variations of ErrorHandlers.  

2)  Upon invocation, the function call would get executed and do its thing.  
If there was an error, the error handler is to be assigned several values:  
  a) The "library" error code.  This was a high level translation of the 
common error codes into a system independent enumeration (which, btw, was 
using a "smart enum" class, which would have an associated error string with 
each value)
  b) The system error code.  This was the error code of the last function that 
caused the problem (e.g. errno)
  c) A string that would describe the error in more detail.  IMHO it is very 
important for a system to provide accurate and legible errors which make it 
easy to diagnose the source of the problem even if it propagated from deep 
within some library (instead of just returning "EACCES".  What am I supposed 
to do if the system just logs "EACCES"?)  This would be a concatenation of 
the "library" error code string, the system error code string (e.g. as 
returned by strerror() or FormatMessage()) and any other contextual 
information (e.g. the name of the function called, what the function was 
trying to do and a suggestion of what could be the cause of the problem).  
This is because, for example, when mapping a file, one has to go through 
several system calls to return a memory mapped file object, and one would 
like to know exactly in what part of that function the call failed.

  d) The error handler would then get invoked with the SysResult and it could 
decide based either on the "System independent" or "system specific" error 
code, what it should do.  The error handler would return one 
of "CONTINUE", "RETRY", "FAIL" enumerations.  The first and second are, I 
think, self-explanatory.  The third would cause the surrounding code to throw 
an exception, similar to system_error.  My default error handlers would do 
something sensible like retry on EINTR and throw on other error 
conditions.  I also had a non-throwing version.
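
A rough reconstruction of the design described in 1) and 2) might look like the sketch below. Only SysResult, ErrorHandler and the CONTINUE/RETRY/FAIL outcomes come from the description above; everything else (LibError, translate, to_string, call_with_handler, the two handler classes, storing the values in SysResult, and the convention that the wrapped call returns 0 or an errno-style code) is a hypothetical stand-in, and std::system_error is used for the thrown exception.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <system_error>

enum class Action { Continue, Retry, Fail };
enum class LibError { None, Interrupted, AccessDenied, Other };  // portable codes

struct SysResult {
    LibError    lib_code = LibError::None;  // a) system independent code
    int         sys_code = 0;               // b) system specific code, e.g. errno
    std::string message;                    // c) library text + system text + context
    bool ok() const { return lib_code == LibError::None; }
};

inline LibError translate(int sys_code)
{
    switch (sys_code) {
        case 0:      return LibError::None;
        case EINTR:  return LibError::Interrupted;
        case EACCES: return LibError::AccessDenied;
        default:     return LibError::Other;
    }
}

inline const char* to_string(LibError e)
{
    switch (e) {
        case LibError::None:         return "no error";
        case LibError::Interrupted:  return "operation interrupted";
        case LibError::AccessDenied: return "access denied";
        default:                     return "unclassified error";
    }
}

// Stateless handlers: retry on EINTR, then either throw or hand back the result.
struct ThrowingHandler {
    static Action on_error(const SysResult& r)
    { return r.lib_code == LibError::Interrupted ? Action::Retry : Action::Fail; }
};
struct NonThrowingHandler {
    static Action on_error(const SysResult& r)
    { return r.lib_code == LibError::Interrupted ? Action::Retry : Action::Continue; }
};

// Call returns 0 on success or an errno-style code on failure (an assumption).
template <typename Handler, typename Call>
void call_with_handler(Call call, const std::string& context, SysResult& result)
{
    for (;;) {
        const int err = call();
        result.sys_code = err;
        result.lib_code = translate(err);
        result.message.clear();
        if (err == 0)
            return;
        result.message = std::string(to_string(result.lib_code)) + ": " +
                         std::strerror(err) + " (" + context + ")";
        switch (Handler::on_error(result)) {        // d) let the policy decide
            case Action::Retry:    continue;        // restarts the for loop
            case Action::Continue: return;          // caller inspects result
            case Action::Fail:
                throw std::system_error(err, std::generic_category(), result.message);
        }
    }
}
```

A call site would then read something like call_with_handler<NonThrowingHandler>(op, "mapping file: open", result), followed by a check of result.ok().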

Assuming there was no exception, it was easy to check the result of an 
operation, just like you describe.

This system, while far from perfect, seems to work quite well.  It is not 
without fault though - here are some problems (some of which, it seems, are 
shared by your proposal above):

 i)  Assuming that your "high-level" system operation is implemented in terms 
of several system calls, what is a good way of handling system-specific error 
codes?  My error handlers had the possibility of handling not only the 
"high-level" library errors but also the low-level system errnos.  However, as 
you can see, this breaks encapsulation by making assumptions about how the 
function is implemented in terms of the underlying syscalls.

 ii)  The same issue I described above applies equally well to other functions 
implemented in terms of several "high-level" (library) system calls.  To be 
able to use the SysResult effectively, you would need to know which function 
generated it, again breaking encapsulation.

What do you think?

Tom

Tomas Puverle