[wsf-c-dev] Error handling

James Clark james at wso2.com
Thu Dec 7 01:44:43 PST 2006


There are a number of factors that combine to make our current error
handling strategy painful.

1. Resources have to be cleaned up manually when an error is detected.
The sort of pattern described here is common:

http://www.freetype.org/david/reliable-c.html#paranoid

It's bad enough to have to check every function call for an error.  But
having to free exactly the right set of resources each time a function
call returns an error makes it much worse.  C++ destructors provide one
solution.  Apache-style pools are another.  The solution needs to be
part of the memory management discussion, but whatever we choose for
memory management, I think a hard requirement is that it provide a way
to avoid manually cleaning up resources on an error.
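
As an illustration of the pool idea, here is a minimal sketch of a
cleanup list: each resource is registered once, together with the
function that releases it, and a single call releases everything.  All
names (pool, pool_register, pool_clear) are hypothetical; this is not
the APR or axis2 API, and a real pool would normally own the memory
allocations themselves as well.

```c
#include <stdlib.h>

/* Hypothetical names throughout; not part of any existing API. */
typedef struct pool_entry {
    void (*cleanup)(void *);   /* how to release the resource */
    void *resource;            /* the resource itself */
    struct pool_entry *next;
} pool_entry;

typedef struct pool {
    pool_entry *head;          /* most recently registered entry first */
} pool;

/* Register a resource with the pool.  Returns 0 on success, -1 if the
   bookkeeping entry itself could not be allocated. */
static int pool_register(pool *p, void *resource, void (*cleanup)(void *))
{
    pool_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->cleanup = cleanup;
    e->resource = resource;
    e->next = p->head;
    p->head = e;
    return 0;
}

/* Release every registered resource, in reverse order of registration. */
static void pool_clear(pool *p)
{
    while (p->head != NULL) {
        pool_entry *e = p->head;
        p->head = e->next;
        e->cleanup(e->resource);
        free(e);
    }
}
```

With something like this, an error path reduces to one pool_clear call
no matter how many resources the function had acquired by that point.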

2. Even normally fatal errors are handled by returning an error code. In
a typical command-line application, an out of memory error would be
handled by printing an error message, possibly freeing up some resources
(such as temp files) and then exiting.  Other errors that the
application can't typically do anything about such as assertion failures
are handled in a similar way.  Of course, in a general-purpose robust
library, we can't simply exit when we encounter this kind of error.
However, handling this kind of error by returning an error code is
awkward.  Since there is the potential for this kind of error in almost
every function (it's possible for almost any function to be supplied
with an invalid argument), this means that almost every single function
has to return an error code. Yet in almost all cases the only thing that
the caller can do with the error code is pass it back up to its caller
(after freeing up any resources that it allocated).

So I think we need a better way of handling fatal errors.  What I would
suggest first of all is that we add to the environment a fatal error
handler, which would be called when a fatal error (such as out of
memory) is encountered. The fatal error handler would be required not to
return.  The user can set the fatal error handler to do whatever they
want. There would be a default handler that would do something sensible
like print a message to stderr and exit.
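
A minimal sketch of what this could look like, using a stand-in
demo_env_t rather than the real axis2_env_t.  The field names
fatal_error_handler and fatal_error_handler_data match the ones used
further down; fatal_error_message, demo_fatal and demo_malloc are
assumptions made up for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for axis2_env_t; only the fields relevant here. */
typedef struct demo_env {
    void (*fatal_error_handler)(struct demo_env *env);
    void *fatal_error_handler_data;
    const char *fatal_error_message;  /* hypothetical: error details */
} demo_env_t;

/* Default handler: print a message and exit.  Does not return. */
static void demo_default_fatal_error_handler(demo_env_t *env)
{
    fprintf(stderr, "fatal error: %s\n",
            env->fatal_error_message ? env->fatal_error_message : "unknown");
    exit(EXIT_FAILURE);
}

/* Install the default handler in a fresh environment. */
static void demo_env_init(demo_env_t *env)
{
    env->fatal_error_handler = demo_default_fatal_error_handler;
    env->fatal_error_handler_data = NULL;
    env->fatal_error_message = NULL;
}

/* Report a fatal error.  Never returns to the caller. */
static void demo_fatal(demo_env_t *env, const char *message)
{
    env->fatal_error_message = message;
    env->fatal_error_handler(env);
    abort();  /* reached only if a handler breaks the contract and returns */
}

/* Allocation wrapper: callers do not need to check for NULL. */
static void *demo_malloc(demo_env_t *env, size_t size)
{
    void *p = malloc(size);
    if (p == NULL && size != 0)
        demo_fatal(env, "out of memory");
    return p;
}
```

The point of the wrapper is that code calling demo_malloc has no error
path of its own to write or to get wrong.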

Next, we would provide facilities for trapping fatal errors.  This would
be used sparingly.  For example, in a server context, we might trap
fatal errors at each point where the axis2 engine is invoked to
handle a message, so that we can recover by aborting the processing of
the message that caused the "fatal" error.  We wouldn't trap the errors
at each point where we wanted to free a resource: that would be handled
by our resource cleanup machinery.

This fatal error trapping mechanism might take the form of a macro
AXIS2_TRAP which would be used like this:

void foo(axis2_env_t *env)
{
  axis2_status_t status;
  AXIS2_TRAP(status, env, bar(env));
  if (status == AXIS2_FAILURE) {
     // handle the fatal error that's been caught
     // details of the error are in env
  }
}

One possible portable implementation of the AXIS2_TRAP macro would be in
terms of setjmp/longjmp, something like this:

#define AXIS2_TRAP(status, env, stmt)					\
    do {								\
      jmp_buf jb;							\
      void (*prev_handler)(axis2_env_t *) = (env)->fatal_error_handler;	\
      void *prev_handler_data = (env)->fatal_error_handler_data;	\
      (env)->fatal_error_handler_data = &jb;				\
      (env)->fatal_error_handler = axis2_longjmp_fatal_error_handler;	\
      if (setjmp(jb) == 0) {						\
	stmt;								\
	(status) = AXIS2_SUCCESS;					\
      }									\
      else								\
	(status) = AXIS2_FAILURE;					\
      /* restore the previous handler on both paths */			\
      (env)->fatal_error_handler = prev_handler;			\
      (env)->fatal_error_handler_data = prev_handler_data;		\
    } while (0)


void axis2_longjmp_fatal_error_handler(axis2_env_t *env)
{
  /* the handler data points at the jmp_buf of the innermost trap */
  longjmp(*(jmp_buf *)env->fatal_error_handler_data, 1);
}

Note that setjmp/longjmp don't typically work with C++ destructors, so
we wouldn't be able to do this if we wanted to use C++ destructors as
our resource cleanup mechanism.
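
To check the control flow end to end, here is a self-contained sketch of
the same setjmp/longjmp technique written as a function that takes a
callback instead of a macro that takes a statement.  demo_env_t,
demo_status_t, demo_trap and the two sample callbacks are stand-ins
invented for this example, not axis2 names.

```c
#include <setjmp.h>
#include <stddef.h>

typedef enum { DEMO_FAILURE = 0, DEMO_SUCCESS = 1 } demo_status_t;

/* Stand-in for axis2_env_t; only the fields the trap needs. */
typedef struct demo_env {
    void (*fatal_error_handler)(struct demo_env *env);
    void *fatal_error_handler_data;
} demo_env_t;

/* Handler installed while a trap is active: jump back to the trap. */
static void demo_longjmp_handler(demo_env_t *env)
{
    longjmp(*(jmp_buf *)env->fatal_error_handler_data, 1);
}

/* Run fn(env), turning a fatal error raised inside it into DEMO_FAILURE. */
static demo_status_t demo_trap(demo_env_t *env, void (*fn)(demo_env_t *))
{
    jmp_buf jb;
    demo_status_t status;
    void (*prev_handler)(demo_env_t *) = env->fatal_error_handler;
    void *prev_data = env->fatal_error_handler_data;

    env->fatal_error_handler_data = &jb;
    env->fatal_error_handler = demo_longjmp_handler;
    if (setjmp(jb) == 0) {
        fn(env);                 /* normal path */
        status = DEMO_SUCCESS;
    } else {
        status = DEMO_FAILURE;   /* arrived here via longjmp */
    }
    /* Restore the outer handler on both paths so traps can nest. */
    env->fatal_error_handler = prev_handler;
    env->fatal_error_handler_data = prev_data;
    return status;
}

/* Sample callbacks: one succeeds, one raises a fatal error. */
static void demo_ok(demo_env_t *env) { (void)env; }
static void demo_fails(demo_env_t *env) { env->fatal_error_handler(env); }
```

Calling demo_trap(env, demo_fails) returns DEMO_FAILURE and leaves the
previously installed handler in place, which is what allows an outer
trap to keep working after an inner one has fired.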

On Windows, an alternative implementation of AXIS2_TRAP would be to use
Windows SEH (structured exception handling).

Obviously, this sort of error handling assumes we have a solution to
the resource cleanup problem I described above.

There are a couple of other frills that I think would be useful:

- a way to trap only a specific set of fatal errors

- an easy, efficient way for a function to augment the information in an
exception thrown by a function that it calls without having to trap and
rethrow it (e.g. a function should be able to turn an "out of memory"
error into an "out of memory while processing message id XXX from IP
address 1.2.3.4" error).
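
One way the second frill might work, sketched under the assumption that
the environment carries a small stack of context strings: a function
pushes a description of what it is doing before calling down and pops it
afterwards, and whoever reports the error appends the active contexts.
Every name here (demo_env_t, demo_push_context, demo_pop_context,
demo_format_error) is hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_CONTEXT 8

/* Stand-in environment holding only the context stack. */
typedef struct demo_env {
    const char *context[DEMO_MAX_CONTEXT];  /* innermost context last */
    int depth;
} demo_env_t;

/* Record what the caller is doing: one store, no setjmp involved. */
static void demo_push_context(demo_env_t *env, const char *what)
{
    if (env->depth < DEMO_MAX_CONTEXT)
        env->context[env->depth] = what;
    env->depth++;   /* still counted when full, so pops stay balanced */
}

static void demo_pop_context(demo_env_t *env)
{
    if (env->depth > 0)
        env->depth--;
}

/* Write "<error> while <innermost> while <outer> ..." into buf.
   size must be at least 1. */
static void demo_format_error(const demo_env_t *env, const char *error,
                              char *buf, size_t size)
{
    int n = env->depth < DEMO_MAX_CONTEXT ? env->depth : DEMO_MAX_CONTEXT;
    int i;

    snprintf(buf, size, "%s", error);
    for (i = n - 1; i >= 0; i--) {
        size_t used = strlen(buf);
        snprintf(buf + used, size - used, " while %s", env->context[i]);
    }
}
```

A trap would need to record the depth before running its statement and
restore it afterwards, since a longjmp skips the pops in between.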

3. The compiler does not tell us if we forget to check the return value
of a function that returns an error code.  This is solvable by using the
gcc warn_unused_result attribute and by having a convention that *all*
functions that can return an error return axis2_status_t. In particular,
we would not have functions returning a NULL pointer on failure (with
the fatal error handling scheme above, the need for this should become
much rarer). Then we can have a macro

#ifdef __GNUC__
#define AXIS2_MAY_FAIL __attribute__((warn_unused_result)) axis2_status_t
#else
#define AXIS2_MAY_FAIL axis2_status_t
#endif

Then we just declare our functions as

AXIS2_MAY_FAIL axis2_foo(axis2_env_t *env,...);

instead of

axis2_status_t axis2_foo(axis2_env_t *env,...);

If we do this, then gcc will give us a warning whenever a caller
ignores the return value of such a function.
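
A small self-contained sketch of the convention in use.  The DEMO_ names
and demo_parse_digit are made up for this example, and the value is
returned through an out parameter so that the return value can be
reserved for the status.

```c
typedef int demo_status_t;
#define DEMO_FAILURE 0
#define DEMO_SUCCESS 1

#ifdef __GNUC__
#define DEMO_MAY_FAIL __attribute__((warn_unused_result)) demo_status_t
#else
#define DEMO_MAY_FAIL demo_status_t
#endif

/* Convert a decimal digit character to its value.  The result goes in
   *out; the return value only says whether the conversion worked. */
static DEMO_MAY_FAIL demo_parse_digit(char c, int *out)
{
    if (c < '0' || c > '9')
        return DEMO_FAILURE;
    *out = c - '0';
    return DEMO_SUCCESS;
}

/* Sum two digit characters, propagating failure to the caller. */
static DEMO_MAY_FAIL demo_add_digits(char a, char b, int *sum)
{
    int x, y;

    /* Writing "demo_parse_digit(a, &x);" as a bare statement here is
       the mistake the attribute is meant to catch under gcc. */
    if (demo_parse_digit(a, &x) == DEMO_FAILURE)
        return DEMO_FAILURE;
    if (demo_parse_digit(b, &y) == DEMO_FAILURE)
        return DEMO_FAILURE;
    *sum = x + y;
    return DEMO_SUCCESS;
}
```

Compilers that do not define __GNUC__ see a plain demo_status_t return
type, so the declarations stay portable.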

James





