Error Conditions in Code


You must handle all error conditions in your code. Error handling must be taken seriously.

A bug results when code fails to handle an error condition. Error conditions themselves are distinct from bugs because they can, and always will, occur. The database file you want to write to could have been deleted, the disk could fill up, or a web service may not be available.

An error falls into one of three categories: user error, programmer error, or exceptional circumstance.

A good program will handle a user error by pointing out the mistake and helping the user rectify it.
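As a minimal sketch of this idea, the function below validates a piece of user input and, when the input is wrong, says what the mistake was and what a valid value looks like. The function name parse_port and the port-number scenario are assumptions made for illustration only.

```python
def parse_port(text: str) -> int:
    """Convert user input to a TCP port number.

    Raises ValueError with a message that names the mistake and
    suggests how to fix it.
    """
    try:
        port = int(text.strip())
    except ValueError:
        # Point out the mistake and show an example of valid input
        raise ValueError(
            f"{text!r} is not a whole number; enter a port such as 8080"
        ) from None
    if not 1 <= port <= 65535:
        raise ValueError(
            f"port {port} is out of range; choose a value from 1 to 65535"
        )
    return port
```

A user interface would typically catch the ValueError, show its message, and prompt again.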

Programmer errors should ideally never occur, and the user can do nothing about them. Defensive programming is important to stop the cycle of unhandled errors causing further error conditions.
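One way to picture defensive programming is a precondition check that rejects a caller's mistake at the point where it happens, instead of letting it surface later as a less obvious failure. The sketch below is only an illustration; the function name mean is an assumption, not something from the original text.

```python
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Return the arithmetic mean of a non-empty sequence."""
    # Defensive check: reject the caller's mistake here, with a clear
    # message, rather than letting it surface as a ZeroDivisionError below
    if len(values) == 0:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)
```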

Exceptional circumstances include things like the network connection failing or the hard disk running out of space.

Errors are raised by subordinate components and communicated upward to be dealt with by the caller. In general, to take control of program execution, we need to have the subordinate components raise an error when something goes wrong. We need the caller to detect all possible errors and handle them appropriately, and if it cannot handle them, propagate the error upward to be handled by a different caller.
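The relationship between a subordinate component and its caller might look like the following sketch. All names here (ConfigError, read_setting, get_timeout, the 30 second default) are hypothetical. The caller handles the one error it knows how to recover from and lets anything else travel upward.

```python
class ConfigError(Exception):
    """Hypothetical error raised by the subordinate component."""


def read_setting(settings: dict, key: str) -> str:
    # Subordinate component: raise an error when something goes wrong
    if key not in settings:
        raise ConfigError(f"missing setting: {key}")
    return settings[key]


def get_timeout(settings: dict) -> float:
    # Caller: a missing setting can be handled here with a default value
    try:
        raw = read_setting(settings, "timeout")
    except ConfigError:
        return 30.0
    # A malformed value raises ValueError, which this function cannot
    # sensibly fix, so it propagates upward to a different caller
    return float(raw)
```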

Locality of error


An error is local in time if it is discovered soon after it is created. An error is local in space if it is identified very close to the site where it manifests.

Never simply ignore an error condition. If you do not know how to handle the problem, then signal a failure back up to the calling code.

A simple mechanism is to return a success/failure value from your function. A more advanced approach is to enumerate all of the possible statuses and return a corresponding reason code: one value means success and the others are the various abortive cases. This can get messy for functions that also need to return data. A different approach is to set a global variable that the caller can inspect to see whether things worked, but this can cause confusion and be a source of bugs.
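A reason-code scheme could be sketched as follows. The Status enumeration and the lookup function are illustrative assumptions. Note how the function has to return a pair so that the data can travel alongside the status, which hints at why this style gets messy.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical reason codes: one success value, the rest failures."""
    OK = 0
    NOT_FOUND = 1
    INVALID_KEY = 2


def lookup(table: dict, key: str):
    """Return a (status, value) pair; value is None unless status is OK."""
    if not key:
        return Status.INVALID_KEY, None
    if key not in table:
        return Status.NOT_FOUND, None
    return Status.OK, table[key]
```

Every caller must remember to inspect the status before using the value, for example: status, value = lookup(table, "name"), followed by a check that status is Status.OK.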

Exceptions are a language facility for managing errors. When your code encounters a problem that it cannot handle, it throws an exception. The runtime then steps back up the call stack until it finds some exception-handling code. What happens after the exception is handled varies, and there are two operational models: the termination model and the resumption model.
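Python follows the termination model, so it can be used to sketch how an exception travels up the call stack. In the illustration below (the function names and the events list are assumptions made for the example), the code after the failing call never runs, cleanup code runs as the stack unwinds, and execution continues after the handler rather than back at the point of the throw.

```python
events = []


def inner():
    events.append("inner: raising")
    raise RuntimeError("cannot continue")


def middle():
    try:
        inner()
        events.append("middle: after inner")  # skipped: inner() raised
    finally:
        # Runs while the stack unwinds, even though middle() has no handler
        events.append("middle: cleanup")


def outer():
    try:
        middle()
    except RuntimeError as exc:
        events.append(f"outer: handled {exc}")
    # Termination model: execution resumes here, not inside inner()
    events.append("outer: continues after handler")
```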

Resilient code must be exception safe. There are three levels of exception safety: the basic guarantee (if an exception occurs, the code leaks no resources and leaves its objects in a consistent state), the strong guarantee (if an exception occurs, program state is left exactly as it was before the operation began), and the nothrow guarantee (the operation can never throw an exception).
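One common way to aim for the strong guarantee is to do all the work that might fail on a temporary copy and only then commit the result. The sketch below assumes a hypothetical helper, add_all_strong, and assumes that the final extend step does not fail under normal conditions.

```python
def add_all_strong(target: list, items, convert) -> None:
    """Append convert(item) for every item, or change nothing at all."""
    # Stage the results first; if convert() raises, target is untouched
    staged = [convert(item) for item in items]
    # Commit step: assumed not to raise under normal conditions
    target.extend(staged)
```

If the conversions were instead appended to target one at a time, a failure halfway through would leave target partly modified, which would satisfy only the basic guarantee.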

Signals are a more extreme reporting mechanism usually for errors sent by the execution environment. The operating system traps a number of exceptional events and delivers them to the application. The program could receive a signal at any time and must be able to cope with it.

Never ignore any errors that might be reported to you.

Two schools of thought for handling errors are to handle them ASAP or as late as possible. The ASAP approach is a self-documenting code technique.

The best way may be to handle each error in the most appropriate context: as soon as you know enough about it to deal with it correctly.

Logging is one reaction to any error. Any reasonably large project should be employing a logging facility. It allows collecting important trace information and investigation of nasty problems. The log exists to record interesting events. All errors should be detailed because they are some of the most interesting and telling events.
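With Python's standard logging module, recording an error in detail might look like the sketch below. The logger name "orders" and the function load_order are hypothetical. Calling logger.exception inside an except block logs at ERROR level and attaches the traceback.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name


def load_order(path: str) -> str:
    """Read an order file, logging the details of any I/O failure."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        # Record what failed and where, including the traceback
        logger.exception("could not read order file %s", path)
        # Logging is only a record; the caller still has to deal with it
        raise
```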

For problems that only the user can fix, the problem should be reported immediately to the user so that they can resolve the situation. This does not apply to a deeply embedded system, e.g. a dialog box is not going to pop up on a washing machine.

Ignoring errors does not save time. It may well be a major cause of bugs in software packages, and far longer will be spent later working out the cause of bad program behavior.

There are two ways to propagate an error upward. You can pass on the same information you were given, or you can reinterpret it and send a more meaningful message to the next level up.
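Both options can be sketched in Python. The names StorageError, read_raw, load_profile_passthrough, and load_profile_translated are assumptions for the example. The second version uses exception chaining (raise ... from) so that the original low-level error stays attached to the more meaningful one.

```python
class StorageError(Exception):
    """Hypothetical higher-level error that is meaningful to the caller."""


def read_raw(path: str) -> bytes:
    with open(path, "rb") as handle:
        return handle.read()


def load_profile_passthrough(path: str) -> bytes:
    # Option 1: let the original OSError propagate unchanged
    return read_raw(path)


def load_profile_translated(path: str) -> bytes:
    # Option 2: reinterpret the low-level error for the next level up
    try:
        return read_raw(path)
    except OSError as exc:
        raise StorageError(f"profile unavailable: {path}") from exc
```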

Summary


Good programmers are thorough and write the error-handling code as they write the main code. Bad programmers take a haphazard approach to writing code, don't think about or review what they are doing, and ignore errors that arise. Bad programmers end up conducting lengthy debugging sessions to track down crashes caused by not considering error conditions in the first place.

Article notes

What are the three categories of errors according to Code Craft?
What will a good program do in response to a user error?
What programming technique helps stop the cycle of unhandled errors causing further error conditions?
What kind of errors should ideally never occur?
When something goes wrong in a subordinate component, what should it do?
What should a caller do if it cannot handle an error propagated upward by a subordinate component?
What does it mean for an error to be local in time?
What does it mean for an error to be local in space?
What is an effect on the code by handling errors ASAP?
When a caller cannot handle an error propagated to it and needs to propagate it up, what are the two ways it can do that?
What is the part of the code that handles errors?
What do bad programmers end up doing when they do not consider error conditions in the first place?