It is very tempting to assume that a system will behave as perfectly in the field as it does on the engineering bench. During development, embedded software is written under the best of conditions. The developer knows how the system is supposed to work, and development usually proceeds smoothly. However, as thousands of devices get into the hands of users, it becomes statistically more likely that the unexpected will happen. In today’s post, let’s explore the strategies that developers need to write software that can handle unexpected errors.
Strategy #1 - Constantly consider what can go wrong
The first strategy that developers need to embrace to handle errors is to actively question what can go wrong as they write every single line of code. For example, the moment I write the implementation for a function like:
void Dio_WriteChannel(DioChannel_t Channel, bool state)
{
    // Additional code goes here
}
I should consider these questions:
- What happens if the Channel parameter goes out of range?
- Should the function be returning an error code or success flag?
- How do I validate that the desired channel state has changed?
- What do I do if the state should have changed but didn’t?
- Can memory become corrupted such that my bool state variable is something other than true or false? If so, how do I handle that condition?
- Is an assertion enough to check boundary conditions at development time or should there be run-time checks on the parameters as well?
That’s a lot of questions for such a simple and common block of code and we haven’t even started to fill in the details! If you want to handle errors successfully, you have to be constantly questioning the code and what could go wrong.
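To make a few of those questions concrete, here is a minimal sketch of one possible set of answers. It is not the only way to do it: the Dio_WriteChannelChecked name, the DioStatus_t error codes, the three-channel enum, and the SimulatedPortReg variable are all assumptions made up for illustration, and the "register" is an ordinary variable so the example can run on a host PC. On real hardware the read-back would come from the port's input register instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical channel list; the real DioChannel_t is project specific. */
typedef enum {
    DIO_CHANNEL_0,
    DIO_CHANNEL_1,
    DIO_CHANNEL_2,
    DIO_CHANNEL_MAX   /* Number of channels, not a valid channel */
} DioChannel_t;

/* Hypothetical status codes answering "should the function return an error?" */
typedef enum {
    DIO_OK = 0,
    DIO_ERR_CHANNEL,  /* Channel parameter was out of range */
    DIO_ERR_VERIFY    /* Pin did not reach the requested state */
} DioStatus_t;

/* Assumption: stands in for a memory-mapped port register. */
static volatile uint32_t SimulatedPortReg = 0u;

DioStatus_t Dio_WriteChannelChecked(DioChannel_t Channel, bool State)
{
    /* Development-time check on the configuration; compiled out with NDEBUG. */
    assert(DIO_CHANNEL_MAX <= 32);

    /* Run-time check on the parameter; this one stays in release builds. */
    if ((unsigned)Channel >= (unsigned)DIO_CHANNEL_MAX) {
        return DIO_ERR_CHANNEL;
    }

    uint32_t mask = (uint32_t)1u << (unsigned)Channel;

    if (State) {
        SimulatedPortReg |= mask;
    } else {
        SimulatedPortReg &= ~mask;
    }

    /* Read the bit back to validate that the state actually changed. */
    bool actual = (SimulatedPortReg & mask) != 0u;

    return (actual == State) ? DIO_OK : DIO_ERR_VERIFY;
}
```

A caller would then check the result, for example `if (Dio_WriteChannelChecked(DIO_CHANNEL_1, true) != DIO_OK) { /* recover or report */ }`. What the recovery should be is exactly the kind of question that still has to be answered per product.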
Strategy #2 - Document your concerns and questions using TODO
As software is being developed, there are sometimes more questions than can be immediately answered. In the above example, there may not be an answer yet for how returned errors are going to be handled. It would be easy to skip over this problem for now and then, as other coding issues accumulate, forget about it entirely.
One way to capture problems as they appear is to sprinkle the concerns or questions in the code comments. Most modern integrated development environments (IDEs) have customizable tags, such as TODO, that can be pulled from the code to create a task list. These will show up as informational messages.
If there is an error that needs to be handled but I’m not sure how to do it, I will use the TODO tag. If there is an implementation, but I want to review it, I will probably use a TODO or maybe some other easily searchable keyword. Developers should take care not to overuse the TODO informational message, but it remains a good and easy way to keep track of questions or issues on the fly. It is true that external trackers can be used, but I find it’s far easier to keep it within the code so it can be easily seen by developers in code reviews.
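As a rough illustration of what this can look like, here is the earlier function with its open questions captured as TODO comments. Everything beyond the function signature is an assumption for the sake of a runnable example: the DioChannel_t typedef, the DIO_CHANNEL_COUNT value, and the PinState array that stands in for real hardware.

```c
#include <stdbool.h>

/* Assumption: the article does not define DioChannel_t. */
typedef unsigned int DioChannel_t;

/* Hypothetical channel count, for illustration only. */
#define DIO_CHANNEL_COUNT 8u

/* Host stand-in for the hardware pin states. */
static bool PinState[DIO_CHANNEL_COUNT];

void Dio_WriteChannel(DioChannel_t Channel, bool state)
{
    /* TODO: Decide how an out-of-range Channel is reported to the caller.
     *       It is silently ignored for now. */
    if (Channel >= DIO_CHANNEL_COUNT) {
        return;
    }

    /* TODO: Read the pin back and decide what to do if it did not change. */
    PinState[Channel] = state;
}
```

Because each open question sits next to the line it concerns, a search for TODO (or the IDE's task list) turns the comments into a checklist, and a reviewer sees the unresolved decision in context.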
Strategy #3 - Lose the “I’ll go back later” attitude
Many times, I’ve been told, “We know this isn’t the right way to do this, but we will go back later and fix it!” I’ve heard this from both entry-level engineers and experienced engineers who should know better. There is no time like the present to fix something, document it, or implement error checking. There will always be a fire or some issue that is fighting for the developer’s attention. While we may intend to go back and add that error handling later, it seldom happens!
As soon as something appears to work for management, it’s time to move on to the next pressing issue. If it’s working, why would you invest more time in it for diminishing returns? Management doesn’t realize that you didn’t include error checking or that there are gaping holes in the implementation! If the product needs robustness, don’t try to add it later or believe that you can go back and fix it later. Do what must be done while you write the code and then you will sleep better at night knowing there isn’t a hidden error waiting to ruin your week.
The way that a developer approaches writing their software is what determines whether their system will recover from errors gracefully or whether it will metaphorically blow up in the user’s face. The key is having the right development attitude: one that considers what can go wrong and implements the recovery mechanism while the software is being written. I often hear teams say they will go back later to fix and handle errors. It rarely happens, which means deploying a disaster that is just waiting to happen. To get a better handle on errors, deal with what can go wrong the moment it arises. If you don’t, it will probably never get handled.
Jacob Beningo is an embedded software consultant who currently works with clients in more than a dozen countries to dramatically transform their businesses by improving product quality, cost and time to market. He has published more than 200 articles on embedded software development techniques, is a sought-after speaker and technical trainer, and holds three degrees, which include a Master of Engineering from the University of Michigan. Feel free to contact him at firstname.lastname@example.org or at his website www.beningo.com, and sign up for his monthly Embedded Bytes Newsletter.