As a parent, one of my favorite activities is bragging about my kids. My daughter Kriscia is a senior in high school, is in the top 10 percent of her class, and works in an ice cream shop. My daughter Adelyn works in a local boutique, and is on the honor roll at our local community college. And my son Carlos is in medical school.
While Carlos was visiting us this summer, I asked him about the process that doctors use to diagnose their patients. As he explained the process to me, I realized that it could also be applied to engineering failure analysis.
The process is called "differential diagnosis." It consists of four main steps. First, the doctor collects the patient's medical history and makes a list of the patient's symptoms. Second, the doctor makes a list of possible diseases or conditions that might cause each of these symptoms. Third, the doctor organizes this list by priority. The ranking must take into account both the likelihood and the potential severity of each possible condition. Finally, the doctor begins testing for each of these conditions, starting at the top of the list and working down. If the test results rule out a possible condition, the doctor moves on to the next one on the list. Sometimes, test results may prompt the doctor to reprioritize the items on the list.
As my son described this technique to me, it occurred to me that it's very similar to the approach that we as engineers use to diagnose the root causes of mechanical failures.
When somebody brings a broken part to me, one of the first things I do is familiarize myself with the part's function, the materials and manufacturing process used to make it, and the conditions under which it was used. This might be considered the part's "medical history." One exercise I like to do, when possible, is to compare the broken part to an unused part from off the shelf to see if there are any obvious differences; this might be called "checking for pre-existing conditions."
Next, I catalog the symptoms of the failure as accurately as possible. This starts at the "big picture" level: how was the failure detected? Did the product suddenly stop functioning, or did its performance degrade over time? Was functionality restored, or was the failure catastrophic? How long had the product been in service when the failure was detected? How was the product being used when the failure occurred? What were the environmental conditions like?
From there, I carefully examine the part itself, and analyze the symptoms on a more detailed level. Is the part twisted, bent, or otherwise deformed? Are there signs of wear or corrosion? If so, where? Is there any discoloration, or other evidence of possible overheating? If the part is cracked, what do the crack surfaces look like? Does it appear to be a ductile, brittle, or fatigue fracture? Are there multiple cracks, or just one? Do the cracks branch? Where do the cracks appear to originate? In what direction do the cracks appear to propagate? Answering these questions might involve examining the part with a stereo microscope or scanning electron microscope.
After thoroughly investigating and documenting the symptoms of the failure, I put together a list of possible causes. For example, a spalled bearing could be a result of excessive loading, inadequate heat treatment, misalignment, or debris on the bearing surfaces. While putting together this list, I like to work together with a group of people who are knowledgeable about different aspects of the product. As we discuss the symptoms of the failure, someone may suggest a theory that wouldn't have occurred to me. That's the benefit of having multiple viewpoints. Like medicine, engineering is a team activity.
The next step is to organize the list of possible causes in terms of priority. Doctors have a saying: "When you hear hoofbeats, think horses, not zebras." In medicine, this saying means to rule out common diseases before considering rare, obscure ones. As applied to engineering, it means that it's usually best to consider simple explanations before complex ones. However, like doctors, engineers also have to consider the potential severity of each possible explanation. It may be that one possibility, if found to be true, would call for immediate action, such as a production shutdown or a product recall. That possibility, even if considered relatively unlikely, should be put at the top of the list.
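For readers who like to see an idea written down, here is a minimal Python sketch of that ranking step, using the spalled-bearing causes mentioned earlier. It is only an illustration, not a standard method: the Cause class, the prioritize function, the 1-to-5 severity scale, the "urgent" flag, and every number in it are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Cause:
    name: str
    likelihood: float     # rough 0-1 estimate; illustrative values only
    severity: int         # 1 (minor) to 5 (critical); hypothetical scale
    urgent: bool = False  # True if confirming it would force a shutdown or recall


def prioritize(causes):
    """Return causes with urgent ones first, then by likelihood * severity."""
    return sorted(causes, key=lambda c: (not c.urgent, -c.likelihood * c.severity))


# Hypothetical estimates for a spalled bearing.
causes = [
    Cause("excessive loading", 0.40, 2),
    Cause("debris on bearing surfaces", 0.25, 2),
    Cause("misalignment", 0.25, 3),
    Cause("inadequate heat treatment", 0.10, 5, urgent=True),
]

for rank, cause in enumerate(prioritize(causes), start=1):
    print(rank, cause.name)
```

With these made-up numbers, inadequate heat treatment goes to the top of the list despite being the least likely cause, because confirming it would demand immediate action. The remaining causes follow in order of likelihood times severity.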
Finally, I start testing the different possibilities. Like doctors, engineers need to consider the limitations of different tests; some tests can produce false negatives, while others can produce false positives. In engineering, it's important to consider how closely test conditions match real-world conditions. Unlike doctors, engineers can perform destructive tests, and usually the goal of testing is to recreate the failure. A doctor who went about testing that way would almost certainly be sued for malpractice, and probably be thrown in jail!
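The point about false positives and false negatives can be made concrete with Bayes' rule, which is one common way to reason about imperfect tests in medicine. The sketch below is an illustration under stated assumptions rather than a description of how I actually score test results: the function name update_belief and all of the numbers are hypothetical. Here, "sensitivity" is the chance the test flags a cause that is really present, and "specificity" is the chance it clears a cause that is really absent.

```python
def update_belief(prior, sensitivity, specificity, positive):
    """Return the probability that a cause is real after one test result.

    prior:        belief in the cause before testing (0-1)
    sensitivity:  P(test positive | cause present); 1 - sensitivity is the
                  false-negative rate
    specificity:  P(test negative | cause absent); 1 - specificity is the
                  false-positive rate
    positive:     True if the test came back positive
    """
    if positive:
        p_result_if_present = sensitivity
        p_result_if_absent = 1.0 - specificity
    else:
        p_result_if_present = 1.0 - sensitivity
        p_result_if_absent = specificity

    numerator = p_result_if_present * prior
    denominator = numerator + p_result_if_absent * (1.0 - prior)
    return numerator / denominator


# Hypothetical example: a cause initially judged 10% likely, checked with a
# test that has 90% sensitivity and 80% specificity.
after_positive = update_belief(0.10, 0.90, 0.80, positive=True)
after_negative = update_belief(0.10, 0.90, 0.80, positive=False)

print(f"after a positive result: {after_positive:.2f}")
print(f"after a negative result: {after_negative:.3f}")
```

With these assumed numbers, a positive result raises the estimate from 10 percent to about 33 percent, which is well short of proof, while a negative result lowers it to a little over 1 percent. A single test result changes the ranking of the list without settling the question on its own, which is why results sometimes call for reprioritizing rather than stopping.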
The process of differential diagnosis, as practiced by physicians, provides a framework for thinking about problem solving in many contexts, of which engineering is just one. This conversation with my son helped me to re-conceptualize the problem-solving process that I go through on a daily basis.