Using the recent Costa Concordia disaster, framed against the lessons learned from the infamous Titanic disaster, is a perfect "teachable moment" for proving out the importance of failure analysis as part of upfront design. I would hope the takeaway from Professor Petroski's thoughtful post is that failure analysis needs to be a proactive part of the principal design process, not simply an after-the-fact exercise that comes on the heels of any kind of related disaster or product failure. On the upside, I would think the flurry of more accessible CAE and simulation tools can greatly aid engineers in this very important exercise.
This quickly turns into a discussion of ethics. "Proactive Failure Analysis" should have predicted the Tylenol poisonings which created the safety sealing industry. A broken red lens on a train signal should have predicted the folly of using "white means go" for traffic signals. O-Ring elasticity measurements should have prevented the loss of the Challenger, and research into the toxicity of poly-alcohols (sugars) should have pointed to an epidemic of diabetes and heart disease.
We should always be mindful of risk analysis, but unless we are willing to live in the society described in Minority Report, perhaps the best thing we can do is learn the truth of our mistakes quickly and incorporate those lessons as we continue to evolve our technology...
@williamlweaver- The O-rings on the Challenger were fully tested and the temperature range was known. Concerns and objections to launching in conditions outside of spec were ignored or overridden. But, Titanic and Challenger have many parallels in both the technical and human aspects.
Rivets recovered from the Titanic have a higher than normal carbon content. I don't recall if this was by mistake, or to save money. I'm an EE, but I believe that the high carbon would make the rivets brittle in cold temperatures.
@kenish: You can find a fairly detailed metallurgical report on the sinking of the Titanic here. If you're not a metallurgist, you may want to skip directly to the conclusions.
You're right that the steel was brittle at cold temperatures; in fact, it would have even been brittle at room temperature. The ductile-to-brittle transition temperature was between 100 and 140°F, compared to about 10°F for comparable modern steels.
The issue was not the carbon content, so much as high levels of sulfur combined with low levels of manganese. However, the author of the report concludes that nobody at the time would have known this was a problem.
The ductile-to-brittle transition in steels was not well understood until after World War II, when a large number of U.S. merchant marine vessels sank as a result of low temperature brittleness.
The Titanic disaster is a result of too many flaws stacked up to survive. Eliminate any one of the flaws, and the result would be vastly different. Had the bulkheads been full-height, the progressive flooding would not have occurred. Had there been enough lifeboats, the loss of life would have been minimal. Had the captain chosen to slow down in poor visibility conditions, the collision could have been avoided or been inconsequential.
Professor Petroski's point that lessons are learned from the failures is accurate: the SS Andrea Doria / MS Stockholm disaster proves it. The Andrea Doria was unable to launch half her lifeboats due to severe listing, but the ship's better design permitted it to stay afloat for 11 hours, allowing other ships to arrive in time to rescue all aboard. Only the design helped; having enough lifeboats wasn't enough to prevent loss of life.
There were lessons not learned though: the collision still occurred in low visibility conditions.
Even TODAY, ship collisions occur in fog, despite shipboard radar and GPS technology:
A big part of my work involves analyzing failures after they happen. (Fortunately, most of the time they're failures which occur during in-house testing, not in the field). The rest of my work involves trying to apply these lessons to prevent failures from happening in the first place.
It's not only important to learn from failures -- it's important to learn the correct lessons from failures. Whenever you are trying something new, if anything goes wrong, the knee-jerk reaction is to blame whatever is new in the design. For example, if you are trying out a plastic bearing and you experience a failure, many people will conclude that plastic bearings are simply no good. Then, for the next decade, if anyone suggests using a plastic bearing, the response will be, "We tried that before and it didn't work."
There's also a tendency to throw the kitchen sink at a problem. Often, when a part fails, different engineers may suggest three or four possible changes to the design or manufacturing of the part which might (or might not) fix the problem. There may be pressure from management to "just get it fixed," which might mean implementing all of the changes at the same time and hoping one of them works.
Sometimes, this results in the problem going away, and the engineers look like heroes to management for solving the problem in such a timely manner. But really, nothing has been learned. Worse, in the tribal knowledge of the engineering organization, one of the changes which actually had no effect may undeservedly get the credit for fixing the problem. Once this becomes part of the conventional wisdom, it may prove very difficult to dispel.
Finally, I'd like to point out that wishful thinking is not a successful design strategy. Occasionally, when a failure occurs, people try to convince themselves that the conditions under which it occurred were so unique that it will never repeat itself. This is especially true of failures which occur in testing, where there's a tendency to attribute any difficult-to-explain failures to the test method and assume they won't happen in the real world. If that's really the case, then you should try to come up with a more representative test method. But in any case, if you're arguing that a failure should be ignored, you'd better make damn sure you're right.
Indeed! Many years ago, when we were first trying out fiber optic networks on our plant control system, our technicians noticed that many nodes would go offline at random. Standard practice of swapping parts and reterminating connectors didn't seem to help.
So I went out to the site armed with a function generator for sending pulses of variable duty cycles, and an oscilloscope. I knew from having looked inside that the fiber-optic converters had no processors of any sort inside. The plant was filled with all sorts of people. There were construction projects going on, system demand was high, and stress levels were going through the roof.
I found a quiet corner, grabbed one of the units, and tested it. Sure enough, the pulses from a loop-back cable were distorted to the edge of what should have been readable. I then grabbed another unit from working stock, with a similar serial number and from the same production run, and tried it out. The pulses were clean.
I looked inside the units and discovered that some units had a DC blocking capacitor in series with the data connection with one tenth the value of the working units. The problem units had serial numbers indicating a production run at the same time as the good units. My best guess is that someone had probably mixed these values in a parts bin.
I explained this to my boss and he promptly declared that we were going to use a different brand. So instead of spending a few hours to correct the problem boards, we ended up spending many more weeks with another product that also gave us significant headaches of a different sort. From that point on, we were very reluctant to use fiber optic cable systems, even though we knew that in theory they should have worked very well for us.
That was a case of not learning the right lessons from failure.
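For what it's worth, the effect of the undersized blocking capacitor described above can be sketched with a first-order model. This is only an illustration under assumptions that are not from the original post: the series capacitor is treated as driving a purely resistive load (a simple RC high-pass), and the resistance, capacitance, and pulse width below are made-up values. Under that model the flat top of a pulse sags by 1 - exp(-t/RC), so cutting C to one tenth shrinks the time constant tenfold and makes the sag far worse.

```python
import math


def droop_fraction(pulse_s: float, r_ohm: float, c_farad: float) -> float:
    """Fraction of pulse amplitude lost by the end of a flat pulse.

    Assumes a first-order RC high-pass (series capacitor, resistive load),
    which is a simplification of whatever the real converter circuit was.
    """
    tau = r_ohm * c_farad  # time constant in seconds
    return 1.0 - math.exp(-pulse_s / tau)


# Hypothetical values chosen only for illustration.
R_LOAD = 1_000.0        # ohms (assumed)
C_GOOD = 1.0e-6         # farads (assumed value in a good unit)
C_BAD = C_GOOD / 10.0   # one tenth the value, as found in the problem units
PULSE = 100e-6          # seconds (assumed longest pulse on the link)

good = droop_fraction(PULSE, R_LOAD, C_GOOD)
bad = droop_fraction(PULSE, R_LOAD, C_BAD)
print(f"good unit droop: {good:.1%}, problem unit droop: {bad:.1%}")
```

With these assumed numbers the good unit loses roughly 10% of the pulse amplitude and the problem unit roughly 63%, the sort of difference that could move a signal from clean to marginal. The real figures would depend on the actual circuit.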
Failure mode and effects analysis (FMEA) is done early in the design phase, along with root cause analysis. In the case of the grounded ship, I'm not sure why the captain decided to sail closer to the shore; however, the auto navigation should have indicated the safe water depth.
Excellent article, Prof Petroski. I read an article recently about the Titanic's maiden voyage. Apparently the crew was warned there were icebergs in the area by a ship that had just moved through the area the Titanic was heading toward. The crew of the Titanic even acknowledged the warning.
"Expect the best, but prepare for the worst" certainly seems to apply in this situation. While we can not prepare for scenarios yet unimagined, it certainly makes sense to prepare for known risks. However, the human element will always be an uncontrollable variable. I designed a test set that required a cylinder to come down with some force over the test bed. In order to prevent the operator from inadvertently getting a finger smashed, I used two thumb switches and programmed it so that both switches must be blocked in order for the test to run and the cylinder to actuate. One of the engineers informed me after their visit to the plant it was installed at, that the workers had simply stuck a glove to block the thumb switch sensor on one side - totally negating my built in safety design. I guess I should have anticipated the human element, but one can only go so far in anticipating the actions of those who will be operating the equipment...
Interesting about the operators adding the glove to fool your interlock. Most of the companies purchasing systems from us mandate "anti-tiedown/anti-repeat" functionality for the two-handed safety system. That means that when the first button is operated, there is only a short window of time during which the second button can complete the initiating circuit. Also, after the initiation function is delivered, both buttons must be released before another trigger can be generated. So jamming one button down would inhibit all machine operation.
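The anti-tiedown/anti-repeat behavior described above could be sketched roughly as follows. This is a minimal illustration, not code from any real safety controller: the class name, the `update` method, and the 0.5 second window are all assumptions, and a production two-hand control would be implemented in safety-rated hardware or a safety PLC rather than in a script.

```python
from typing import Optional


class TwoHandControl:
    """Hypothetical anti-tiedown / anti-repeat logic for a two-button start."""

    def __init__(self, window_s: float = 0.5) -> None:
        self.window_s = window_s                     # assumed max gap between presses
        self._first_press_t: Optional[float] = None  # time the first button went down
        self._armed = True                           # True after both buttons were released

    def update(self, left: bool, right: bool, now: float) -> bool:
        """Feed current button states; return True only on a valid trigger."""
        if not left and not right:
            # Both released: re-arm and forget any earlier press (anti-repeat reset).
            self._armed = True
            self._first_press_t = None
            return False

        if left != right:
            # Exactly one button down: start the timing window if still armed.
            if self._armed and self._first_press_t is None:
                self._first_press_t = now
            return False

        # Both buttons down.
        if not self._armed:
            return False  # already fired or timed out; both must be released first
        gap = 0.0 if self._first_press_t is None else now - self._first_press_t
        self._armed = False  # one attempt per full release, whether or not it fires
        return gap <= self.window_s  # anti-tiedown: second press must be prompt
```

With this logic, a glove holding one button down starts the window once and lets it expire, so every later press of the other button is rejected until both buttons are released.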
William - I certainly agree with your company's measures. At the time I was working in a test engineering department where we built test sets only for internal customers to test our product lines. Actually, the thumb switches were an innovation that I thought of when trying to brainstorm how to keep our operators safe. As a first rattle out of the box it wasn't a bad idea, but as you mentioned there are additional programming measures that can ensure a more foolproof operation. I guess I was naive as to the extent an operator would go to defeat my "safety feature" at the time...
@ChasChas: Engineering judgement means knowing what to worry about and what not to worry about. Sometimes, engineers spend an inordinate amount of time and energy worrying about a potential failure mode which is absurdly improbable, while ignoring other failure modes which are much more likely.
Unfortunately, I don't know of any reliable method to judge which failure modes are most likely, other than experience. When in doubt, it's probably better to investigate, rather than brushing off anything lightly.
Although it has been mentioned, FMEA (failure mode and effects analysis), if done adequately, is a very good risk reduction tool. But to be effective, it does need to consider every part of the system that could fail in any way, as well as all of the potential modes of failure. So a rigorous FMEA is a big deal, not a small exercise. For the Titanic, evidently there was no anticipation of the possibility of a hull breach in the area of the dividing bulkhead, which led directly to the sinking. If the same mechanism had been used to divide the ship into three or four segments, it would most likely have saved the ship, or at least bought a lot more time.
On the other side, there is almost no way to prevent a large enough human error from causing some kind of failure. I like the saying about the difference between wisdom and stupidity: "There are limits to wisdom, but there are no limits to stupidity." I am certain that it is absolutely correct.
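To give a feel for the bookkeeping a rigorous FMEA involves, here is a small sketch of the traditional worksheet approach, in which each failure mode is scored for severity, occurrence, and detection on a 1 to 10 scale and ranked by their product (the risk priority number, or RPN). The class and function names are hypothetical, and the entries and scores are invented purely for illustration; they are not taken from any actual analysis of the Titanic or any other system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    component: str
    mode: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (remote) to 10 (almost certain)
    detection: int   # 1 (almost always caught) to 10 (practically undetectable)

    @property
    def rpn(self) -> int:
        """Traditional risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rank(modes: List[FailureMode]) -> List[FailureMode]:
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Invented example rows; every score here is a placeholder, not real data.
worksheet = [
    FailureMode("hull plating", "breach spanning several compartments", 10, 2, 8),
    FailureMode("bulkhead", "overtopped by progressive flooding", 10, 3, 7),
    FailureMode("lifeboats", "too few for all aboard", 9, 4, 2),
]

for fm in rank(worksheet):
    print(f"{fm.rpn:4d}  {fm.component}: {fm.mode}")
```

Even this toy worksheet shows why the exercise grows quickly: every component has to be paired with every plausible way it can fail, and each pairing needs three defensible scores.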
Your Titanic discussion is relevant, as I believe the threat that was thought to have been designed for was a bow collision rather than scraping along the hull, and even then only up to the first four bulkheads. It seems from the shows that I've seen that the engineers' data indicated that most collisions were rammings, which required a strong box and compartmentalized bulkheads to control leaks resulting from bow collisions. I would guess that shipbuilder and salvage experience supported this view, but this wasn't discussed to any great extent.
Therefore, the question would be whether the designers' data indicated that compartmentalizing the first four bulkheads provided the required protection, because water filling the fifth would result in sinking regardless due to forces that could not be dealt with, or whether they just didn't consider compartmentalizing past the fourth.
My point is more about what the thought process was behind the design and what testing was possible at the time of design. The designers of the Titanic had the history of shipping and sinkings to consider and probably little in the way of actual testing. This contrasts with Ford and the Pinto, in which Ford engineers uncovered the gas tank rupture problem in pre-production crash testing and management chose to ignore their concerns and proposed design changes for profit motives. Of course, White Star Line management chose to reduce the number of lifeboats for cosmetic reasons.
Interesting comment about the Ford Pinto gas tanks. I can tell you for certain that Ford now takes fuel tank integrity very seriously. I have watched a lot of their crash testing and the very first thing checked after a crash is fuel tank integrity. Also, even in a higher speed crash, 45 MPH or so, their fuel tanks don't even leak, let alone rupture. So it appears that they did learn a lot from their experience.
As a former Ford Pinto owner, it's good to know Ford is putting that emphasis on gas tank integrity. I was in a rear-end collision in my Pinto. I was fortunate the gas tank was OK. Ford seems to have had a pretty good safety run, that is, until its Firestone problem.
Anticipating failures and preventing them is obviously one of the design goals of the conscientious engineer. This is not always possible, however. When exploring unknown territory, one must assume that not all failure modes can be, or have been, anticipated. An engineer I worked with about 40 years ago coined the phrase "fail graceful" as an alternative to "fail safe." In other words, regardless of whatever causes a failure, try to design the system so that the results of the failure will be as benign as possible.
The FMEA is not only about preventing failures; it is primarily about making the system avoid a disaster when something fails. It goes right along with Fudd's Third Law of Opposition: "If you push anything hard enough, it will fall down." Components will fail; the goal is to avoid injury and minimize damage.
Successful change comes, not from emulating success and trying to better it, but from learning from and anticipating failure, whether actually experienced or hypothetically imagined. This should be the underlying aspect of every design. Petroski as usual at his best ..!
Vimal, your post hits the nail on the head. This is an interesting story. It also reminds me of the slogan of the equities industry, "...past performance is not a guarantee of future performance...", or something like that.
Excellent post Vimal. You are right on the money. In my previous industry, we were doing FMEA (failure modes and effects analysis) on major projects for about the last 10 years. Some of this was required by our customers. Unfortunately, industry in general does not seem to be that specific.
Great point, Jack. The company I retired from (GE) required FMEA work on all programs, both new ones and products getting nothing but a "face-lift". The thought being: "if you touch it, make it better". We engineers screamed and moaned due to the time it generally took to perform a good FMEA, but in the end, we produced a better product for the effort. I notice that there are times when UL (Underwriters Laboratories) requires an FMEA also, which I think is very interesting. Bob Jackson, PE
Well explained! Past performance does not guarantee future success and performance. Had Napoleon been an engineer, he would definitely have proclaimed, "Complacency is not found in the dictionary of the engineer". By anticipating failure, engineering design will repay the faith of the stakeholders.
It's worth noting that a new book, "Creating Innovators: The Making of Young People Who Will Change the World" cites the willingness to take calculated risks and the willingness to learn from failure as two of the keys to developing innovative minds.