# The Adventure of the Uncertain Failures

DN Staff

August 17, 2009

By Prem Sobel, Contributing Writer

Basic physics provides the insight to solve the problem of a 40% failure rate on final test.

During the summer between my bachelor’s and master’s degrees, I was fortunate to get a temporary job at IBM in their reed switch test department in Essex Junction, VT, with the title Associate Engineer. At that time, IBM used reed switches by the tens to hundreds of millions every year.

A reed switch will close if a sufficiently strong magnetic field is applied to it, either by a permanent magnet or by placing it inside a coil with a large enough current passing through it.

This job gave me valuable practical experience and helped pay for my first car. At the end of the summer, my manager had very little left for me to do, so someone (perhaps his boss) suggested that I work on a very expensive problem they were having. The reed switches being manufactured were having a 35 to 40 percent reject rate at final test. This was a very expensive loss, but over the years no one had figured out the cause. The switches were rated to close within a certain range of current in a particular fixed coil. The closing currents were expected to follow a normal distribution centered on the nominal value.

Instead, the mean was always displaced to a much higher current, causing a high reject rate, for unknown reasons.

I asked for and received about 500 “failed” reed switches and a standard coil. The first thing I did, knowing the expected range of current to close a switch, was to go into the laboratory and measure the actual current required. To my surprise, half of these so-called failed switches were good and half were bad. I then measured these two groups again to be sure, and found that of the switches that had just tested good, half were now bad and half were still good. Similarly, half of the bad ones were now good and half were still bad in the second test.

I asked myself what in the physics of the situation could possibly explain this 50-50 split each time, and had my answer: When a reed switch is picked up to be tested, there is a 50-50 chance that the switch will be physically oriented the same way or the opposite way as the previous time it was measured. The next question was “How does this explain the excessive failures?”
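The 50-50 reasoning can be illustrated with a toy simulation. This is only a sketch and not part of the original investigation: the function names, the switch count, and the rule that a switch fails exactly when its orientation is reversed from the previous test are all simplifying assumptions made for illustration.

```python
import random

def simulate_retests(n_switches, n_rounds, seed=0):
    """Return one list of pass/fail booleans per test round (toy model)."""
    rng = random.Random(seed)
    # Direction (+1 or -1) of the residual magnetization left by the last test.
    residual = [rng.choice((1, -1)) for _ in range(n_switches)]
    rounds = []
    for _ in range(n_rounds):
        outcome = []
        for i in range(n_switches):
            # The switch is picked up either way round with equal probability.
            orientation = rng.choice((1, -1))
            # Assumption: a reversed switch always fails, an unreversed one passes.
            outcome.append(orientation == residual[i])
            # This test leaves the reeds magnetized in the new orientation.
            residual[i] = orientation
        rounds.append(outcome)
    return rounds

def pass_fraction(flags):
    """Fraction of True entries in a list of pass/fail booleans."""
    return sum(flags) / len(flags)

if __name__ == "__main__":
    first, second = simulate_retests(500, 2)
    good_first = [s for f, s in zip(first, second) if f]
    bad_first = [s for f, s in zip(first, second) if not f]
    print(f"passed first test:             {pass_fraction(first):.2f}")
    print(f"good switches passing retest:  {pass_fraction(good_first):.2f}")
    print(f"bad switches passing retest:   {pass_fraction(bad_first):.2f}")
```

Because each pickup is an independent coin flip in this model, every group splits roughly in half on every retest, regardless of how it tested before.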

The answer is: magnetic hysteresis. When a switch was tested after being reversed from its previous orientation, the magnetic hysteresis, i.e. the residual magnetic field left in the metallic reeds, had to be overcome by a larger current before the switch would close. In a real physical situation this could never happen, because an installed reed switch does not change its physical orientation.

The solution I proposed, which worked very well, was the following: During test, to eliminate the effect of leftover magnetic hysteresis on the measurement, first pass a current pulse of, say, twice the rated current through the coil (to create the accumulated hysteresis the switch would have in an actual system), and only then measure the current required to close the switch. The failure rate plummeted to near zero, with a savings of more than $1,000,000 per year.
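The effect of the conditioning pulse can be sketched with a second toy model. Every number here is hypothetical (the article gives no rated current, tolerance band, or hysteresis shift), and the assumption that the pulse fully re-magnetizes the reeds in the test orientation is an idealization, so the resulting reject rates are illustrative only.

```python
import random

# All values are hypothetical, in arbitrary units.
RATED = 1.0              # nominal closing current
LOW, HIGH = 0.85, 1.15   # accept band at final test
SIGMA = 0.05             # spread of the intrinsic closing current
HYST_SHIFT = 0.25        # extra current needed when residual field opposes the coil

def measured_current(intrinsic, residual, orientation):
    """Closing current seen by the tester for one switch."""
    opposed = residual != orientation
    return intrinsic + (HYST_SHIFT if opposed else 0.0)

def reject_rate(n_switches, pre_pulse, seed=0):
    """Fraction of switches rejected, with or without the conditioning pulse."""
    rng = random.Random(seed)
    rejects = 0
    for _ in range(n_switches):
        intrinsic = rng.gauss(RATED, SIGMA)
        residual = rng.choice((1, -1))      # left over from earlier handling
        orientation = rng.choice((1, -1))   # how the switch sits in the test coil
        if pre_pulse:
            # Assumption: the strong pulse aligns the residual field with the coil.
            residual = orientation
        current = measured_current(intrinsic, residual, orientation)
        if not (LOW <= current <= HIGH):
            rejects += 1
    return rejects / n_switches

if __name__ == "__main__":
    print(f"reject rate without pulse: {reject_rate(20000, pre_pulse=False):.3f}")
    print(f"reject rate with pulse:    {reject_rate(20000, pre_pulse=True):.3f}")
```

In this model the reversed switches shift the measured distribution toward higher current and most of them fall outside the accept band, while the pre-pulse removes the orientation dependence and leaves only the intrinsic spread.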

Contributing Writer Prem Sobel’s first job was in the flight computer group at NASA/JPL (he has a program in orbit around Mars). After helping found Vitesse, he began to switch his emphasis to software, having realized that the challenge for parallel computing was in the software, not the hardware; see http://www.linkedin.com/in/premsobel. You can reach him via our Sherlock Ohms blog comments at www.designnews.com/Sherlock.