When I started my career in electronics, one of my teachers emphasized that your senses can be just as valuable for troubleshooting as the instruments you use. Later, during my many years working as a test engineer at a major ATE supplier, I found this lesson to be true. The following case is a perfect example.
I was based in a field office that was also home to one of my company’s semiconductor testers. This tester had multiple uses -- application development, sales demos, customer training -- and, as such, was configured for multiple roles with many of its card cages completely filled. Three of us field engineering types were charged with supporting this tester when we were not out trotting the globe fixing customer-owned units.
A bit of background is needed to understand the service issue. To achieve the speed and timing accuracy necessary to test the latest types of semiconductor devices, these testers used a great deal of ECL (emitter-coupled logic). Large banks of cinderblock-sized switching power supplies were needed to provide the various voltages that the logic required. Although these voltages were only in the range of -5.2 V to +5 V, currents could exceed 200 amps. The power supplies were connected to the card cages using heavy-gauge cables, and power was distributed to the card cage slots using insulated bus bars that ran across the backplane.
When the tester was not in use, we would run a diagnostic program loop that sequentially exercised each function in the tester and logged any problems. At one point, a problem began to develop in one of the card cages. It would generally start with one board and, eventually, the entire subsystem would fail. The problem was highly intermittent. In one instance, we removed and reseated the board that had been flagged as having an issue; the problem appeared to go away for a week or two, and then returned.
One of us decided to replace the suspect board and, again, the problem appeared to go away for a few more weeks. During this entire process, each of us had repeatedly and completely vetted the power supplies for proper voltage and lack of excessive noise. Each time the problem returned, we would check the power supplies again; everything appeared to be fine.
One day, I happened to be the lucky person working on this problem and I did something I normally would not do when working on a high-current power component -- I touched the power block on the card cage and found it to be quite warm. This ended up being the key to the problem.
When I poked through the bus bar insulator and took a reading, I found a voltage drop large enough to disrupt proper ECL operation. As it turned out, the power block where the power cable was connected and the bus bar were not a single piece of metal, as one might expect; they were screwed together. In this case, the mechanical connection was not tight enough, producing a small resistance that, multiplied by the 200 amps, created the voltage drop seen by the ECL circuitry. That same resistance also dissipated power as heat, which is why the block felt warm. This explained what had happened before.
When we reseated or changed boards, we slightly flexed the backplane, restoring the proper connection for a week or so. As the connection deteriorated, the ECL devices would begin to fail, with some being more sensitive than others. The final fix required a major disassembly of the subsystem, but it was a relief to find a solution. After that incident, whenever I worked on that type of tester, I took a minute to touch all of the power blocks to verify that none were hot. I guess that teacher was right.
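For readers who want to put rough numbers on the loose joint described above, here is a minimal sketch of the arithmetic. Only the 200-amp current comes from the story; the contact resistances are assumed values chosen for illustration, not measurements from the tester, and the function name is hypothetical.

```python
# Illustrative numbers only: the contact resistances below are assumptions,
# not measurements from the tester in the story.
SUPPLY_CURRENT_A = 200.0  # supply current cited in the article


def joint_drop_and_heat(resistance_ohms, current_a=SUPPLY_CURRENT_A):
    """Return (voltage drop in volts, dissipated power in watts) for a joint."""
    voltage_drop = current_a * resistance_ohms    # Ohm's law: V = I * R
    power = current_a ** 2 * resistance_ohms      # Joule heating: P = I^2 * R
    return voltage_drop, power


if __name__ == "__main__":
    # Hypothetical contact resistances, in milliohms.
    for milliohms in (0.1, 0.5, 1.0):
        drop, heat = joint_drop_and_heat(milliohms / 1000.0)
        print(f"{milliohms:.1f} mOhm -> {drop * 1000:.0f} mV drop, {heat:.0f} W of heat")
```

Under these assumptions, a joint of just one milliohm would drop about 200 mV and dissipate about 40 W. Since ECL logic levels are separated by well under a volt, a shift of that size is plausibly enough to cause errors, and tens of watts in a small metal block is plausibly enough to feel with a fingertip.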
This entry was submitted by David Laing and edited by Rob Spiegel.
David Laing is a 30-year veteran of the semiconductor test industry. He is currently a senior analyst and program manager in VDC Research Group's M2M Embedded Hardware practice.
Tell us about your experience solving a knotty engineering problem. Send stories to Rob Spiegel for Sherlock Ohms.