An ASIC test board is failing, and with over 4,000 mechanical contacts on it, fingers are pointing strongly at the hardware team.
By Charles Glorioso, Contributing Writer
I was working in the hardware support group at a company that was developing an image processing ASIC for use in consumer digital cameras. Most of the staff was working on the ASIC logic, while only a few of us were designing support hardware for things like test fixtures. One of the important fixtures was a test platform on which to verify the ASIC equations before the design was “taped out” and released to the foundry.
The test platform, which was designed before I joined the team, carried four enormous (and expensive) FPGAs. The idea was that these four components would provide enough gates to emulate the final ASIC.
The FPGAs were packaged as BGA parts, 35mm on a side with over 1,000 pins each. The board designer had decided that these parts were too valuable to risk soldering to a not-yet-debugged PCBA, so the design included sockets for each of the four components. Each socket consisted of over 1,000 “pogo pins” at the BGA pitch, and a clamp assembly that was supposed to hold the part down against the pogo pins.
In order for this test board to work, about 4,000 mechanical contacts had to be made correctly and reliably. No opens and no shorts between adjacent pins.
The first PCBA was assembled, the first four FPGAs were installed, the first ASIC code was loaded, and surprise! It didn’t work. The ASIC team blamed the failure on the hardware team, and the hardware team asked the ASIC team how they could be sure their code wasn’t defective. Sound familiar?
Eventually, the hardware team accepted the possibility that the problem was in hardware and removed the FPGAs, cleaned everything and reassembled the set of four.
Now the code still didn’t work, but with different failures. It appeared that no matter how carefully the FPGAs were installed, there were at least a few shorts and/or opens. After all, we are talking about 4,000 contacts. The ASIC team volunteered that they could code around a few failed contacts, if they knew where they were.
Here was the mystery: How do you quickly determine which of 4,000 pins are open, which are shorted, and which are making reliable contact? Fortunately, since this was a test platform, every pin on each FPGA was brought to a land on the PCB, and it was possible to electrically touch every pin.
The first approach to detecting opens and shorts was to put code into the FPGA that put out a square wave on every pin, with different frequencies on adjacent pins.
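One way to guarantee that no two adjacent pins share a frequency is to tile the ball grid with a repeating 2x2 pattern of four tones. The sketch below is only an illustration of that idea: the four frequencies, the 34x34 grid size, and the function names are all assumptions, since the article does not say how the actual FPGA code assigned its frequencies.

```python
# Hypothetical tone set in Hz; the frequencies actually used are not stated.
TONES_HZ = (900_000, 1_000_000, 1_100_000, 1_200_000)


def tone_for_pin(row: int, col: int) -> int:
    """Pick a tone from a repeating 2x2 tile based on row and column parity."""
    return TONES_HZ[(row % 2) * 2 + (col % 2)]


def neighbors(row: int, col: int, size: int):
    """Yield the up-to-eight pins that touch (row, col), diagonals included."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            if (dr, dc) != (0, 0) and 0 <= r < size and 0 <= c < size:
                yield r, c


def adjacent_pins_differ(size: int) -> bool:
    """Check that every pin's tone differs from the tones of all its neighbors."""
    return all(
        tone_for_pin(r, c) != tone_for_pin(nr, nc)
        for r in range(size)
        for c in range(size)
        for nr, nc in neighbors(r, c, size)
    )


# Assumed 34x34 array (1,156 balls), chosen only to match "over 1,000 pins".
print(adjacent_pins_differ(34))
```

Because any neighbor differs from a pin in row parity, column parity, or both, four tones are enough to keep a short between touching pins audible as a mix of two frequencies.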
A technician then used a scope with a shunt resistor across the probe to touch every pin and look at the signal. A full-height, clean square wave indicated a good contact on that pin, a weak or absent signal indicated an open pin, and a signal showing more than one frequency indicated a short between pins. The technician would place the probe on a land, look up at the scope, decide the state of the pin, and then move on to the next pin. This was a reliable test, except that it took about 5 hours to touch 4,000 pins while looking back and forth between the board and the scope.
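The technician's decision at each pin amounts to a three-way rule, which can be written down as a small function. This is a sketch only: the inputs (amplitude as a fraction of full height, and the number of distinct frequencies seen) and the 0.8 threshold are assumptions made for illustration, not values from the original test.

```python
def classify_pin(amplitude_fraction: float, tone_count: int,
                 full_height: float = 0.8) -> str:
    """Classify one contact from what the scope shows.

    amplitude_fraction: observed amplitude relative to a full-height wave.
    tone_count: number of distinct frequencies visible in the signal.
    full_height: assumed cutoff below which the signal counts as weak.
    """
    if tone_count > 1:
        return "short"   # more than one frequency: bridged to a neighbor
    if tone_count == 0 or amplitude_fraction < full_height:
        return "open"    # weak or missing signal: contact not made
    return "good"        # single clean, full-height square wave


print(classify_pin(1.0, 1))   # good
print(classify_pin(0.2, 1))   # open
print(classify_pin(0.9, 2))   # short
```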
When testing the board, the technician would often find a cluster of opens in one area of the FPGA. He would adjust the clamping pressures around the periphery of the chip and measure again. Days were spent this way with the opens and shorts being pushed around the FPGA.
A faster scheme was needed.
Being an old engineer, I remembered the point in my career at which I realized that the 1MHz clock I was using to activate my circuits was actually in the middle of the AM broadcast band.
I brought in a transistor radio and a pair of headphones. I set the radio on the scope probe cable, and with the radio adjusted between stations, I could clearly hear a loud tone when I touched a good pin, a much weaker tone from an open pin, and a complex squeal from a shorted pin.
Now, the technician could test a full test platform in under an hour. All he needed to do was listen as he dragged the scope probe across the lands on the PCB. Tones rang out in his ears, clearly indicating the state of the contact without his having to look at the scope. Occasionally he would hear an ambiguous tone and would have to look at the scope, but 99% of the time his ears did the job.
In half a day or less, the technician could adjust the clamps to maximize the good contacts, and then create a table of the 10-15 bad pins, and the ASIC team could use that test platform to verify their code.
Eventually, enough confidence was gained in the test platforms themselves that some of the sets of FPGAs and PCBs were sent out to an assembly house, which applied fresh solder balls to the FPGAs and successfully reflow-soldered them to the boards.
Then, we used the radio trick to verify that they were successful in getting all (or nearly all) good connections.
The ASIC design was completed, the parts were made and the company made several design wins in digital cameras.
Contributing Writer Charles Glorioso has a BSEE from Purdue and an MSEE from Illinois Institute of Technology. He has over 40 years of experience in electronics design and management for industrial and consumer products. Employers have included Teletype Corporation, Cadence, The Exploratorium, and at least six companies that no longer exist. Charles retired earlier this year after six years as Director of Engineering at Davis Instruments, and is now working there part time on special projects.