The Adventure of the Buggy Debugger

DN Staff

August 26, 2009


A noise issue is just the beginning of problems for a young engineer tasked with debugging a telephone switching system with a zillion relays.

By Bob Colwell, Contributing Writer

In the late 1970s I was fresh out of school, working at Bell Labs in their microprocessor design group. Back then, we debugged microprocessor-controlled equipment by using an in-circuit emulator (ICE), essentially a big, expensive box with a 6-ft cable that plugged into the microprocessor socket in place of the CPU chip. The New Jersey group I was working for had designed both the CPU and the in-circuit emulator, and a group in Chicago was trying to use our chip to control a Number One ESS (1ESS), an enormous, building-sized telephone switching system composed of ten zillion relays.

The complaint was that our ICE was randomly resetting and causing their debug effort no end of grief. NOTE: When you see the word “reset,” don’t picture the kind of silent reset a chip performs. When ten zillion relays all clack at once, it sounds like the world is coming to an end. Or, possibly, the dead are waking up. Come to think of it, I was entering a horror show of my own at this point.

I went to Chicago, and sure enough, the ICE was resetting. But why? After studying the problem for a few days, I realized that the resets happened more often at night than during the day. One fortuitous evening, I happened to be looking at the monitor for the ICE when it reset. At the same moment, the room lights switched off, and out of the corner of my eye I could see a shadowy figure disappearing through the open door. I restarted the system and turned the room lights back on. The ICE thunderously reset again.

It turned out that the power supply we were using had very little noise immunity to the hash imposed on the AC lines when fluorescent lights were turned on or off, and the supply passed the noise right on through to the logic. We gleefully drop-kicked the supply into the garbage can and replaced it with a better supply. Problem solved…well, sort of.

The random resets were gone, but perhaps that wasn’t surprising as the 1ESS was now not working at all. My scope was showing that many of the signals at the ICE socket looked terrible, with an impressive showing of noise and spikes everywhere. I looked at the ICE end of the cable, and those signals didn’t look very good either. With the power supply issue fresh in my mind, I put the scope probe on the +5V backplane and got a real surprise — it was drooping by over a volt from one end of the backplane to the other! Worse yet, the ground was not at 0V at the far end of the backplane; ground was about 1V higher at the far end than at the supply connection.

It turned out that there was no real plane inside the backplane: +5V and ground were just fat traces on the surface, and not nearly fat enough to prevent a serious IR drop. And this was just the DC component; heaven knew what inductive spikes were being produced across both +5V and ground, and I shuddered at the thought.
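To get a feel for how a surface trace can lose a volt, here is a rough back-of-the-envelope sketch in Python. Every number in it (trace length, width, copper thickness, load current) is an illustrative assumption, not a measurement from the 1ESS backplane, and it lumps the entire load at the far end of the trace, which overstates the drop for loads spread along the bus.

```python
# Rough DC estimate of the IR drop along a copper power trace.
# All dimensions and the load current are hypothetical values chosen
# for illustration; they are not taken from the actual backplane.

COPPER_RESISTIVITY_OHM_M = 1.68e-8  # approximate value for copper near 20 C


def trace_resistance(length_m, width_m, thickness_m,
                     resistivity=COPPER_RESISTIVITY_OHM_M):
    """Resistance of a rectangular trace: R = rho * L / (w * t)."""
    return resistivity * length_m / (width_m * thickness_m)


def ir_drop(current_a, resistance_ohm):
    """Ohm's law: voltage lost across the trace, V = I * R."""
    return current_a * resistance_ohm


if __name__ == "__main__":
    # Assumed: 0.4 m long, 5 mm wide, 35 um thick (about 1 oz copper).
    r = trace_resistance(length_m=0.4, width_m=5e-3, thickness_m=35e-6)
    # Assumed: 25 A total load, all drawn at the far end of the trace.
    v = ir_drop(current_a=25.0, resistance_ohm=r)
    print(f"trace resistance: {r * 1000:.1f} milliohms")
    print(f"IR drop at 25 A:  {v:.2f} V")
```

With these assumed values the trace comes out at roughly 38 milliohms, and the drop approaches a volt on that rail alone. The return current through an equally thin ground trace raises the far-end ground by a similar amount, which is the same pattern the scope showed.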

Redesigning the backplane fixed those problems and, now predictably, exposed the next one. The ICE socket signals still didn’t look right. In fact, some of them didn’t even look digital. Some thinking and observation resolved this one: The 6-ft flat cable had been designed with the clock running between two data signals, and the data signals had been interspersed with the address bits. There was so much crosstalk in that cable that the signals might as well have been connected directly together. I reallocated the signals and used a new (expensive) cable with a ground shield around everything.

The ICE now worked, just in time for the customer to announce that they were switching to another vendor’s microprocessor. The Dilbert comic strip will never run out of ideas.

Contributing Writer Bob Colwell was Intel’s chief x86 architect in the 1990s and has worked as a computer designer at VLIW pioneer Multiflow, Perq Systems, and Bell Labs. He is the author of The Pentium Chronicles and wrote the At Random column in Computer magazine from 2002 to 2005. He is currently an independent consultant.
