When I was a bright-eyed youngster, I worked for what was then probably the world's premier desktop computer manufacturer. I had the pleasure of working with great people and solving hardware and software problems. My title was Quality Engineer, and thanks to my understanding and accommodating boss, I was the go-to guy for manufacturable printed circuit board ECO blue wire routing. I was often involved with component-level troubleshooting in manufacturing sustaining engineering, which was part of my QE scope.
Everything seemed to be going well with the introduction of a new computer terminal codenamed Jalapeno, which, like other dumb terminals of the era, displayed 24 lines of text, each 80 characters wide. Manufacturing engineering had done the usual great job setting up for production, and the simple, low-part-count design looked like a relief from our much more complex products. Once in production, however, yield was low due to random, intermittent memory errors.
The Jalapeno was designed around a Z80 processor and National's newly introduced DT8350 CRT controller. A Mostek MK4802 NMOS 2Kbyte static RAM provided firmware scratch pad area and the terminal's ASCII text store. Usually, the Z80 owned the address bus, but when a line of text was needed for display, an interrupt signaled the Z80 to take its data and address lines into high impedance, so the DT8350 could take direct memory access (DMA) control. During DMA, 80 bytes of ASCII text would be transferred from RAM to a National MM5035 80-byte shift register.
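As a rough check on how the single RAM was shared, the figures above can be turned into a memory budget. This is only a sketch: it assumes one byte per character and a single stored page, neither of which is stated here, and the variable names are made up for illustration.

```python
# Display geometry and RAM size as given in the text.
LINES_PER_PAGE = 24
CHARS_PER_LINE = 80
RAM_BYTES = 2 * 1024  # 2-Kbyte MK4802

# Assumption: one byte per character and a single stored page.
text_store_bytes = LINES_PER_PAGE * CHARS_PER_LINE
scratch_pad_bytes = RAM_BYTES - text_store_bytes

print(f"text store:  {text_store_bytes} bytes")
print(f"scratch pad: {scratch_pad_bytes} bytes")
```

Under those assumptions the text store takes 1,920 of the 2,048 bytes, which would leave 128 bytes for firmware scratch pad use.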
DMA interrupts occurred 24 times per display page. This permitted the Z80 to run about 90 percent of the time. Working in the sustaining engineering lab, we observed DMA transitions and found that the address lines floated, their levels wandering, during the transition interval after the CRT controller gave up control but before the Z80 took control. It also became apparent that only the first byte the Z80 accessed after a DMA transfer ended was corrupt.
Thinking that perhaps a bit more transition time was needed for the address bus to stabilize, I made some modifications to stretch the transition time further. I was astonished to find that my changes made the problem much, much worse. Furthermore, replacing the Mostek NMOS RAMs with Hitachi HM6116 CMOS RAMs made the memory errors disappear altogether. It was time to bring in Mostek. I called Mostek and told them everything.
Mostek swiftly came back with the answer. The 4802's NMOS technology was a bit power hungry. The RAM was consequently designed to power down very quickly when it sensed address bus inactivity -- in our case, DMA transition. When the powered-down device saw address bus activity again, it would power up, toggling a very fast internal address latch that, in our implementation, grabbed an invalid, still-floating address bus.
Before the Z80 had driven the address bus to a valid state, the RAM was already locked to the wrong first byte. This 4802 address bus latch behavior was not apparent in the data sheet, so Mostek's speedy analysis and report to us were essential to our understanding of the root cause of the problem. Rather than delaying manufacture and product delivery with board redesign, we replaced the 4802 with the 6116. Yield soared.
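The failure mechanism Mostek described can be illustrated with a toy model. This is a minimal sketch, not the MK4802's actual circuit: the sample addresses, the `idle_limit` threshold, and the function names are all hypothetical, and time is reduced to a list of snapshots of what the RAM's address pins see. One function models a RAM that powers down on an idle bus and latches the first address it sees on waking; the other models a RAM that simply follows the bus, which is assumed here as a stand-in for the 6116's behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusSample:
    driver: str   # "crt", "none" (bus floating), or "z80"
    address: int  # level the RAM's address pins actually see


def latched_address(samples, idle_limit=2):
    """First address used by a RAM that sleeps on an idle bus and
    latches whatever is on its pins at the moment it wakes up."""
    idle = 0
    asleep = False
    prev = None
    for s in samples:
        if prev is not None and s.address == prev:
            idle += 1
            if idle >= idle_limit:
                asleep = True      # no activity: power down
        else:
            if asleep:
                return s.address   # wake on first change and latch it
            idle = 0
        prev = s.address
    return prev


def transparent_address(samples):
    """A RAM that just follows the bus uses the settled address."""
    return samples[-1].address


# Hypothetical handoff: CRT controller finishes, bus floats and wanders,
# then the Z80 drives the address it really wants (0x100).
floating_handoff = [
    BusSample("crt", 0x04F), BusSample("crt", 0x04F),
    BusSample("none", 0x04F), BusSample("none", 0x3A7),
    BusSample("z80", 0x100), BusSample("z80", 0x100),
]

print(hex(latched_address(floating_handoff)))      # garbage from floating bus
print(hex(transparent_address(floating_handoff)))  # address the Z80 intended
```

In this model the latching RAM locks onto the wandering value seen while nothing is driving the bus, while the bus-following RAM ends up at the Z80's intended address, which is consistent with only the first access after DMA being corrupt.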
This experience showed me that, when your own efforts to understand a problem and its environment aren't enough, reaching out for additional expertise without delay can make the difference between good outcomes and slow, costly ones. It also underscores the value of good relationships with good vendors.
Jay B. Swindle does hardware and software consulting in the Dallas/Fort Worth area. He is especially interested in integrating hardware and software design, manufacturing, and manufacturing support involving knowledge and configuration management, quality, project planning, and bids and proposal systems. He studied computer science at Baylor University and is a private pilot.
Tell us your experience in solving a knotty engineering problem. Send stories to Rob Spiegel for Sherlock Ohms.