Recently, one of our production people noticed a slight idiosyncrasy with a mature product that has been shipping with no trouble for years. The design has an embedded controller that is capable of recording and playing back digitized speech messages. Under reproducible conditions, unlikely to occur in the real world, the message could repeat itself several times.
I knew of this issue when writing the embedded code years ago, but I brushed it off as unimportant because it was unlikely to happen. Besides, if it did, the consequences were just not that big a deal. But, in the interests of aesthetics -- and because I was a little bored at the time -- I decided to modify the code to remove the problem. The fix consisted of a simple, small piece of keyboard-scanning code that had to be added. It took 10 minutes to implement.
Unfortunately, my mod was nearly inert; it only partially fixed the problem. The repeating message could still be reproduced under some conditions. When dealing with a relatively complex fix, it is not so disconcerting when it doesn’t go as planned, but this was very simple. It should have worked. I put the hardware on an emulator running on my laptop to try to catch the point in the code where the error occurred. I could see what was happening, but I couldn’t offer an explanation as to why.
So, I threw a logic analyzer into the mix. To catch the error, I had to define pretty complicated trigger conditions. Condition A -- complicated on its own -- had to occur once, then condition B -- also complicated -- had to immediately follow and occur four times simultaneously. Then, the answer fell into my lap. As is often the case for my embedded designs, I doubled up on some of the I/O lines to the (PIC16F505) controller. One line was used to control two different components.
One component needed the idle state of the line to be high, while the other required it to be low. This would not have been a problem if I had managed it well. I didn’t. The control line was left high by one subroutine, but when another subroutine started up, it assumed the line would be low. Because the line was already high, driving it high again produced no transition, so I missed generating an all-important rising edge. The fix was super simple: reset the line to zero at the beginning of the second subroutine.
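To make the failure concrete, here is a minimal sketch in C of a shared line with conflicting idle states. It is a desktop simulation, not the actual firmware: the line is modeled as a variable rather than a PIC port bit, and every name in it (line_write, service_component_a, trigger_component_b_buggy, trigger_component_b_fixed) is hypothetical.

```c
#include <stdbool.h>

/* Simulated shared I/O line (an assumption: real code would write a port bit). */
bool shared_line = false;   /* current level of the line */
int rising_edges = 0;       /* low-to-high transitions seen by the edge-triggered part */

/* Put the simulation into a known state: line low, no edges counted. */
void sim_reset(void) {
    shared_line = false;
    rising_edges = 0;
}

/* Drive the line and count a rising edge only on a real low-to-high change. */
void line_write(bool level) {
    if (!shared_line && level) {
        rising_edges++;
    }
    shared_line = level;
}

/* Component A idles high, so this routine leaves the line high on exit. */
void service_component_a(void) {
    line_write(true);
}

/* Buggy version: assumes the line is already low when it starts. */
void trigger_component_b_buggy(void) {
    line_write(true);    /* no transition if the line was left high */
    line_write(false);   /* return to component B's idle state */
}

/* Fixed version: force the line low first so the edge always happens. */
void trigger_component_b_fixed(void) {
    line_write(false);   /* establish the state this routine depends on */
    line_write(true);    /* guaranteed rising edge */
    line_write(false);   /* return to component B's idle state */
}
```

Calling service_component_a() and then trigger_component_b_buggy() adds no rising edge, while the fixed version adds exactly one regardless of how the previous routine left the line. The general habit is that any routine using a shared line sets the level it depends on rather than inheriting it.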
In the interest of cost, engineers often double up on I/O lines, giving them multiple functions, sometimes involving both input and output. In a perfect world, it would be nice to have one I/O line per function, but that’s a rare luxury. It does feel good to be efficient by saving resources, but it comes with the price of due diligence.
This entry was submitted by Jonathan Eckrich and edited by Rob Spiegel.
Jonathan Eckrich has been president of Adaptivation since 1998. Much of his job experience is with designing industrial (VME-based) computer systems. He holds a Master of Computer Engineering (1985) and a Bachelor of Computer Science (1982) from Iowa State University.
Tell us your experience in solving a knotty engineering problem. Send stories to Rob Spiegel for Sherlock Ohms.
Very nice fix; I like your troubleshooting methods and the way you tracked down the glitch. Your example is a valuable lesson for embedded designers who almost always have to figure out concessions with I/O.
I also like how you stayed on task and kept digging deeper and deeper until you fixed the root cause, rather than just put a patch on it. Good job and great commitment to quality.
Good point, Greg. These Sherlock Ohms stories are famous for showing how design engineers have to dispense with all assumptions and dig into areas that could easily get overlooked. If you have any of your own Sherlock stories, please send them along to: rob.spiegel@ubm.com
This is all over my head and I am way out of my comfort zone here, so if this question does not make sense, just consider the source. 99% of what I do is mechanical, and when I complete a design and the device is built, it is subjected to a run-off, often with the customer present. If changes are made, the drawings get updated and the alteration is logged. That way, if I run into a similar design, I have a record of what did not work and how we corrected it. Is that done in your field as well, or would someone else have to go through the whole troubleshooting procedure you just spelled out?
One of my co-workers is so tunnel visioned that he thinks the only important thing is to make the device work. As a result, many alterations may occur with the only record being what he retains between his ears. It drives me crazy, but his family owns the company, so I deal with it.
Tool_maker, in a perfect world, we would document everything every time, but it's not always practical. In a large organization, thorough documentation is necessary to handle the logistics of communication. In a small mom-and-pop shop, you can get away with a certain amount of "mental" documentation because direct communication with those who know the details is easier.
This can be taken to extremes on both sides of the continuum. I used to work for a huge multinational conglomerate. We documented ourselves to death, literally. It took seven signatures and half a day to get an ECO approved and documented. That's IF I walked it through myself. The normal lag was about two weeks.
On the other hand, if you lose your human capital who happens to be the sole keeper of odd knowledge for a given project, good luck making heads or tails of it.
You have to find that balance of productivity and proper documentation with which you are comfortable.