The Adventure of the Rebooting Computers
By Jean Donato, Contributing Writer
Putting an engineer on surveillance detail solved this perplexing case.
Back in the 1990s, I was a systems integrator for a major control systems manufacturer. These control systems combined sophisticated in-house software with custom and off-the-shelf hardware. The latter mostly included Unix servers and workstations.
Integration and testing were typically performed on the company premises. We had access to a large staging area, where a few multi-month (or multi-year) projects were being handled in parallel. Each project made use of dozens of these servers and workstations, manufactured by a well-known but now-defunct company.
Since many of these projects were very similar, this was routine work which usually yielded few surprises - except during one hot summer. In the course of staging one of these projects, the integrator found that most of its computers had unexpectedly rebooted overnight. Operating system error logs revealed nothing useful, and no other software would have been running on these computers during their reboot. The next morning, the problem was seen again. And so on, nearly every morning after that!
Suspecting a sinister motherboard or power supply problem, the integrator contacted the computer manufacturer. They swapped parts and more parts. Still, the reboots persisted. The AC power quality was monitored and no anomaly was found. The operating system was also updated and patched, but to no avail.
One fact was clear: Suspiciously, the reboots always happened around the same time every night.
Finally, the decision was made to park an engineer on the test site overnight, with the simple instruction to keep his eyes peeled. Dimly lit and very quiet, the staging area was devoid of any other humans. Our engineer must have thought this red-eye exercise was a major time waster. Then in the middle of the night, he saw the security guard, a recent hire, making his rounds. The guard dutifully walked into every staging area, including the one in which the rebooting computers were located. As the guard walked in front of the row of workstations and servers, the powerful radio on his belt was a mere few inches away from their keyboards. Our engineer could not believe his eyes: As the guard moved along this row of keyboards, nearly every computer behind him rebooted instantly and silently.
The next day, it was confirmed: Simply waving the guard’s radio a few inches from one of these keyboards generally caused the connected computer to reboot instantly. The keyboard was branded and supplied by the same computer manufacturer.
Upon learning of this, the manufacturer theorized that the computers were affected by the walkie-talkie’s strong signal coming in through the keyboards, so they improved the shielding inside the plastic housing of the keyboard.
As an initial test, they applied a metalized shielding spray to the underside of the keyboard’s plastic shell. After the spray dried and the keyboard was reconnected to the computer, the system proved immune to the guard’s powerful walkie-talkie, as it should have been in the first place.
Following additional and more rigorous testing, a new official revision of the keyboard was produced and this became the minimum revision that we could ship to our customers.
The case of the rebooting computers had thus been solved.
Jean Donato graduated in 1990 in Electrical Engineering, specialized in Computers, in Montreal, Canada. He has been working for almost 20 years in various industries, with positions related to project management, control systems engineering, worldwide customer support and information systems.
jjvman commented:
In the mid '70s, I worked in a small electronics manufacturing company. We sold some standard items, but also designed and built custom equipment. Most of it was put into a larger enclosure (typically a 3' x 3' x 18" Hoffman). As our RFI test, we ran the system, put an old power drill near it and pulled the trigger. If anything unexpected happened, we re-examined the shielding design.
drgizmo commented:
Long ago when I serviced EKG machines in doctors' offices, I was called in to troubleshoot an intermittent noise problem. The nurse showed me to the offending instrument and I connected my ECG simulator and verified there was definitely random noise. After the nurse left I disconnected my simulator and started to get my tool kit out. As soon as I had my tools out and stopped making noise I started hearing very faint voices coming from the EKG machine. When I bent over to hear better I found out it was actually singing Elvis' "Blue Suede Shoes". I was so surprised I did a sanity test and had the nurse come in and verify that she also heard the singing. I looked out the window and noticed an AM radio station next door. The EKG's galvanometer amp was doing a great job of detecting and amplifying the top 40. Since they already had a radio, I redid all their ground connections and eliminated that "feature" from their instrument.
Thomasj commented:
I work tech support for a company that produces intelligent motor starters. The first revision of the control board had this very problem, again found by a security guard walking past (a second hardware revision was subsequently produced). My favorite call was from a company that, after nearly 15 years of perfect operation, had most of its motors under 30 hp (the sizes affected) turn off between 5:00 pm and 6:00 pm every day for almost a year. I asked if they had put up any radio systems. They had not, but a cell phone tower had been built less than a mile away, and sure enough, it had gone up almost a year earlier. After a board swap, the problem ceased.
jkn commented:
At least the IEC 61800-3 EMC standard for motor drives defines a "walkie-talkie test" in Annex A3.2.2 as a complementary test for equipment that is too big to fit in any test facility.
JP commented:
Indeed, radio interference is often misunderstood and not always taken into account at the design phase: assuming 'clear skies' is not always realistic! Since computers are now ubiquitous and may control health-sensitive devices, better designs, which may include redundant or backup systems, seem essential.
Daryl commented:
As EMC consulting engineers, we have seen this problem many times. And yes, trunked radio systems and cell phones regularly transmit without our knowledge. An exciting case for us was a nuclear power plant that had experienced false alarms, indicating a nuclear radiation leak. Needless to say, there were some moments of panic when that occurred. We discovered that keying a guard’s radio near one sensor could cause the false alarm.
The good news is RF-induced upsets have decreased significantly since 1996. At that time, the European EMC regulations mandated hardening most commercial electronic equipment to withstand RF levels of 3 to 10 Volts/meter. Prior to that time, we had seen failures at 0.1 Volt/meter or less. Thus, most equipment today is much more robust than pre-1996 equipment. More good news is that the fixes are often simple -- a few well placed 1000 pF capacitors and/or some ferrite beads often solve this problem right at the circuit board level.
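As a rough illustration of why a small 1000 pF capacitor can help, the sketch below computes the reactance of an ideal capacitor at a few frequencies. The frequencies are illustrative assumptions (the 450 MHz figure is borrowed from another comment in this thread), and real parts have lead inductance and a self-resonant frequency that this simple model ignores.

```python
import math


def capacitor_reactance_ohms(capacitance_farads, frequency_hz):
    """Magnitude of an ideal capacitor's reactance: 1 / (2 * pi * f * C)."""
    return 1.0 / (2.0 * math.pi * frequency_hz * capacitance_farads)


BYPASS_CAPACITANCE = 1000e-12  # 1000 pF, the value mentioned in the comment

# Illustrative frequencies (assumptions, not taken from the comment above).
EXAMPLE_FREQUENCIES_HZ = [
    ("60 Hz mains", 60.0),
    ("1 MHz", 1e6),
    ("150 MHz (VHF handheld)", 150e6),
    ("450 MHz (UHF handheld)", 450e6),
]

for label, freq in EXAMPLE_FREQUENCIES_HZ:
    ohms = capacitor_reactance_ohms(BYPASS_CAPACITANCE, freq)
    print(f"{label:>24}: {ohms:,.3f} ohms")
```

At low frequencies the capacitor looks like an open circuit and leaves the wanted signal alone, while at handheld-radio frequencies the ideal model drops below one ohm, so induced RF is shunted away before it reaches a sensitive input.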
harsh@neubauplan.com commented:
Fascinating, really, but hardly surprising. I have an HP wireless keyboard+mouse with my office desktop, and every now and then, the cursor starts creeping towards the left all by itself. It is irritating but not critical, but being a dyed-in-the-wool mechanical engineer, I can't figure out what's happening here. Any Sherlock to help me?
p852pck commented:
Back in the mid to late '80s, I had a similar thing happen. I sold CAD systems and one of my customers told me he was experiencing strange behavior on his computer. At times, he could wave his bare hands over the keyboard and random characters would be "typed" in. Well, we swapped keyboards of different makes and it kept happening, but not in a reproducible way.
Finally, we noticed that when the person in the next cube was talking on the phone, we got the problem. Turns out he brought in a cordless phone from home. We were able to reproduce it every time he used it. He was ordered to get it out of there immediately. Brings back memories of new computer users storing their important files on a floppy and sticking it to the side of the file cabinet with refrigerator magnets.
David Z. commented:
I thought the story was heading toward the one I heard while working in a NYC hospital's medical electronics department: Only too often, it was told, the patient in the first bed near the door of a critical care unit would die during the night. The department found no problems with any life-support systems at that bed position. Finally, after putting a video camera in place, it was discovered that one of the cleaning staff would unplug our equipment from a wall outlet for about 15 minutes at night to clean the floor. -dz-
Skokian commented:
This is one of the tests that we routinely did at a company for which I worked designing fire protection equipment and battery chargers for stand-by gen-sets. We also used cell phones and portable drill motors to test for both radiated and line-conducted transient susceptibility. Subsequent testing to the Generic EN Norms for Immunity showed that the products could easily withstand the full battery of tests at double the rated levels.
McGyver commented:
RF always a strange animal... Back in my early Cable TV days (when we couldn't even fill the 36 channels we had!) we had an interference complaint from a non-subscriber that his neighbor's recently installed cable was interfering with his off-air reception. He "KNEW" this HAD to be true because the interference happened as soon as the neighbor would come home and allegedly turn on his cable. After trying to explain that it simply couldn't happen that way, performing an interference and leakage inspection (there wasn't any interference or leakage from the cable plant at all), the complainant still would not accept the explanation and threatened to call the FCC... We finally agreed that we would briefly shut off the surrounding area's cable to prove that the cable wasn't at fault. When the neighbor came home the cable system power supply was shut off... and the interference stayed. The neighbor came out to see what the deal was and asked to have his set checked (he had a manual tuner that needed a bit of adjustment)... inside I noticed his police scanner set to the local PD frequency... I asked if it was on all day and he replied "No, just when I'm home"... He turned off the scanner and I checked with his neighbor... interference gone! I left it up to the two of them to work things out.
There was a report in the cable trade magazines of a similar incident, only the offender turned out to be the clock oscillator in the neighbor's car radio! It was an early digital clock that was actually producing high enough RF to radiate several hundred feet!
Dave commented:
I once worked at a company where the programmers swore that the Zilog Z8000 CPU on a communications board would not execute a particular shift instruction correctly when the fan was running in the top of the equipment rack. It worked fine if the fan was unplugged. Just 1 instruction out of the whole instruction set! They worked around it by changing the code to accomplish the same result without using that instruction.
radio-active commented:
Legend has a similar tale: a mainframe at a then-major computer manufacturer in the '80s would reboot at around midnight every night. Finally they posted a tech to watch. Turned out it was a cleaning lady, plugging her vacuum into an outlet on the side of the mainframe. There was enough interference conducted from the vacuum to the mainframe to throw it into a tizzy and cause it to reboot.
sx881663 commented:
Many years ago I was faced with a similar problem but it wasn’t caused by a radio. I was the person who had to babysit the system to see what was causing the problem. It would lock up and reboot at 12:00 midnight, but usually only after a hot day. The problem was tracked down to the clock timer software hanging up, and this was due to a series of fast pulses coming in from the AC line. It seems that on a hot day the 60 Hz would be pulled to something less due to air conditioner demand and the power company would send out correction pulses to jump all the clocks to the correct time. A heavy series of correction pulses, as needed after a hot day, would kill it.
The interrupt servicing routine was modified to be able to handle the correction pulses and the problem was solved.
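The comment does not say how the routine was changed. One common pattern, shown here purely as an assumption, is to keep the interrupt handler trivial (it only latches a count) and do the timekeeping later in the main loop, so a burst of closely spaced pulses cannot overrun the handler. The class and method names below are hypothetical, and Python stands in for whatever language the original system used.

```python
class LineClock:
    """Toy model of a clock driven by AC line-frequency ticks."""

    def __init__(self, ticks_per_second=60):
        self.ticks_per_second = ticks_per_second
        self.pending_ticks = 0    # latched by the handler, not yet processed
        self.leftover_ticks = 0   # ticks that do not yet make a full second
        self.seconds = 0

    def on_tick_interrupt(self):
        # Interrupt context: do the bare minimum so a burst of pulses
        # only grows a counter instead of stacking up slow work.
        self.pending_ticks += 1

    def service(self):
        # Main-loop context: drain every latched tick in a single pass.
        ticks, self.pending_ticks = self.pending_ticks, 0
        self.leftover_ticks += ticks
        self.seconds += self.leftover_ticks // self.ticks_per_second
        self.leftover_ticks %= self.ticks_per_second


# Usage: a burst of 150 pulses arrives before the main loop runs again.
demo_clock = LineClock()
for _ in range(150):
    demo_clock.on_tick_interrupt()
demo_clock.service()
print(demo_clock.seconds, demo_clock.leftover_ticks)
```

The point of the split is that the cost of each interrupt no longer depends on how much clock bookkeeping is outstanding, which is what makes a heavy run of pulses survivable.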
HarWill commented:
Another twist: We had a development system that would fry the COM board one night per week. Seems the Physical Plant folks had installed a new AC outlet for our use. They didn't connect the ground pin and wired the outlet to the building clock circuit! At midnight, the Clock Master Controller would disconnect the 120V AC and put a large DC current on the line to sync all the clock hands to 12:00. The lowest resistance path to ground was through the COM board, which fried the ground trace and, coincidentally, blew the fuse in the Master Clock Controller! It took about a week for the clock fuse to get replaced and then the cycle repeated. Only overhearing one of the Physical Plant guys complaining about the clock fuse led to solving the puzzle!
Alan3354 commented:
Why would a radio emit a signal when it's not transmitting? They all have a push-to-talk switch.
William Ketel commented:
This is amazing! That a small two-way radio was actually radiating a detectable signal. Looking at the FCC rules about unintentional radiators, it would seem that the radio was defective, or, more likely, that there was a fundamental flaw in the keyboards. It is fairly well known that most cheap plastics are poor electrical shields, and that poorly designed equipment is excessively sensitive to RFI.
BUT WHAT WAS THE GUARD DOING THAT CLOSE, in the first place??
bakatya commented:
Handheld RF receiver/transmitter units must comply with FCC standards for spec. minimum power emission while not transmitting. Perhaps the RF unit emission was out of tolerance. Unless the workstations and keyboards were mil-spec, the shielding may not have been required and would have been an unnecessary expense.
iNARTE EMC Engineer commented:
Oh yeah, radios cause all kinds of problems, even low power ones! Check www.lbagroup.com/Wireless_University.php for a bunch of case studies and anti-RF application notes.
Don commented:
As a system engineer and a Ham radio operator, I have found that a tri-band 5W hand-held is a very good early check of the susceptibility of a system. If the system is fine at full power, point blank to the system, rarely will it have RF susceptibility problems when it goes through formal testing.
This quick check can be done early in the prototype stage when things are easy to fix. It is not guaranteed, but it is a pretty good indicator.
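To put a rough number on that check, the sketch below uses the standard free-space far-field approximation E = sqrt(30 * P * G) / d. It assumes an isotropic antenna (gain of 1) and all 5 W radiated, and at a few inches the equipment is really in the antenna's near field, so treat the closest figures as order-of-magnitude only. The function name and the list of distances are illustrative.

```python
import math


def far_field_v_per_m(power_watts, distance_m, antenna_gain=1.0):
    """Free-space far-field estimate of field strength in volts per meter."""
    return math.sqrt(30.0 * power_watts * antenna_gain) / distance_m


HANDHELD_POWER_W = 5.0  # the 5 W hand-held mentioned in the comment

# Illustrative distances (assumptions): 0.1 m is roughly "a few inches".
for distance in (0.1, 0.3, 1.0, 3.0, 10.0):
    field = far_field_v_per_m(HANDHELD_POWER_W, distance)
    print(f"{distance:5.1f} m: about {field:6.1f} V/m")
```

Even allowing for the crude model, a hand-held held point blank produces a field far above the 3 to 10 V/m immunity levels cited earlier in this thread, which is why surviving it is a reassuring sign.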
Frank Lambrecht commented:
The handheld/belt radio could have been keyed by the guard or it could have been a trunked radio wherein there is periodic handshaking with the base station. It works similarly to a cell phone which periodically transmits to the base station to stay registered.
Soon after I joined Motorola in Plantation, Florida, where we made pagers and portable (handheld) radios, we upgraded to a Dimension phone system. Two days after it went active, it died a horrible death. It seems one of the 2nd shift supervisors was walking by the telephone equipment room and used a portable radio (5W at about 450MHz) to speak with his second in command. When we got into work the next morning, EVERYTHING in the phone system was scrambled. There wasn't a single desk set which operated properly. Can you say "RF hardening?" Sure you can.
Frank
A ham commented:
I know that a radio’s local oscillator puts out a continuous signal, but 'most' of the time it is too weak to interfere with electronics. In this case, when the shielding was added to the keyboard case, it took the interfering signal to ground, thankfully. The timeline looks to be 10+ years ago.
I used VHF radios in the past to find bad shields in direct burial cables that were starting to get noisy. We would just have someone drive the line & transmit at each splice, while a tech was watching the cable’s signal on a spectrum analyzer. When we could see a spike, we had the bad splice. With hundreds of miles of cable, this was the most efficient method. It has now been replaced with fiber.
BV Engineer commented:
Fascinating. A classic case of EMC - Electromagnetic Compatibility. Even the spurious emissions of the radio were enough to cause a signal to latch in computers.
Red commented:
Hmmm, typically I have seen this happen when a VHF/UHF business band radio is transmitting, so I am having a hard time understanding how a radio in receive only would generate a strong enough signal to reboot just walking by.
OhDRK3O commented:
Nothing new. If the engineer had any field time, they would KNOW that radios put out wattage. Seen this way too many times, and they always blame the control manufacturer.