Engineers investigating a randomly resetting computer on an oil rig learn the perils of making assumptions
By John Linstrom, Contributing Writer
Years ago I worked for a company that made single-board computers. They started out as an industrial-grade product; over time we ruggedized them, added daylight-readable LCDs, and had them NEMA rated, FCC approved and HAZLOC certified.
One of our first customers wanted to use them as the HMI for a robotic oil-drilling engine. The operator would use our computer to control the engine that staged and coupled together drill pipe for the string, and to run and monitor the drill motor, drilling fluids and other parameters during the well penetration.
Customer service got an angry call from a drill platform in the Gulf of Mexico. Our equipment was randomly resetting and the whole drill station integration was delayed. Sign-off and the commissioning date were coming up soon. Delays and penalties were mentioned; the rig owner was on the bus to Crazyville, sitting by a window.
I was sent, along with a field service engineer from the robotics system integrator, to fix our problems on the rig. A similar system had been run at our factory under temperature and voltage stress and no problems were found, so ship’s power was suspected. We brought along a constant voltage transformer, the approximate weight and size of a small V8, just in case. Being hoisted 300 feet by cable to the rig deck was my first clue that this wasn’t just a “shake and howdy at the front door” service call.
The computer in question had MOVs across the power input, and before we had a chance to try the C.V.T., an MOV blew - the noise and smoke from a V130LA20 got the crew twitching. AC power held steady at 120VAC/60Hz, but ground-to-neutral voltage was all over the place. We scoped the input anyway; aside from the uncommitted neutral, AC IN looked clean enough. We forgot the stinky MOVs and assumed that power was good.
The computer was in a control desk console on the drill floor, protected from wayward pipes by a batter’s cage on steroids. Right outside the tall doors were the open-air pipe racks where additional lengths of drill pipe were stored. Power and I/O to the desk looked acceptable, but detaching external cables didn’t stop the resets. All our troubleshooting found no cause - our test software failed as randomly as the robotic system application code.
There was nothing we could do to make things better or worse. Unplugging peripherals, monitoring voltages, swapping hard drives, changing any available hardware made no difference. Door open, door closed, the computer reset when it wanted to. I was scoping an anomalous reset signal and started calling out its cadence, trying to think out loud in frustration.
“Keep calling out those resets!” the field engineer suddenly yelled. “Swap places with me and look out toward the rack bay while I call out the resets. Watch the ship’s radar.” Reset blips tracked the antenna sweep like the second hand on a clock, except for the few times when chain hoists were in the way or the metal bay doors were shut tight.
The computer had a water-resistant door gasket (that was not conductive) and had gone through EMI/EMC testing during compliance approvals - but not at the frequency or intensity of the ship’s radar. A conductive gasket inside the door seal was an easy fix to a show-stopping problem. A permanent fix was to spec a custom waterproof and wire mesh door.
No one at either company had asked, and the customer had not mentioned, the user’s environment beforehand, since after all this was a “ruggedized” PC. The customer wanted to use the product where he wanted, no excuses, thank you!
Contributing writer Jon Linstrom is a 1982 EE graduate from the University of Kentucky. A design engineer, he does analog, low-level circuit design.