Comments
Rob Spiegel
User Rank
Blogger
Watch out for Mother Nature
Rob Spiegel   10/4/2013 7:16:01 AM
NO RATINGS
This is great. The system fails because Mother Nature was not taken into account. Bravo for the engineer who could see through the technology to a natural solution.

far911
User Rank
Silver
Re: Watch out for Mother Nature
far911   10/5/2013 1:15:34 PM
NO RATINGS
TRUE

Battar
User Rank
Platinum
And nobody noticed...
Battar   10/6/2013 10:21:27 AM
NO RATINGS
You're telling me that for 3 months nobody noticed that the system "occasionally crashed" only during a thunderstorm? And nobody noticed a link between "thunderstorm" and "electrical interference"? There are only two possible explanations for this:

1) Your software guys were pure software, with no knowledge of hardware (not credible), or

2) your installation (like the PDP-11/VAX computer rooms I worked in when I wore a uniform) was situated underground.

Nancy Golden
User Rank
Platinum
Re: And nobody noticed...
Nancy Golden   10/6/2013 4:41:58 PM
NO RATINGS
As amazing as it sounds that they couldn't see the connection between a thunderstorm and a system crash...those software guys simply could not think outside of the box. They were so busy defending the performance of the system that they couldn't see the obvious. Sometimes people get tunnel vision and need someone from the outside to point things out. 

William K.
User Rank
Platinum
Re: And nobody noticed...
William K.   10/7/2013 5:59:12 PM
NO RATINGS
"Just one little error." That sort of problem can bring huge systems to a crashing stop, with far worse results than in this posting.

But what I didn't get was whether ignoring the receiver input was a hardware function that was not included or a software function that was not switched on.

jgundie
User Rank
Iron
Re: And nobody noticed...
jgundie   10/7/2013 8:21:40 PM
NO RATINGS
It was a hardware function that was not implemented correctly. I suspected the person who designed the circuit also did the test verification that showed it worked correctly (:|), repeating a conceptual error. The system had been well tested in CA without many problems.

jgundie
User Rank
Iron
Re: And nobody noticed...
jgundie   10/7/2013 8:16:04 PM
NO RATINGS

Nancy, you made me think more about the problem. What's not said is that the data transmission often had errors caused by the lightning, and CRC testing would catch them. Also, I would guesstimate there could be over a thousand hits a day (a "single" bolt of lightning probably created multiple data hits). At 1 ms per data packet there were almost 100 million packets a day, so 1,000 packets a day being thrown out was not a flag of concern but an indication that the system was working correctly.

With a 100 ns window of opportunity in a 1 ms time window, probably only 1 out of 10,000 hits could get past the CRC protection (note the lightning had to hit only in the last 100 ns, not before; if it hit before, it would be detected and thrown out by the CRC detection). That in turn suggests that there would be a crash only once every 10 to 100 days. As I recall, a three-week interval between crashes was once spoken of. Also, Florida was considered the lightning capital of the world (Congo beats them out), with Tampa recording 21,000 cloud-to-ground (Ju 93); cloud-to-cloud probably affected our system too. For perspective, a bolt of lightning can exceed 50 kA and have rates of change of 40 kA/µs. The source voltage behind this gets very high.
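The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the hit rates (100 to 1,000 per day) are guesses chosen to match the comment's numbers, not measured data, and the function name `days_between_crashes` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

PACKET_S = 1e-3     # one data packet per millisecond (from the comment)
WINDOW_S = 100e-9   # vulnerable window at the end of each packet (from the comment)


def days_between_crashes(hits_per_day, packet_s=PACKET_S, window_s=WINDOW_S):
    """Expected days between crashes, assuming hits land uniformly in time."""
    # Fraction of hits that land in the unprotected window: 100 ns / 1 ms = 1 in 10,000.
    p_unprotected = window_s / packet_s
    crashes_per_day = hits_per_day * p_unprotected
    return 1.0 / crashes_per_day


# Packets sent per day at one packet per millisecond (about 86 million).
packets_per_day = SECONDS_PER_DAY / PACKET_S

if __name__ == "__main__":
    print(f"packets per day: {packets_per_day:.3g}")
    for hits in (100, 1000):  # assumed hit rates, not measurements
        print(f"{hits} hits/day -> one crash every {days_between_crashes(hits):.0f} days")
```

With the assumed 100 to 1,000 hits a day this gives roughly 100 and 10 days between crashes, which brackets the three-week interval recalled above.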

Nancy Golden
User Rank
Platinum
Re: And nobody noticed...
Nancy Golden   10/7/2013 9:26:35 PM
NO RATINGS
Thanks for elaborating, Jim. As a test engineer, I have often run into what some people would call obvious failures only to find that the issues were much more subtle - the obvious failure was merely a symptom of a much more complex issue that could be related to either hardware OR software. That is the challenge of electronics - the obvious answer is not always the correct one.

Ann R. Thryft
User Rank
Blogger
Re: And nobody noticed...
Ann R. Thryft   10/17/2013 6:44:27 PM
NO RATINGS
I find it mind-boggling that the engineers couldn't make the connection, and so many times. But sad to say, I have known many software people who don't seem, uh, connected to the physical world and how it works.

Nancy Golden
User Rank
Platinum
Re: And nobody noticed...
Nancy Golden   10/17/2013 8:00:20 PM
NO RATINGS
I think part of that depends on your discipline, Ann. As a test engineer it is critical to have a high awareness of both hardware and software operation...if you only think about one or the other you won't get very far. However, most folks do seem to be a bit better at one than the other - I guess that might be a function of how our brain works...while my husband and I do both hardware and software - he has more hardware expertise and I have more software expertise so when we do projects together he typically does the HW and I do the SW. So of course whenever there is a problem - it must be the HW :)

Ann R. Thryft
User Rank
Blogger
Re: And nobody noticed...
Ann R. Thryft   10/21/2013 1:04:00 PM
NO RATINGS
Thanks for that insight, Nancy. I agree about the mix of disciplines and awareness needed for test engineers. The same is clearly needed for engineers in charge of an operation like this one.

TJ McDermott
User Rank
Blogger
Re: And nobody noticed...
TJ McDermott   10/25/2013 2:22:33 PM
NO RATINGS
Any substantial building could buffer the occupants. I'm reminded of several I've worked in, from office buildings to concrete-roofed factories.

Mydesign
User Rank
Platinum
Natural considerations
Mydesign   10/7/2013 5:44:46 AM
NO RATINGS
Jim, interesting experience. I think normally we don't account for such natural things during the system design phase. This shows the necessity of considering them.

jgundie
User Rank
Iron
Re: Natural considerations
jgundie   10/7/2013 8:37:53 PM
NO RATINGS
 

The system design spec was good, and in this respect, if it had been met there would not have been a problem. The spec specified that the digital data receiver inhibit the data input during the interrupt interval. The hardware implementation somehow missed doing what was specified, although I believe the designer thought he or she had met the requirement.

William K.
User Rank
Platinum
Re: Natural considerations, and exceptions
William K.   10/7/2013 9:09:32 PM
NO RATINGS
Most systems that fail to allow for an exception will perform adequately, or even quite well, until that exception occurs. Then there is a failure. If the system is robust enough there may be an automatic recovery; otherwise a wander-off, or a crash. The crash is what your system did, although it sounds like it was a "wander off then crash" mode. The challenge is, and has been, to handle the exceptions correctly.
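As a rough illustration of handling the exception instead of crashing, here is a minimal Python sketch of a receiver that drops a packet when its CRC does not match. The framing (payload followed by a 4-byte CRC-32) and the names `make_packet` and `receive` are assumptions made up for this example; the thread does not describe the actual packet format or CRC used, and this sketch does not model the 100 ns window in which the real system was unprotected.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Hypothetical framing: payload followed by its CRC-32 as 4 big-endian bytes."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def receive(packet: bytes):
    """Return the payload if the CRC matches, otherwise None (packet is dropped)."""
    if len(packet) < 4:
        return None  # too short to even carry a CRC
    payload = packet[:-4]
    received_crc = int.from_bytes(packet[-4:], "big")
    if zlib.crc32(payload) != received_crc:
        return None  # corrupted in transit: discard it and keep running
    return payload


good = make_packet(b"sensor reading 42")
# Flip one bit in the first byte to imitate a noise hit on the line.
bad = bytes([good[0] ^ 0x01]) + good[1:]

if __name__ == "__main__":
    print(receive(good))
    print(receive(bad))
```

The point of the design is that a bad packet is an expected event with a defined outcome (return `None` and carry on), so the caller never sees corrupted data and the receive loop never falls over.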

Nancy Golden
User Rank
Platinum
Re: Natural considerations, and exceptions
Nancy Golden   10/7/2013 9:31:47 PM
NO RATINGS
Great point, William - error handling can make a huge difference in system operation. Sometimes it takes a while for a specific error to show up and then error handling code is introduced after the fact...it can be hard to anticipate all of the failure modes that are possible and to have code written up front to handle all possible scenarios. Windows OSs are classic examples of this concept!

William K.
User Rank
Platinum
Re: Natural considerations, and exceptions
William K.   10/7/2013 10:14:04 PM
NO RATINGS
Nancy, even beyond actual errors, there are exceptions, which may be perfectly OK, but beyond the realm of what the system was prepared for. All Windows OSs are perfect examples of not being prepared or able to handle anything except what the program writers thought they should handle. And anybody who thinks differently than them is in for things not working "right".

Nancy Golden
User Rank
Platinum
Re: Natural considerations, and exceptions
Nancy Golden   10/7/2013 11:08:50 PM
NO RATINGS
And anybody who thinks differently than them is in for things not working "right".


Or for the blue screen of death...

William K.
User Rank
Platinum
Re: Natural considerations, and exceptions
William K.   10/8/2013 4:17:14 PM
NO RATINGS
Yes, Nancy, the "BSOD" syndrome is one of the indicators that an attempted action was "outside of the realm."

Charles Murray
User Rank
Blogger
BSOD
Charles Murray   10/10/2013 3:18:28 PM
NO RATINGS
Ah yes, Nancy, the old blue screen of death. We don't seem to hear about that anymore.

btlbcc
User Rank
Gold
Florida Lightning
btlbcc   10/7/2013 2:19:19 PM
NO RATINGS
I read somewhere that Florida is the most lightning-active area in the USA.  I suppose one can get used to anything...  And apparently the computer crash didn't happen with every thunder crash, so it's understandable why the software guys didn't catch it as being a hardware problem.

Brooks Lyman

AnandY
User Rank
Gold
RE: Nobody Noticed
AnandY   10/8/2013 8:16:59 AM
NO RATINGS
It's unbelievable that for 3 months the computer programming guys never noticed that the computers crashed only during thunderstorms. Surely, they should have. It's funny how technology makes us forget about nature. This shows how closely the two are related.

Charles Murray
User Rank
Blogger
Far-away lightning strikes
Charles Murray   10/10/2013 3:16:55 PM
NO RATINGS
Amazing story, Jim. Talking to engineers at Littelfuse, I've heard that lightning strikes can cause problems from as far away as a half mile or more. The effects of lightning can really be surprising.



Copyright © 2014 UBM Canon, A UBM company, All rights reserved. Privacy Policy | Terms of Service