naperlou
User Rank
Blogger
Configuration Control
naperlou   9/5/2013 8:52:59 AM
NO RATINGS
Jason, configuration control and documentation are critical to ensuring that a design is correct and can pass from prototype to production.  When "patches" go undocumented, as you discovered with the evaluation board, then it becomes impossible to correctly configure the system.  Software has dealt with this situation for some time.  Frankly, I always thought hardware did as well.  Just goes to show...

jwevans
User Rank
Iron
Re: Configuration Control
jwevans   9/6/2013 2:20:03 PM
NO RATINGS
I think many of the problems I encountered were much more than configuration control issues.

The company produced (and still produces) immensely complicated processor chips and other supporting silicon. It also produced (and still produces) motherboards, server blades and other subassemblies for retail label manufacturers. I worked for this company during the 'Internet Boom' of the late nineties and early aughts. The company was making acquisitions left and right in an attempt to become a major player in the internet hardware space. While the company had excellent configuration control tools and systems in place, the adoption and use of these tools and systems by the newly acquired entities proved to be uneven at best. The company also had a difficult time fostering a culture of cooperation amongst many of its new acquisitions.

The division I worked for had been a medium-sized company in a shrinking niche of internet technology. (I joined after the company had been acquired.) There was a lot of bad blood between the hardware and software groups, which sadly was not properly addressed after the acquisition. The division which produced the problematic processor had been part of a once great technology company which was in steep decline. Morale issues as well as a culture clash with the new owners led, I believe, to communication failures.

This is as much a story of a business trying to rapidly enter a market by buying as many components of that market as possible and failing to integrate them as it is a story about insufficient or missing configuration control.

laser_scientist
User Rank
Iron
Human nature
laser_scientist   9/6/2013 8:43:53 AM
NO RATINGS
Actually, I have seen this phenomenon in various other development situations, as well, except that you could replace "firmware team" and "hardware team" with "hardware module 1 team" and "hardware module 2 team". I think it's human nature (at least for the engineer/scientist humans) to make up your mind what the problem is, and then proceed as though your notion is the truth -- until proven otherwise. Which is exactly what the hardware engineers did in this case.

 

tekochip
User Rank
Platinum
Difficult to determine
tekochip   9/6/2013 8:57:40 AM
NO RATINGS
It's very often difficult to determine if a problem is caused by hardware or software.  I had a problem with an NXP processor that would crash whenever the I2C interface was turned on.  It sounds like software, but when all other tasks were halted and only the I2C ran, everything was fine.  OK, maybe I'm running out of execution time?  No, no problem there either.  NXP eventually sent out an erratum stating that the Vdd bond wire in the chip had too large a voltage drop, so the chip would crash when the processor was pulling a great deal of current.


Jim_E
User Rank
Platinum
Great find
Jim_E   9/6/2013 9:07:25 AM
NO RATINGS
That's a good story!  (and by a fellow Evans too...)  I like how just noticing that it took a slightly longer time to fail from a cold start gave you the idea of how to narrow down the problem.  I've never worked for a super-large company with all of those divisions, but I see how there could be communication issues.

At least all of your hardware came internally.  I recall dealing with a hardware problem (not as severe as yours) when writing firmware, but the hardware was from an outside supplier who refused to accept that they had a problem.  We eventually had to write a workaround in software.  The old "can't you guys fix that in the software" solution.

 

jwevans
User Rank
Iron
Re: Great find
jwevans   9/6/2013 12:32:34 PM
Thank you. There were many communication issues within the company. My group found out later that another division was using the exact same processor and was aware from the start of the memory timing issue. I don't know if that team was told about the problem or had discovered the modifications on the evaluation board before spinning their hardware. Ironically, the company prided itself on its flat management hierarchy leading to efficient communication.

garysxt
User Rank
Iron
HW vs SW
garysxt   9/6/2013 11:56:20 AM
NO RATINGS
In the projects I manage I have a rule that both software and hardware are guilty until proven innocent.  Both groups are expected to work together to solve the problem. The SW guys can dismiss the HW team or vice versa if they feel the ball is in their own court, but one group can't claim the problem is on the other side and walk away on their own.

In the first week on this job I overheard some software guys discussing a problem. I went over and offered to help (it sounded like possibly a couple of shorted address lines). It turned out to actually be a software problem, but I gained a lot of respect from that incident. They were used to both sides pointing fingers at each other.

jwevans
User Rank
Iron
Re: HW vs SW
jwevans   9/6/2013 11:15:17 PM
NO RATINGS
Shortly after I started at the company, the project manager asked me to sit in on a hardware design review. When I walked into the conference room, I was met with expressions of suspicion and contempt from the members of the hardware team. Apparently, software folks were not supposed to attend hardware design reviews. This was the complete opposite of my previous employer where hardware and software worked together from product inception to release. This tight working relationship, I believe, helped us avoid many integration problems. It also allowed us to build some extra "margin" into the design to deal with unexpected hardware or software problems.

 

William K.
User Rank
Platinum
Is it hardware or is it software?
William K.   9/6/2013 8:16:30 PM
NO RATINGS
Years ago, when computers came in all kinds of flavors, we worked with a rather enlightened software house. And of course there did arise failures to operate correctly. Our agreement with the software folks was that as soon as we could adequately describe the problem to them, they would start looking as we were looking, and whoever found the problem immediately phoned the other to halt their searching effort. Thus the time wasted was minimal, and no fingers were ever pointed until after the project was successful. And we did point out the faults in detail so that we could all learn from them. So all of us got better at what we did.

Larry M
User Rank
Platinum
Deja vu all over again
Larry M   9/10/2013 4:58:13 PM
NO RATINGS
In 1995 I was leading design of a PCMCIA modem based on a reference design and reference firmware. We made some modifications for safety, EMC, and to improve performance of the analog front end which weren't related to the problem we encountered.

When we got prototype boards delivered we installed the latest firmware from the chip manufacturer. Once in a while the cards would boot completely but mostly they would hang when installed in a particular type of computer. The chip manufacturer would admit to no problems. I puzzled over this for a couple of weeks. One day, while I was taking a break from the lab to catch up on other tasks, my manager dropped by and asked "Why aren't you in the lab? I want you in the lab full time until this problem is solved. What is the problem, anyway?"

We went to the lab and I was going to demonstrate the problem--and the card booted successfully. I immediately powered it off and restarted it--and it hung, but exhibiting a failure mode I had not previously seen. I immediately restarted it and it failed in the usual manner--several successive times.

This was my clue! I asked the manager to stay right there, walked to the next room and returned with a can of freeze spray. I heavily sprayed the modem chipset and the card booted successfully several times. At last I had identified a way to make the problem come and go. Definitely a race condition of some sort, but it wasn't even clear whether the cause (or the cure) was hardware or software.

Since the manufacturer of the chipset declined to release source code I had to get him to admit to the problem and address it. The first step was to send him a computer where the problem occurred. Then I had to get him to actually look at it. Ultimately this took the threat of "If we don't hear from you by Friday, we'll be on your doorstep Monday morning to help you with it." Late that Friday night I got a call at home.  "Don't come! Don't come! We've fixed the problem and are sending you a software fix." It turned out that one of the initialization steps was to read an unused bi-directional I/O register. Unfortunately the default power-on state for this register was high-impedance, and the internal pullup resistor was not enabled until the instruction after the read instruction. The fix was to reverse the sequence of these two instructions.
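
For anyone who wants to see why the order of those two instructions matters, here is a minimal sketch in C. It is a toy model, not the chipset's firmware: the `io_port` struct, the function names, and the `floating_value` field (a stand-in for whatever an undriven high-impedance pin happens to read) are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one bidirectional I/O port; not the real chipset. */
typedef struct {
    bool    pullup_enabled; /* false at power-on, as in the story */
    uint8_t floating_value; /* stand-in for whatever an undriven pin reads */
} io_port;

/* With the pull-up on, undriven pins read high; otherwise the value floats. */
static uint8_t port_read(const io_port *p)
{
    return p->pullup_enabled ? 0xFF : p->floating_value;
}

static void port_enable_pullup(io_port *p)
{
    p->pullup_enabled = true;
}

/* Original order: the read happens while the pins are still high-impedance. */
static uint8_t init_read_then_pullup(io_port *p)
{
    uint8_t value = port_read(p);
    port_enable_pullup(p);
    return value;
}

/* Fixed order: enable the pull-up first so the read is well defined. */
static uint8_t init_pullup_then_read(io_port *p)
{
    port_enable_pullup(p);
    return port_read(p);
}

/* Runs both orders against two different floating values. */
static void compare_init_orders(void)
{
    const uint8_t floats[] = { 0x00, 0x5A };
    for (int i = 0; i < 2; i++) {
        io_port a = { false, floats[i] };
        io_port b = { false, floats[i] };
        assert(init_read_then_pullup(&a) == floats[i]); /* follows the pin */
        assert(init_pullup_then_read(&b) == 0xFF);      /* always the same */
    }
}
```

In this model the read-first order returns whatever the floating pin happens to hold, so the result can differ from one power-up to the next, while the pullup-first order always returns the same value. That matches the shape of the fix described above, though the real register behavior was surely more involved.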

Reading your experience brought back memories...

Ann R. Thryft
User Rank
Blogger
HW/SW integration is key
Ann R. Thryft   9/16/2013 1:27:29 PM
NO RATINGS
Thanks, I enjoyed this one a lot. I once worked in marketing at FORTH Inc., which designed the version of chipFORTH used as the RTOS/HLL/compiler/IDE for Federal Express' first handheld tracking device. I was amazed to discover that code could directly influence power usage and management in the HW. HW/SW integration is key.

Cabe Atwell
User Rank
Blogger
Re: HW/SW integration is key
Cabe Atwell   10/23/2013 6:09:48 PM
NO RATINGS
I've had a similar issue with my desktop. After hours of trying to figure out why it would crash after start-up (testing every piece of hardware), it turned out to be a memory timing issue. Just goes to show that whenever there is a problem, all possibilities should be considered.

Ann R. Thryft
User Rank
Blogger
Re: HW/SW integration is key
Ann R. Thryft   10/23/2013 7:14:39 PM
NO RATINGS
Cabe, that reminds me of a messed-up system clock due to an obscure faulty chip (I forget what it was--not something obvious like memory or processor), which was causing very odd system behavior on my laptop. We went through tons of trouble-shooting routines before an online forum gave us the clue.

Me1anie
User Rank
Iron
Digital Capacitors
Me1anie   11/4/2014 9:24:44 AM
NO RATINGS
About a year into designing digital electronics (8080 processors, 7400 logic, & such), I was in awe of one of my peers because of the size & complexity of the box he was designing.  Fact was, I was jealous he had such a large design to himself.  With such a large box he worked lots of long hours & weekends getting his box to work.  With so many long hours & weekends, mgmt thought he was the best.

He sold off his prototype unit to the Operations group & then he left the company.  So Operations started replicating his boxes & they didn't work.  I was selected to resolve the problem.  It took only one look at the backside of his wirewrap boards to see numerous capacitors wire-wrapped to the wirewrap posts.  Looking at his schematics & observing how those caps hooked into his circuits revealed he was using them all over the place on his timing & latch lines to control race conditions.  I spent the next month or thereabouts redesigning his circuits so that those "digital capacitors" weren't needed.

About 2 yrs later I inherited another situation like this but in firmware.  Once again I was in awe of this engineer's long hrs & diligence.  He left & I inherited his mess too.   But rather than go into his mess, what I really learned from these two experiences was that just because someone puts up a heroic front & long devoted hours, etc, etc,... doesn't make them a "super engineer".  It could just be they're in over their head & scrambling.  Both of these guys were too proud or embarrassed to admit they didn't know something & ask someone for help. 

Copyright © 2014 UBM Canon, A UBM company, All rights reserved. Privacy Policy | Terms of Service