I have to agree that for other than the most straightforward calculations, Excel is not the way to go. I have had two or three bad experiences with MS Excel and use other software programs where needed. For statistical work, I use Minitab exclusively. I feel it's the very best application software for doing six sigma determinations. Excellent article Brent.
Mr. McDermott, thank you for bringing this to our attention. Per Chad's post, he developed his thoughts and e-book independently. He approached us about licensing the content, which PTC agreed to. This relationship is disclosed in the e-book itself. I didn't mean to mislead anyone by not fully disclosing the relationship directly in my original post.
Mr. Jackson, I have absolutely no problems at all with your credentials.
I was not clear enough when I questioned "the source".
My problem is with a guest blog that is in actuality an advertisement for the author's company.
I pointed out the host-location of your e-book because it strongly implies to me that PTC paid for the research and book (thus owning it). YOU note the licensing of it, but Mr. Edmonds of PTC did not. the way he presented you as author showed absolutely no linkage at all.
Good point, TJ. To use your analogy, a nail that's bent in some insignicant way could still be a functional nail. But if we are unable to recognize the difference between a significant problem and a small one, that's where it becomes problematic. The question I obviously don't the answer to is, how many of those 91% would be signicant problems?
I have avoided using excel for exactly the reason that it is such a huge pain to enter equations and it does seem to be quite user hostile. What is useful is an actual manual on how to make things work, not some useless help file that never ever thinks in the same ways that I think.
A few things that helps cut down on errors in spreadsheets:
(1) clearly show which values are inputs. I use bold text in yellow boxes.
(2) don't try to jam all of your expression in one box. just because you can, doesn't mean you should. It is just as easy to break a complicated expression down into parts or even do it in seperate steps. By taking on your calculations in steps using seperate cells you can have a better chance to see where things get honked up along the way.
(3) run test cases before you do all of your array. some of the techniques used in earlier code writing still are relevant in spreadsheets.
just some thoughts, there's better ways of using spreadsheets. Mathcad is a better way to visualize. Use whatever you think is best.
I've used Mathcad since version 4.0. I've also used MS Excel spreadsheets for many years for most if not all of my engineering calculations.
Garbage in, garbage out. Errors, errors, everywhere. Good engineering is being cognizant and meticuluous in every thing you do. What else can be said? It comes down to how easy it is to set up your calculation structures and how easy it is later to evaluate them later. Then, there's the issue of what kinds of calculations you want to do.
One of the great advantages of Mathcad (now belonging to PLC) is that it is a very visual means of setting up your calculation. The use of standard mathematical script and structures makes understanding the easiest of any other means I know. I use Mathcad as a test calculation method if some result is coming back wild and fishy. Sometimes you just need a hand calc. But, this is where the advantages of Mathcad stop. Early versions were difficult to arrange complicated setups, newer versions are somewhat better (I use Mathcad 14 now). It is not easy to learn how to make the structures look like they do in mathematics, no matter what the sales people say. I've got years with the software and I understand its limits.
Spreadsheets have one inherent advantage over Mathcad. It is FAR EASIER to make hundreds of parallel calculations. That is the main use of Excel. Lots of calculations over a wide array, easy to see the results, but the guts of the operations are not so easy to ferret out. Excel has added a lot of functionality in recent years including adding more math functions and GoalSeek which added a missing dimension to engineering calculations with spreadsheets. I miss having the Visual Basic editor to get my own solver routines to work in the background. Recent changes mistakenly called improvements have honked this up.
If Mathcad is offering their cut-down trial software for free in perpetuity, take it. Free tools are free tools. Mathcad is neat, but they need to work out their interface (clearer directions, tutorials) to have a chance against the raw simplicity of a spreadsheet. Mathcad has a host of weaknesses in that its not easy to find out why something is not working when you try to set up the formula. Mathcad has a long road ahead if it wants to challenge spreadsheets.
I fully agree that a lot of calculations entered into spreadsheets contain mistakes. There are many points raised by the previous comments that don't clarify things too much. From my view:
1) 22 spreadsheets is a small sample. In only one of the studies cited at the link given in the article are more than 30 samples done (note--there are two "interviews" cited which note approximately 36/year samples audited; however, the references are anecdotal. The point is only one study had a large number of samples and in that sample the error rate reported was much lower (11%).
2) The sampling and measurement methodology are considerably different across the studies so the aggregate statistics are useless.
3) I agree with the comment that spreadsheets are like software. Internal software is frequently not rigorously validated either, whether it is for engineering or business purposes.
4) The article does not consider errors in input to "validated" code. In my egineering experience, frequently there are errors made in setting up complex problems to use commercial code. Just becuase the software is "validated" does not mean it can validate all inputs. Garbage in, garbage out still applies.
5) The comment on errors in the Excel software are referring to a completely different issue; although serious, they are out of place relative to this article.
6) Again, agreeing in principle that errors in spreadsheets are commonplace, and sometimes serious, I still must point out that the type of calculations carried out by many codes sold by PTC are far beyond what is done in spreadsheets. Thus, this is not a realistic comparison.
7) Spreadsheet programs are as much a tool as a pencil and paper, a calcualtor, slide rule, abacus, or anything else. They are very useful in skilled hands. As I was told once, "a poor craftsman blames his tools".
8) The comment about checking test cases is right on. Most problems put into spreadsheets have easily defined limiting cases (certain values go to zero or one, etc.) and test cases should be run. The beauty of spreadsheets is that you can store the test case data and outputs with the code for posterity, even in a sheet in a workbook. This is a very convenient way to maintain validation. For comparison (and fun) ask a software vendor to show you their validation plan, their test cases, their results, all tied to code version. Go grab some coffee as it is then going to be a while. The point of this is that if you don't get clear test cases from a vendor, and verify you are running the same code, then you should run test cases through commercial engineering software to ensure it does what you think it does, or you are no better off than you were with spreadsheets.
I've been bitten by flawed spreadsheets more often than by other analytical software tools, but what makes me particularly suspicious of spreadsheet based calculations is that they don't do unit checking and that, due to not making calculations visible by default, they can be easily 'broken'. Additionally MS doesn't always follow the standards about handling extremely large/small numbers and NaNs.
The analytical process I've developed is three tiered approach:
For basic calculations that involve developing a simple correlation I default to using spreadsheets. I limit these analyses to simple plotting, low order curve fitting and basic statistics. I make a point of limiting the amount of complexity in any given calculation, show every intermediate result and make liberal use of named fields so the equations look closer to what I'd write on paper.
More sophisticated analysis that can be well approximated by closed form solutions are conducted with tools such as MathCAD.
The balance of calculations I generally conduct in Matlab or Octave. These tools don't provide units checking and may not be as readable as something like MathCAD. To address this I've adopted a lot of techniques used in software coding standards:
Variable names are constructed to communicate units, whether they are constants, their scope in the context of the analysis, etc.
Calculations are formatted to maximize readability.
Every step in the process is ether made obvious by the preceding bullets or is liberally commented.
My error rate has dropped precipitously since adopting this process.
The first Tacoma Narrows Bridge was a Washington State suspension bridge that opened in 1940 and spanned the Tacoma Narrows strait of Puget Sound between Tacoma and the Kitsap Peninsula. It opened to traffic on July 1, 1940, and dramatically collapsed into Puget Sound on November 7, just four months after it opened.
Noting that we now live in an era of “confusion and ill-conceived stuff,” Ammunition design studio founder Robert Brunner, speaking at Gigaom Roadmap, said that by adding connectivity to everything and its mother, we aren't necessarily doing ourselves any favors, with many ‘things’ just fine in their unconnected state.
Focus on Fundamentals consists of 45-minute on-line classes that cover a host of technologies. You learn without leaving the comfort of your desk. All classes are taught by subject-matter experts and all are archived. So if you can't attend live, attend at your convenience.