Most engineers know how to create a moving average. Just slide a "window" along your raw data and average the exposed values to create a new value for the central data point. That new value drops into a separate set of averaged data. Mathematically, you calculate a new value for central point x0 by averaging x0 together with a set number of values, n, to the right (x1, x2...xn) and left (x-1, x-2...x-n) of point x0.
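As a minimal sketch of that procedure, the Python function below averages the central point with n neighbors on each side and writes the result into a separate list. The function name and the choice to copy the end points through unchanged are illustrative assumptions; other edge conventions work equally well.

```python
def moving_average(data, n):
    """Centered moving average using n points on each side of every sample.

    Samples closer than n points to either end lack a full window, so they
    are copied through unchanged (an assumed edge convention).
    """
    window = 2 * n + 1
    smoothed = list(data)  # separate output list; the raw data is not modified
    for i in range(n, len(data) - n):
        smoothed[i] = sum(data[i - n:i + n + 1]) / window
    return smoothed


raw = [1.0, 2.0, 6.0, 2.0, 1.0, 1.0]
print(moving_average(raw, 1))  # 3-point window: the central point plus 1 on each side
```

Every value in the window carries the same weight, 1/(2n + 1), which is what makes this a plain average.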
But, if you acquire signals that include sharp peaks and valleys, this type of averaging can distort information, as shown in the accompanying graph. The upper graph shows Gaussian white noise added to progressively narrower peak-and-valley data. The middle graph illustrates the effect of applying a moving average that processed 33 data points: the central point and 16 points on each side. The result shows noise reduction, but at the expense of distorting the narrower peaks and valleys. Researchers and engineers usually want information about a peak's position, height and width from this type of data. (This technique and the one that follows assume samples taken at uniform time intervals.)
In a seminal paper, published in 1964, Abraham Savitzky and Marcel Golay explained how to use a similar function, but with coefficients that preserve the underlying data. (A moving average uses a coefficient of 1 for each value.) They applied a least-squares polynomial fit to the n data points on either side of the center point, x0, and then used the polynomial coefficients to calculate an averaged value for that point. But this approach required a polynomial fit for each point in a set of data, a time-consuming task in '64. So, Savitzky and Golay derived sets of coefficients that operate within a moving window and thus filter (smooth) the data with remarkably little effect on the underlying information. Table 1 provides "averaging" coefficients for three small window lengths.
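To see where such coefficients come from, the sketch below fits a least-squares polynomial over a window of 2n + 1 equally spaced points and extracts the weights that give the fitted value at the center. It is an illustration under stated assumptions, not the authors' original derivation: the function name is hypothetical, the normal equations are solved with exact fractions, and the polynomial order is left as a parameter.

```python
from fractions import Fraction


def savgol_coefficients(n, order):
    """Smoothing weights for a (2n + 1)-point window and a polynomial of the given order.

    The fitted value at the window center equals a0, the constant term of the
    least-squares polynomial, so the weights are the first row of
    (A^T A)^-1 A^T, where A[j][k] = j**k for offsets j = -n..n.
    """
    offsets = range(-n, n + 1)
    size = order + 1
    # Normal-equation matrix A^T A, augmented with the unit vector e0.
    rows = [[Fraction(sum(j ** (r + c) for j in offsets)) for c in range(size)]
            + [Fraction(1 if r == 0 else 0)] for r in range(size)]
    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(size):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    w = [rows[r][size] for r in range(size)]  # solution of (A^T A) w = e0
    return [sum(w[k] * j ** k for k in range(size)) for j in offsets]


weights = savgol_coefficients(2, 2)  # 5-point window, second-order fit
print([str(c) for c in weights])
print([round(float(c), 4) for c in weights])
```

For a 5-point window and a second-order fit, the weights work out to -3/35, 12/35, 17/35, 12/35 and -3/35, which round to the -0.0857, 0.3429 and 0.4857 values listed in Table 1.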
The bottom graph shows the results of applying a Savitzky-Golay (S-G) filter to the raw data in the top graph. In this case, an algorithm applied fourth-order coefficients to the data within a 33-point window. (Different sets of S-G coefficients will produce a first or second derivative.) So, if you must process data with noisy peaks, keep an S-G filter in mind. You can find this filter in software such as Mathematica, Origin, the Signal Processing Toolbox for MATLAB, and the MATRIX Advanced Filter Toolkit for LabVIEW.
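A small experiment can illustrate the difference. The sketch below runs a 5-point moving average and the 5-point S-G weights from Table 1 over a narrow, noise-free synthetic peak of height 1.0 and compares the peak heights that survive. The test signal, the helper name, and the copy-through treatment of end points are assumptions made for illustration; this is not the 33-point, fourth-order case shown in the graph.

```python
import math


def apply_window(data, weights):
    """Apply a centered weighted window; end points without a full window are copied unchanged."""
    n = len(weights) // 2
    out = list(data)
    for i in range(n, len(data) - n):
        out[i] = sum(w * data[i + k - n] for k, w in enumerate(weights))
    return out


# Synthetic narrow Gaussian peak of height 1.0 centered at sample 20.
peak = [math.exp(-0.5 * ((i - 20) / 1.5) ** 2) for i in range(41)]

average_weights = [1 / 5] * 5
# 5-point column of Table 1 written as exact ratios (-3/35 rounds to -0.0857, and so on).
sg_weights = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]

averaged_height = max(apply_window(peak, average_weights))
sg_height = max(apply_window(peak, sg_weights))
print(f"moving average peak height: {averaged_height:.3f}")
print(f"Savitzky-Golay peak height: {sg_height:.3f}")
```

On this test peak, the S-G result stays much closer to the true height of 1.0 than the moving average does, because the negative outer weights offset the flattening that equal weights cause.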
Later researchers improved upon and corrected some of the original S-G coefficients, so to better understand the math involved or to obtain coefficients, refer to the original papers. I'll post these references in the Electronics/Test Forum.
Three graphs show (top) raw data, (middle) results of a moving average, and (bottom) raw data processed with a Savitzky-Golay filter. Dotted lines indicate the original data values. Courtesy of Saul A. Teukolsky, from Computers in Physics, Nov/Dec 1990.
Table 1: Typical Savitzky-Golay Coefficients

| Points (n) | Filter Coefficients (5-point window) | Filter Coefficients (7-point window) | Filter Coefficients (9-point window) |
|---|---|---|---|
| 4 | | | -0.0909 |
| 3 | | -0.0952 | 0.0606 |
| 2 | -0.0857 | 0.1429 | 0.1688 |
| 1 | 0.3429 | 0.2857 | 0.2338 |
| 0 | 0.4857 | 0.3333 | 0.2554 |
| -1 | 0.3429 | 0.2857 | 0.2338 |
| -2 | -0.0857 | 0.1429 | 0.1688 |
| -3 | | -0.0952 | 0.0606 |
| -4 | | | -0.0909 |
Join discussions of this and other columns at the Electronics/Test Forum at: http://rbi.ims.ca/4924-539.