Algorithms - Discriminant Analysis, Preprocessing

Many of the preprocessing techniques used for spectral quantitative analysis can also be used for discriminant analysis. For more information, refer to the Chemometrics Preprocessing Techniques section.

Another difficult task is deciding which spectral region(s) to include in the discriminant analysis. There are no proven guidelines for choosing the best regions and, for the most part, entire spectra work well. Remember that discriminant analysis is effectively a spectrum matching technique: any information that is included in the training set spectra will be considered an "allowed" variation for unknowns.

There are, however, some general rules on regions to exclude. Any region that will not be consistent in future spectra should be removed from the data before performing the model calculations. Typical regions to exclude are regions below the detector/optical cutoff point, very strongly absorbing regions where non-linearity in the spectrometer response might occur (e.g., above 2 Absorbance), regions of strong interference (e.g., water vapor in the mid-IR), and regions where the signal-to-noise ratio is poor.
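The exclusion rules above can be sketched as a boolean mask over the spectral axis. This is only an illustration under stated assumptions: the function name exclusion_mask, its parameters, and the default limit of 2 Absorbance are hypothetical, and the cutoff and interference ranges must be supplied by the user for their own instrument.

```python
import numpy as np

def exclusion_mask(wavenumbers, absorbance, cutoff=None,
                   max_absorbance=2.0, excluded_ranges=()):
    """Return a boolean mask that is True for spectral points to keep.

    wavenumbers     : 1-D array of x-axis values (e.g., cm^-1)
    absorbance      : 2-D array, one training spectrum per row
    cutoff          : drop points below this value (detector/optical cutoff)
    max_absorbance  : drop points where any spectrum exceeds this value
    excluded_ranges : (low, high) pairs to drop, e.g. interference regions
    """
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    absorbance = np.atleast_2d(np.asarray(absorbance, dtype=float))
    keep = np.ones(wavenumbers.shape, dtype=bool)

    # Region below the detector/optical cutoff point.
    if cutoff is not None:
        keep &= wavenumbers >= cutoff

    # Strongly absorbing points where the response may be non-linear.
    keep &= absorbance.max(axis=0) <= max_absorbance

    # User-supplied interference or low signal-to-noise regions.
    for low, high in excluded_ranges:
        keep &= ~((wavenumbers >= low) & (wavenumbers <= high))

    return keep
```

The same mask would then be applied to both the training set and every future unknown (for example, absorbance[:, keep]) so that the model and the unknowns cover identical regions.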

 

Figure 1. Training set spectra (top) from Figure 1 of the Mahalanobis section with the Standard Deviation spectrum (bottom). The large bands in the standard deviation indicate that most of the variation in the spectra is taking place in the three major bands at 1470 cm⁻¹, 1380 cm⁻¹ and 725 cm⁻¹.

 

Some techniques can aid in determining the regions where groups of spectra are most "different" from each other, that is, in locating the portions of the spectrum where the sample varies the most from run to run. One method is to calculate a Standard Deviation Spectrum: subtract the average spectrum from every spectrum in the training set, and then calculate the standard deviation at every wavelength. Regions that have large positive peaks are regions where the spectra are varying, and they will contribute the most to the PCA factors for the group.
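A minimal sketch of the Standard Deviation Spectrum calculation described above, using NumPy. The function name is hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the text does not say which divisor is used. At least two training spectra are required.

```python
import numpy as np

def standard_deviation_spectrum(spectra):
    """Return (mean_spectrum, sd_spectrum) for a training set.

    spectra : 2-D array, one spectrum per row, one wavelength per column.
    """
    spectra = np.asarray(spectra, dtype=float)
    n_spectra = spectra.shape[0]

    # Subtract the average spectrum from every spectrum in the training set.
    mean_spectrum = spectra.mean(axis=0)
    centered = spectra - mean_spectrum

    # Standard deviation at every wavelength (assumed divisor: n - 1).
    sd_spectrum = np.sqrt((centered ** 2).sum(axis=0) / (n_spectra - 1))
    return mean_spectrum, sd_spectrum
```

Points where sd_spectrum is large mark the bands that vary most across the training set; np.argsort(sd_spectrum)[::-1] would list them from most to least variable.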

Remember that the purpose of using the Mahalanobis distance discrimination method is to calculate a model matrix that gives an allowed range of variation in the data. Choosing regions of little or no variation can result in a Mahalanobis group that is too restrictive and will not find any matching data outside the training set. However, the Standard Deviation Spectrum is merely an indicator of changes occurring between the spectra. As with selected wavelength models, a PCA-Mahalanobis model that includes only small portions of the spectrum is likely to miss important features (such as impurities) that can lead to misclassification.
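To make the idea of a "model matrix" concrete, the following is a rough sketch of one common way to compute a Mahalanobis distance on PCA scores. It is not necessarily the exact calculation used by the software described here: the function names, the use of an SVD for the PCA step, and the n - 1 covariance divisor are all assumptions made for illustration.

```python
import numpy as np

def fit_pca_mahalanobis(spectra, n_factors):
    """Build a simple PCA-Mahalanobis model from training spectra (rows)."""
    X = np.asarray(spectra, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean

    # PCA via singular value decomposition of the mean-centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_factors]          # one factor per row
    scores = centered @ loadings.T     # training scores

    # Covariance of the training scores: the allowed range of variation.
    cov = np.atleast_2d(np.cov(scores, rowvar=False))
    inv_cov = np.linalg.inv(cov)
    return mean, loadings, inv_cov

def mahalanobis_distance(spectrum, model):
    """Distance of one spectrum from the training group, in score space."""
    mean, loadings, inv_cov = model
    t = (np.asarray(spectrum, dtype=float) - mean) @ loadings.T
    return float(np.sqrt(t @ inv_cov @ t))
```

Because the distance is scaled by the inverse of the score covariance, a training set with very little variation in the selected regions produces a small covariance, and even modest differences in an unknown then yield a large distance. This is the "too restrictive" behavior described above.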
