Algorithms - Beer Lambert Law, Classical Least Squares (K-Matrix)

Algorithms

The Beer Lambert Law - Classical Least Squares (CLS)

This spectroscopic quantitation method, also known as K-Matrix, uses the Beer-Lambert Law to extend the calculation of the absorptivity coefficients across a much larger portion of the spectrum than the much simpler Least Squares Regression method. Referring back to the description of Beer’s Law, notice that it defines a relationship between four variables: the spectral response ($A$), the constituent absorptivity constant ($\varepsilon$), the pathlength of light ($b$), and the constituent concentration ($C$). The goal of calibrating a spectroscopic quantitative method is to solve for the absorptivity constants. If the pathlength of the samples is also kept constant (as it is for most quantitative experiments), Beer’s Law can be rewritten as:

$$A = K C$$

where the absorptivity coefficient and pathlength are combined into a single constant, $K$. This equation can be easily solved by measuring the absorbance of a single sample of known concentration and using these values to solve for $K$. Predicting the concentration of an unknown sample is as simple as measuring the absorbance at the same wavelength and then applying the following:

$$C = \frac{A}{K}$$

However, basing an entire calibration on a single sample is generally not a good idea. Due to limitations of noise, instrument error, sample handling error, and many other possible variations, it is best to measure the absorbances of a series of different concentrations and calculate the slope of the best fit line through all the data points. Just as in the case of Least Squares Regression, this can be solved with a simple regression line of absorbance versus concentration.
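The single-constituent calibration described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the concentrations and absorbances are made-up, noiseless values chosen so that $K = 0.5$, and the helper name `predict_concentration` is hypothetical.

```python
import numpy as np

# Hypothetical calibration standards (made-up, noiseless values):
# known concentrations and absorbances measured at a single wavelength.
conc = np.array([1.0, 2.0, 3.0, 4.0])
absorbance = 0.5 * conc  # behaves like A = K * C with K = 0.5

# Best-fit straight line of absorbance versus concentration.
# np.polyfit returns coefficients highest power first: [slope, intercept].
slope, intercept = np.polyfit(conc, absorbance, 1)


def predict_concentration(a_unknown):
    """Invert the fitted line to estimate concentration from absorbance."""
    return (a_unknown - intercept) / slope


# Absorbance of an "unknown" sample measured at the same wavelength.
c_unknown = predict_concentration(1.25)
print(slope, intercept, c_unknown)
```

The fitted slope plays the role of $K$. With real, noisy standards the intercept will generally not be exactly zero, which is why the prediction subtracts it before dividing by the slope.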

However, the problem becomes more complex if the sample contains two constituents. In any algebraic solution, it is necessary to have as many equations as unknowns. In this case, it is necessary to set up two equations:

$$A_{\lambda_1} = K_{A,\lambda_1} C_A$$

$$A_{\lambda_2} = K_{B,\lambda_2} C_B$$

where $A_{\lambda_1}$ and $A_{\lambda_2}$ are the absorbances at two different wavelengths, $C_A$ and $C_B$ are the concentrations of the two constituents ("A" and "B") in the mixtures, and $K_{A,\lambda_1}$ and $K_{B,\lambda_2}$ are the absorptivity constants for the two constituents at those wavelengths. Again, it is possible to solve each equation independently provided that the spectrum of one constituent does not interfere with the spectrum of the other (i.e., the bands are well resolved).

 

Figure: Hypothetical spectra of two different pure constituents A and B and a mixture of the two. Since the constituent bands in the spectra do not overlap, the selected wavelengths could be used to solve separate equations for both A and B.

Unfortunately, the equations above assume that the absorbance at wavelength 1 is entirely due to constituent A and the absorbance at wavelength 2 is entirely due to constituent B. As with the Least Squares Regression model, this requires finding two wavelengths in the training set of spectra that exclusively represent constituents A and B. With complex mixtures, or even simple mixtures of very similar materials, this is a difficult if not impossible task.

 

Figure: Hypothetical spectra of two alternative pure constituents, A and B, and a mixture of the two. In this case, the bands of the constituent spectra overlap, and the equations must be solved simultaneously for both A and B.

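However, it is possible to get around this by taking advantage of another part of Beer's Law: the absorbances of multiple constituents at the same wavelength are additive. Thus, the two constituent equations for a single spectrum should really be:

$$A_{\lambda_1} = K_{A,\lambda_1} C_A + K_{B,\lambda_1} C_B$$

$$A_{\lambda_2} = K_{A,\lambda_2} C_A + K_{B,\lambda_2} C_B$$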

 



All the equations presented so far assume that the calculated least squares line(s) fit the calibration samples perfectly. In other words, it has been assumed that there is no error in the measurements or in the predictive ability of the model when the equations are used to predict unknowns. Once again, this never happens in the real world; there is always some amount of error. The existence of error is what requires running more than two samples in the first place.

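It is necessary to amend the equations one more time to add a variable to compensate for the errors in the calculation of the absorbance:

$$A_{\lambda_1} = K_{A,\lambda_1} C_A + K_{B,\lambda_1} C_B + E_{\lambda_1}$$

$$A_{\lambda_2} = K_{A,\lambda_2} C_A + K_{B,\lambda_2} C_B + E_{\lambda_2}$$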

where $E_{\lambda_1}$ and $E_{\lambda_2}$ are the residual errors between the least squares fit line and the actual absorbances. When performing Least Squares Regression, the "offset" coefficient (a.k.a. intercept, bias) performs the same function. In these terms, the E values can be thought of as the calibration offset or bias. The residuals will always be zero when fitting only two points (i.e., only two calibration mixture samples), because a straight line passes exactly through any two points. However, as with most calibration models, Classical Least Squares usually requires many more training samples to build an accurate calibration. As long as at least as many wavelengths are used as there are constituents, it is possible to calibrate for all constituents simultaneously.
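The remark about two-point fits can be checked numerically. The sketch below uses made-up absorbance values and a hypothetical helper, `line_residuals`. It fits a straight line to two calibration points and then to four, and compares the residuals.

```python
import numpy as np


def line_residuals(conc, absorbance):
    """Fit absorbance = slope * conc + intercept and return the residuals."""
    slope, intercept = np.polyfit(conc, absorbance, 1)
    return absorbance - (slope * conc + intercept)


# Two calibration points: the fitted line passes through both of them,
# so the residuals vanish (up to floating-point rounding).
res_two = line_residuals(np.array([1.0, 2.0]),
                         np.array([0.52, 0.98]))

# Four calibration points with some scatter (made-up values):
# no single line passes through all of them, so residuals remain.
res_four = line_residuals(np.array([1.0, 2.0, 3.0, 4.0]),
                          np.array([0.52, 0.98, 1.55, 1.97]))

print(res_two)
print(res_four)
```

The zero residuals in the two-point case do not mean the calibration is accurate. They only mean there is no information left over to estimate the error, which is the reason for running more samples.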

The next problem is how to solve all these equations. If you have ever tried to solve simultaneous equations by hand, you know this is a very tedious process. If more than two constituents are present or more than two wavelengths are used, it gets even harder. A particularly efficient way of solving simultaneous equations is to use linear algebra, also known as matrix mathematics. This technique still requires many calculations, but the rules are straightforward and are perfectly suited for computers. In matrix terms, the previous equation can be formulated as:

$$\begin{bmatrix} A_{\lambda_1} \\ A_{\lambda_2} \end{bmatrix} = \begin{bmatrix} K_{A,\lambda_1} & K_{B,\lambda_1} \\ K_{A,\lambda_2} & K_{B,\lambda_2} \end{bmatrix} \begin{bmatrix} C_A \\ C_B \end{bmatrix} + \begin{bmatrix} E_{\lambda_1} \\ E_{\lambda_2} \end{bmatrix}$$

or, more simply:

$$A = K C + E$$


In this case, A represents a (2 x 1) matrix of absorbances at the two selected wavelengths, K is a (2 x 2) matrix of the absorptivity constants, C is a (2 x 1) matrix of the concentrations of the two constituents, and E is the (2 x 1) matrix of absorbance error, or offset.
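As a minimal sketch of the two-wavelength, two-constituent case, the following Python/NumPy snippet solves the matrix equation for the concentrations of one mixture. The K values are made-up numbers assumed to be already known from calibration, and the error term E is ignored.

```python
import numpy as np

# Hypothetical absorptivity constants (made-up numbers, pathlength folded in).
# Rows are wavelengths 1 and 2; columns are constituents A and B.
# The off-diagonal entries are non-zero, i.e. the bands overlap.
K = np.array([[0.80, 0.30],
              [0.20, 0.90]])

# Absorbances of one mixture at the two wavelengths (a 2 x 1 vector).
A = np.array([0.55, 0.55])

# Solve K @ C = A simultaneously for C = [C_A, C_B].
C = np.linalg.solve(K, A)

# For comparison: treating each wavelength as if it belonged to only one
# constituent (ignoring the overlap) overestimates both concentrations.
C_naive = np.array([A[0] / K[0, 0], A[1] / K[1, 1]])

print(C)        # simultaneous solution
print(C_naive)  # independent, one-wavelength-per-constituent estimates
```

With these numbers the simultaneous solution is 0.5 for both constituents, while the independent estimates come out higher because each wavelength also picks up absorbance from the other constituent.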

This model can be extended to perform calculations using many more wavelengths than just two. As long as the number of wavelengths used for the model is at least as large as the number of constituents in the mixtures, any number of wavelengths can be used. In fact, it is not unusual to use the entire spectrum when calibrating Classical Least Squares models. In this case, with one sample spectrum per row, the matrices look like:

$$A_{(n \times p)} = C_{(n \times m)} \, K_{(m \times p)}$$


where A is a matrix of spectral absorbances, K is the matrix of absorptivity constants and C is the matrix of constituent concentrations. (The E matrix is not shown for space reasons, but has the same dimensionality as the A matrix.) The subscripts indicate the dimensionality of the matrix; n is the number of samples (spectra), p is the number of data points (wavelengths) used for calibration, and m is the number of constituents in the sample mixtures.

Using matrix algebra, it is trivial for a computer to solve these equations and produce the K matrix in the above equation (the matrix of absorptivity coefficients). Just by the nature of matrix algebra, the solution gives the best fit least squares line(s) to the data. Once the equation is solved for the K matrix, it can be used to predict concentrations of unknown samples.

For those familiar with linear algebra, solving for the K matrix requires computing the matrix equation:

$$K = C^{-1} A$$

where $C^{-1}$ is the inverse of the constituent concentration matrix. Unfortunately, computing the inverse of a matrix requires that the matrix be square (having the same number of rows and columns). Unless the calibration set has exactly the same number of samples as constituents, this will not be true (remember, more samples are usually used to get the best representation of the true calibration equation).

This does not mean that the above equation cannot be solved. An alternative to computing the true inverse of the C matrix is to compute its "pseudo-inverse", as follows:

$$K = (C^{T} C)^{-1} C^{T} A$$

where $C^{T}$ is the matrix transpose (pivot the matrix so that the rows become the columns) of the constituent concentrations matrix.
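The pseudo-inverse calibration can be sketched as follows in Python with NumPy. Everything numeric here is an assumption made for illustration: the two pure-constituent "spectra" are synthetic Gaussian bands, the calibration concentrations are made up, and a small amount of random noise is added. The prediction step is not spelled out in the text above; the sketch uses the standard least squares inversion of the row form $a = cK$, namely $c = a K^{T} (K K^{T})^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic pure-constituent spectra (assumed Gaussian bands that overlap).
p = 50                                   # number of wavelengths
x = np.linspace(0.0, 1.0, p)
K_true = np.vstack([np.exp(-((x - 0.35) / 0.10) ** 2),   # constituent A
                    np.exp(-((x - 0.55) / 0.12) ** 2)])  # constituent B

# Calibration set: n = 6 mixtures with known (made-up) concentrations.
C_cal = np.array([[0.2, 0.8],
                  [0.4, 0.6],
                  [0.6, 0.4],
                  [0.8, 0.2],
                  [0.5, 0.5],
                  [1.0, 0.3]])
A_cal = C_cal @ K_true + rng.normal(0.0, 1e-4, size=(6, p))  # n x p spectra

# Calibration: K = (C^T C)^{-1} C^T A, the pseudo-inverse solution.
K_hat = np.linalg.inv(C_cal.T @ C_cal) @ C_cal.T @ A_cal

# The same least squares solution via NumPy's solver, which avoids
# forming the explicit inverse and is usually preferred numerically.
K_lstsq, *_ = np.linalg.lstsq(C_cal, A_cal, rcond=None)

# Prediction for an "unknown" spectrum a (1 x p): c = a K^T (K K^T)^{-1}.
c_true = np.array([0.40, 0.70])
a_unknown = c_true @ K_true
c_pred = a_unknown @ K_hat.T @ np.linalg.inv(K_hat @ K_hat.T)

print(c_pred)
```

Each row of `K_hat` has one value per wavelength, so it can be plotted like a spectrum of the corresponding constituent.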

This method has the advantage of being able to use large regions of the spectrum, or even the entire spectrum, for calibration to gain an averaging effect for the predictive accuracy of the final model. One interesting side effect is that if the entire spectrum is used for calibration, the rows of the K matrix are spectra of the absorptivities for each of the constituents. These will look very similar to the pure constituent spectra.

However, this technique does have one major disadvantage: the equations must be calibrated for every constituent in the mixtures. Otherwise, the ignored constituents will interfere with the analysis and give incorrect results. This means that the complete composition of every calibration sample must be known, and that predicted "unknowns" must be mixtures of exactly the same constituents.

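This limitation of the CLS model can be more easily understood by taking a closer look at the model equation, written out for a single wavelength $\lambda$ and $m$ constituents:

$$A_{\lambda} = \sum_{i=1}^{m} K_{i,\lambda} C_i + E_{\lambda}$$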

Notice that the absorbance at a particular wavelength is calculated from the sum of all the constituent concentrations multiplied by their absorptivity coefficients. If the concentration of any constituent in the sample is omitted, the predicted absorbance will be incorrect. This means that the CLS technique can only be applied to systems where the concentration of every constituent in the sample is known. If the mixture is complex, or there is the possibility of contaminants in the "unknown" samples that were not present in the calibration mixtures, then the model will not be able to predict the constituent concentrations accurately.
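The effect of an unmodeled constituent can be illustrated with a small simulation. The sketch below is built entirely on assumptions: synthetic Gaussian bands stand in for the spectra of constituents A and B and for a hypothetical contaminant, the concentrations are made up, and the data are noiseless so that any prediction error comes only from the contaminant.

```python
import numpy as np

p = 50
x = np.linspace(0.0, 1.0, p)


def band(center, width):
    """Synthetic Gaussian absorption band sampled on the grid x."""
    return np.exp(-((x - center) / width) ** 2)


K_ab = np.vstack([band(0.35, 0.10), band(0.55, 0.12)])  # calibrated A and B
k_x = band(0.45, 0.10)                                   # contaminant, never calibrated

# Calibrate on mixtures that contain only A and B (noiseless).
C_cal = np.array([[0.2, 0.8], [0.4, 0.6], [0.6, 0.4],
                  [0.8, 0.2], [0.5, 0.5], [1.0, 0.3]])
A_cal = C_cal @ K_ab
K_hat, *_ = np.linalg.lstsq(C_cal, A_cal, rcond=None)


def predict(a):
    """Least squares solution of a = c @ K_hat for the concentrations c."""
    c, *_ = np.linalg.lstsq(K_hat.T, a, rcond=None)
    return c


c_true = np.array([0.40, 0.70])
a_clean = c_true @ K_ab                 # unknown containing only A and B
a_contaminated = a_clean + 0.30 * k_x   # same unknown plus the contaminant

c_clean = predict(a_clean)
c_contaminated = predict(a_contaminated)

print(c_clean)         # recovers the true concentrations
print(c_contaminated)  # biased: contaminant absorbance is assigned to A and B
```

Because the model has no term for the contaminant, its absorbance is distributed between the calibrated constituents, and both predicted concentrations come out too high in this example.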

CLS Advantages
· Based on Beer's Law.
· Calculations are relatively fast.
· Can be used for moderately complex mixtures.
· Calibrations do not necessarily require wavelength selection. As long as the number of wavelengths is at least as large as the number of constituents, any number (up to the entire spectrum) can be used.
· Using a large number of wavelengths tends to give an averaging effect to the solution, making it less susceptible to noise in the spectra.

CLS Disadvantages
· Requires knowing the complete composition (concentration of every constituent) of the calibration mixtures.
· Not useful for mixtures with constituents that interact.
· Very susceptible to baseline effects since equations assume the response at a wavelength is due entirely to the calibrated constituents.

 
