Algorithms - Peak Fitting by Non-Linear Least Squares


One of the more difficult problems in science is finding an equation that fits a curve made up of more than one functional shape or form. Since there are many solutions to fitting even two equations to one curve, "plug and chug" methods or linear regressions will not work. The solution needs to be the best combination of all the parameters for the two (or more) functions. In general, non-linear curve fitting methods fit a series of individual functions simultaneously in order to obtain the single "best fit" solution. The basic difference between the methods is how this "best fit" (the merit criterion) is calculated and how the fitting variables are adjusted.

The solutions are found by iteratively trying a series of combinations of the parameters until the best one is found (as determined by the merit criterion). Since the solutions are interdependent, small changes in one of the parameters affect the final result of all the others. When using an iterative process, the starting point should be as close to the actual solution as possible. Good "guesstimates" for the starting values increase the probability of finding the "best" solution.

As always, no method is perfect. Since a non-linear problem has a number of answers, the fit will often end up in a "local minimum", which may not be the best possible solution. Many fit routines will then continue to iterate using solutions that are significantly different from the minimum, trying to find another "better" solution. If one is not found, then the minimized solution is considered "best".

Other problems occur if the user input values are significantly far from the "real" answer. This may end in a solution that is stuck in a local minimum that is determined to be the best fit. In extreme cases the fit may even "walk away" to a ridiculous solution. The only way to recover from this problem is to re-enter the starting fit values. Over-fitting is another common problem. Better fits can always be obtained with more input variables (given enough variables you can fit anything). Thus it is important that the input parameters reflect the proper number of variables based on the physical measurement.

Using non-linear methods requires a threshold at which the "fit" is considered "good" (i.e. minimizing the merit equation to some near zero value). In the case of the Levenberg-Marquardt method, the merit equation used is the chi-squared ($\chi^2$) equation. The final solution is found when a minimum in the reduced $\chi^2$ is reached. It is a statistical measure of "goodness-of-fit", inversely proportional to the known variance of the data set. The published tables give the threshold for various degrees of freedom and different significance levels. The tables, however, are based upon the assumption that the error or uncertainty is known exactly, which is usually not the case.
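The merit calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual weighted sum of squared residuals for chi-squared and division by the degrees of freedom (N data points minus M fit parameters) for the reduced value. The function names and the sample data are hypothetical and are not taken from any particular fitting package.

```python
def chi_squared(y_obs, y_fit, sigma):
    """Sum of squared residuals, each weighted by 1 / sigma_i^2."""
    return sum(((yo - yf) / s) ** 2 for yo, yf, s in zip(y_obs, y_fit, sigma))


def reduced_chi_squared(y_obs, y_fit, sigma, n_params):
    """Chi-squared divided by the degrees of freedom (N - M)."""
    dof = len(y_obs) - n_params
    if dof <= 0:
        raise ValueError("need more data points than fit parameters")
    return chi_squared(y_obs, y_fit, sigma) / dof


# Hypothetical data: five points, a two-parameter model, unit uncertainties.
y_obs = [1.1, 1.9, 3.2, 3.9, 5.1]
y_fit = [1.0, 2.0, 3.0, 4.0, 5.0]
sigma = [1.0] * 5

print(chi_squared(y_obs, y_fit, sigma))
print(reduced_chi_squared(y_obs, y_fit, sigma, n_params=2))
```

Because each residual is divided by its uncertainty, an over- or under-estimated sigma shifts the value directly, which is why the published thresholds are only as reliable as the error estimates.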

General Equations

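The general non-linear fit equation can be approximated by the following quadratic form:

$$\chi^2(\mathbf{a}) \approx \gamma - \mathbf{d} \cdot \mathbf{a} + \tfrac{1}{2}\, \mathbf{a} \cdot \mathbf{D} \cdot \mathbf{a}$$

where $\chi^2$ is the merit function, the quantity used to determine the best fit as the set of M unknown fit parameters $\mathbf{a} = (a_1, a_2, \ldots, a_M)$ is varied, and $\gamma$ is a constant. The model $y(x; \mathbf{a})$ on which $\chi^2$ depends is the shape or curve that you are trying to fit. $\mathbf{D}$ is the M×M Hessian matrix of second partial derivatives of the merit function with respect to the fit parameters, and $\mathbf{d}$ is a gradient vector of size M (the steepest-descent direction), built from the first partial derivatives.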


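For poor initial approximations, the method of steepest descent will localize the fit parameters by finding the next parameter values ($\mathbf{a}_{\text{next}}$), using the current fit parameters ($\mathbf{a}_{\text{cur}}$), via the following equation:

$$\mathbf{a}_{\text{next}} = \mathbf{a}_{\text{cur}} - \text{constant} \times \nabla\chi^2(\mathbf{a}_{\text{cur}})$$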
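A steepest-descent update of this kind can be sketched as follows. The two-parameter merit function here is a hypothetical toy surface chosen so the gradient can be written by hand; it is not a chi-squared computed from data, and the step constant of 0.04 is an assumption picked to keep the iteration stable on this particular surface.

```python
def merit(a):
    """Toy merit surface with its minimum at a = (3, -1)."""
    return (a[0] - 3.0) ** 2 + 10.0 * (a[1] + 1.0) ** 2


def gradient(a):
    """Analytic first partial derivatives of the toy merit surface."""
    return [2.0 * (a[0] - 3.0), 20.0 * (a[1] + 1.0)]


def steepest_descent_step(a_cur, constant):
    """a_next = a_cur - constant * gradient(a_cur)."""
    g = gradient(a_cur)
    return [ak - constant * gk for ak, gk in zip(a_cur, g)]


a = [0.0, 0.0]  # deliberately poor starting guess
first = steepest_descent_step(a, constant=0.04)
for _ in range(200):
    a = steepest_descent_step(a, constant=0.04)

print(first, merit(first))
print(a, merit(a))
```

Each step lowers the merit value, but many small steps are needed to reach the minimum, which is why the method suits rough localization rather than the final refinement.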


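If the initial guess is fairly close, however, the Hessian matrix method works better for finding the minimized values ($\mathbf{a}_{\min}$), by using the current fit parameters ($\mathbf{a}_{\text{cur}}$):

$$\mathbf{a}_{\min} = \mathbf{a}_{\text{cur}} + \mathbf{D}^{-1} \cdot \left[-\nabla\chi^2(\mathbf{a}_{\text{cur}})\right]$$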
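The Hessian-based jump can be illustrated on a merit surface that is exactly quadratic, where a single step lands on the minimum from any starting point. The surface, its constant Hessian, and the helper names below are hypothetical choices for illustration. A real chi-squared surface is only approximately quadratic near its minimum, so the step would normally be repeated.

```python
# Toy quadratic merit surface:
#   f(a) = 2*a0^2 + a0*a1 + 3*a1^2 - 4*a0 - 5*a1
HESSIAN = [[4.0, 1.0], [1.0, 6.0]]  # constant second derivatives of f


def gradient(a):
    """Analytic first partial derivatives of the toy surface."""
    return [4.0 * a[0] + a[1] - 4.0, a[0] + 6.0 * a[1] - 5.0]


def solve_2x2(m, rhs):
    """Solve m * x = rhs for a 2x2 matrix using Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    x0 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    x1 = (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det
    return [x0, x1]


def hessian_step(a_cur):
    """a_min = a_cur + D^-1 * (-gradient(a_cur))."""
    g = gradient(a_cur)
    delta = solve_2x2(HESSIAN, [-g[0], -g[1]])
    return [a_cur[0] + delta[0], a_cur[1] + delta[1]]


a_min = hessian_step([10.0, -7.0])
print(a_min, gradient(a_min))
```

Solving the linear system rather than forming the inverse explicitly gives the same step and is the more common choice in practice.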


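Upon carrying out the partial derivatives of the merit equation, $\chi^2$, a factor of 2 is introduced. Two parameters are used to remove the factor of 2, resulting in a vector ($\beta_k$) and matrix ($\alpha_{kl}$) representation of the fit parameters:

$$\beta_k \equiv -\frac{1}{2}\frac{\partial \chi^2}{\partial a_k}, \qquad \alpha_{kl} \equiv \frac{1}{2}\frac{\partial^2 \chi^2}{\partial a_k\, \partial a_l}$$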


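Upon determining the partial derivatives from the gradients, a second derivative term arises, which can destabilize the fitting routine. The contribution of the second derivative term, which tends to cancel itself out when summed over all the data points (N), can be neglected, simplifying the $\alpha_{kl}$ term. Ignoring the term therefore gives the following for the Hessian matrix, when summed over all N data points in the curve:

$$\alpha_{kl} = \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ \frac{\partial y(x_i; \mathbf{a})}{\partial a_k}\, \frac{\partial y(x_i; \mathbf{a})}{\partial a_l} \right]$$

where $y(x_i; \mathbf{a})$ is the fit function evaluated at data point $x_i$ and $\sigma_i$ is the uncertainty of that point.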
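As a sketch of how these sums might be accumulated, the code below builds the matrix (called alpha here) and the gradient vector (called beta) for a single Gaussian peak using only first derivatives of the model. The Gaussian form, the parameter ordering (amplitude, center, width), and all function names are assumptions made for illustration; the text does not prescribe a peak shape.

```python
import math


def gaussian(x, a):
    """Gaussian peak with a = [amplitude, center, width]."""
    amp, center, width = a
    return amp * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def gaussian_derivs(x, a):
    """First partial derivatives of the peak w.r.t. each parameter."""
    amp, center, width = a
    e = math.exp(-((x - center) ** 2) / (2.0 * width ** 2))
    return [
        e,                                        # d y / d amplitude
        amp * e * (x - center) / width ** 2,      # d y / d center
        amp * e * (x - center) ** 2 / width ** 3, # d y / d width
    ]


def alpha_beta(xs, ys, sigmas, a):
    """Accumulate alpha_kl and beta_k over all data points."""
    m = len(a)
    alpha = [[0.0] * m for _ in range(m)]
    beta = [0.0] * m
    for x, y, s in zip(xs, ys, sigmas):
        dyda = gaussian_derivs(x, a)
        resid = y - gaussian(x, a)
        w = 1.0 / (s * s)
        for k in range(m):
            beta[k] += w * resid * dyda[k]
            for l in range(m):
                # Second-derivative term of the model deliberately omitted.
                alpha[k][l] += w * dyda[k] * dyda[l]
    return alpha, beta


# Hypothetical noise-free data generated from known parameters.
true_a = [10.0, 5.0, 1.0]
xs = [0.5 * i for i in range(21)]
ys = [gaussian(x, true_a) for x in xs]
sigmas = [1.0] * len(xs)

alpha_true, beta_true = alpha_beta(xs, ys, sigmas, true_a)
alpha_off, beta_off = alpha_beta(xs, ys, sigmas, [9.0, 5.0, 1.0])
print(beta_true)
print(beta_off)
```

At the parameters that generated the data every residual is zero, so beta vanishes; with the amplitude set too low, the amplitude component of beta is positive, pointing toward a larger amplitude.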



Using the inverse of the Hessian matrix, the step size parameter can be rewritten as a set of linear equations, which can then be solved for the new step size, $\delta a_l$:

$$\sum_{l=1}^{M} \alpha_{kl}\, \delta a_l = \beta_k$$

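The new step, $\delta a_l$, is then added to the current value and tested in the merit equation for "best fit". In the method of steepest descent, the equation for determining the new step size is as follows:

$$\delta a_l = \text{constant} \times \beta_l$$

Because $\beta_l$ is minus one half of the gradient component $\partial\chi^2/\partial a_l$, adding this step to the current value is the same as subtracting a multiple of the gradient from it. The result gives the new parameters for testing the "best fit".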

The final "best fit" solution is found when $\chi^2$ is at a minimum, or when the $\beta_k$ values are 0 at all k values. Note that any changes in $\alpha$ will not affect the final parameter fit values, since its only purpose is to determine the rate (i.e. the step size) at which the minimum is obtained.
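The pieces above can be combined into a small end-to-end sketch. It assumes a single Gaussian peak and the conventional Marquardt damping scheme: the diagonal of alpha is scaled by (1 + lam), with lam divided by 10 after a step that lowers chi-squared and multiplied by 10 after one that does not. That scheme, the starting value of lam, and every name below are assumptions for illustration and are not details given in the text. The stopping test follows the criterion stated above, that every beta value is essentially zero.

```python
import math


def gaussian(x, a):
    amp, center, width = a
    return amp * math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def gaussian_derivs(x, a):
    amp, center, width = a
    e = math.exp(-((x - center) ** 2) / (2.0 * width ** 2))
    return [e, amp * e * (x - center) / width ** 2,
            amp * e * (x - center) ** 2 / width ** 3]


def chi_squared(xs, ys, sigmas, a):
    return sum(((y - gaussian(x, a)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


def alpha_beta(xs, ys, sigmas, a):
    m = len(a)
    alpha = [[0.0] * m for _ in range(m)]
    beta = [0.0] * m
    for x, y, s in zip(xs, ys, sigmas):
        dyda = gaussian_derivs(x, a)
        resid = y - gaussian(x, a)
        w = 1.0 / (s * s)
        for k in range(m):
            beta[k] += w * resid * dyda[k]
            for l in range(m):
                alpha[k][l] += w * dyda[k] * dyda[l]
    return alpha, beta


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit(xs, ys, sigmas, a, lam=1e-3, max_iter=100, tol=1e-9):
    chi2 = chi_squared(xs, ys, sigmas, a)
    for _ in range(max_iter):
        alpha, beta = alpha_beta(xs, ys, sigmas, a)
        if max(abs(b) for b in beta) < tol:
            break  # all beta_k are effectively zero: at a minimum
        damped = [row[:] for row in alpha]
        for k in range(len(a)):
            damped[k][k] *= 1.0 + lam  # changes the step, not the minimum
        delta = solve(damped, beta)
        trial = [ak + dk for ak, dk in zip(a, delta)]
        trial_chi2 = chi_squared(xs, ys, sigmas, trial)
        if trial_chi2 < chi2:
            a, chi2, lam = trial, trial_chi2, lam / 10.0  # accept the step
        else:
            lam *= 10.0  # reject the step and damp harder
    return a, chi2


# Hypothetical noise-free peak and a nearby starting guess.
true_a = [10.0, 5.0, 1.0]
xs = [0.5 * i for i in range(21)]
ys = [gaussian(x, true_a) for x in xs]
sigmas = [1.0] * len(xs)
best, best_chi2 = fit(xs, ys, sigmas, [9.0, 4.8, 1.2])
print(best, best_chi2)
```

Scaling the diagonal of alpha changes only the size and direction of each trial step. The stopping test depends on beta alone, which is consistent with the remark that alterations to alpha do not change where the minimum lies.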

 
