How are errors in coefficients calculated in non-linear curve fitting?

After a curve-fitting process we get optimised values of the coefficients in the form

coefficient = value +- SD

My question is: how is this SD for the coefficients calculated?

Suppose I'm fitting data containing n points with a model containing c coefficients. The wave containing the initial guesses of the coefficients is named coef (say). Say the iterations stop after 35 steps.

I have considered the following two cases:

Case 1. A matrix M stores the coefficients used at each iteration as its columns (say), i.e., after each iteration a new column is appended to M. Thus the dimension of M will be c x 35. Now an SD is calculated for each row M(i,:) and the results are presented as K_i = mean_i +- SD_i, where K_i is the i-th coefficient in the coef wave.

Case 2. A matrix M stores the coefficients obtained for each data point as its columns (say). Thus the dimension of M will be c x n. After each iteration the values in M are overwritten. Now an SD is calculated for each row M(i,:) and the results are presented as K_i = mean_i +- SD_i, where K_i is the i-th coefficient in the coef wave.

Now, if case 1 is true, then a bad initial guess also contributes to the optimised coefficient values (through the SD). I believe this is not the case. Also, in certain cases the number of iterations might be very low, and then the SD would be meaningless.

But if case 2 is true, neither of the above problems arises: at the final iteration we would have a distribution of coefficients best suited to each data point, and hence their SD would be meaningful. Currently I'm taking case 2 to be true.

Please clarify whether or not I'm thinking in the right direction.

N.B. From the L-M algorithm, I know a small shift vector is added to the current coefficient vector after each iteration (possibly, but not necessarily, for each data point) to lower the ssq. So the matrix M described above could also keep track of that shift vector, right? But for simplicity, I assumed M contains the coefficient values.

 

Thanks in advance,

Subhamoy

The parameter uncertainties from LM least squares are estimated by inverting the Hessian matrix to get the covariance matrix, the leading diagonal of which provides the variances of the parameters. The Hessian matrix contains the second-order partial derivatives of chi2 with respect to all the parameters.
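In other words, the reported SDs come from a single covariance matrix evaluated at the final coefficients, not from a population of per-iteration or per-point coefficient values. As a minimal illustration of the idea (in Python/SciPy rather than in the package discussed here, with a made-up exponential model and synthetic data):

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical model and synthetic data, purely for illustration.
def model(x, amplitude, rate):
    return amplitude * np.exp(-rate * x)

rng = np.random.default_rng(0)
xdata = np.linspace(0.0, 5.0, 50)
ydata = model(xdata, 2.0, 1.3) + rng.normal(scale=0.05, size=xdata.size)

# curve_fit runs an LM-style least-squares fit and returns the
# optimised coefficients together with their covariance matrix.
popt, pcov = curve_fit(model, xdata, ydata, p0=[1.0, 1.0])

# The reported "value +- SD" values are the square roots of the
# leading diagonal of the covariance matrix.
perr = np.sqrt(np.diag(pcov))
for value, sd in zip(popt, perr):
    print(f"{value:.4f} +- {sd:.4f}")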


andyfaff wrote:

The parameter uncertainties from LM least squares are estimated by inverting the Hessian matrix to get the covariance matrix, the leading diagonal of which provides the variances of the parameters. The Hessian matrix contains the second-order partial derivatives of chi2 with respect to all the parameters.

Do the values in this Hessian matrix get replaced after each iteration? What are the dimensions of this matrix?

Please read about Levenberg-Marquardt least-squares fitting in Numerical Recipes in C, in the chapter titled "Modeling of Data", or in any other book that covers LM fitting. We find the Numerical Recipes description very approachable.

The Hessian is an NxM matrix, where N is the number of data points and M is the number of fit coefficients that are not held. Each element of the matrix is the partial derivative of the fitting function with respect to a given coefficient, evaluated at the X value corresponding to the row of the matrix.


johnweeks wrote:

Please read about Levenberg-Marquardt least-squares fitting in Numerical Recipes in C, in the chapter titled "Modeling of Data", or in any other book that covers LM fitting. We find the Numerical Recipes description very approachable.

The Hessian is an NxM matrix, where N is the number of data points and M is the number of fit coefficients that are not held. Each element of the matrix is the partial derivative of the fitting function with respect to a given coefficient, evaluated at the X value corresponding to the row of the matrix.

 

Thanks, johnweeks... I got my answer... and yes, I will certainly go through the reference. Thank you too, andyfaff.

> The Hessian is an NxM matrix, where N is the number of data points and M is the number of fit coefficients that are not held. Each element of the matrix is the partial derivative of the fitting function with respect to a given coefficient, evaluated at the X value corresponding to the row of the matrix.

I'm pretty sure the Hessian is square. I thought the Jacobian (J) is N x M, with each element holding the partial derivative of the residual at a given data point with respect to a given parameter. The Hessian is then approximated as J^T J.

Always happy to be corrected though.
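To make the bookkeeping concrete, here is a rough NumPy sketch under those assumptions, with a hypothetical straight-line model and forward-difference derivatives, showing how the N x M Jacobian gives the M x M Hessian approximation and then the parameter SDs:

import numpy as np

def numerical_jacobian(residuals, theta, eps=1e-7):
    # J[i, j] = d r_i / d theta_j: N rows (data points) x M columns (parameters).
    r0 = residuals(theta)
    J = np.empty((r0.size, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        J[:, j] = (residuals(theta + step) - r0) / eps
    return J

# Hypothetical straight-line model y = a*x + b with synthetic data.
x = np.linspace(0.0, 1.0, 20)
y = 3.0 * x + 0.5 + np.random.default_rng(1).normal(scale=0.02, size=x.size)
residuals = lambda theta: y - (theta[0] * x + theta[1])

theta_hat = np.array([3.0, 0.5])             # stand-in for the fitted coefficients
J = numerical_jacobian(residuals, theta_hat) # N x M Jacobian
H = J.T @ J                                  # M x M Hessian approximation
s2 = residuals(theta_hat) @ residuals(theta_hat) / (x.size - theta_hat.size)
cov = s2 * np.linalg.inv(H)                  # covariance matrix of the parameters
sd = np.sqrt(np.diag(cov))                   # the "+- SD" values on the coefficients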

I stand corrected (literally: I have one of those fancy standing desks!)

What I described is the design matrix.

But the intent was still achieved: the Hessian and design matrices are recomputed at every iteration; otherwise there would be no progress in the fit. There are some fancy optimizations (which we don't use unless you choose the ODR fit option) where only pieces of the matrices are updated at each iteration.
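As a sketch of why that is, one textbook-style damped LM step looks roughly like the following (plain NumPy, reusing the numerical_jacobian sketch from earlier in the thread; this is an illustration, not the actual implementation):

import numpy as np

def lm_step(residuals, jacobian, theta, lam):
    # Both J and H depend on the current theta, so they are recomputed
    # from scratch at every iteration.
    r = residuals(theta)
    J = jacobian(residuals, theta)     # N x M design/Jacobian matrix
    H = J.T @ J                        # M x M Hessian approximation
    # Damped normal equations: (H + lam * diag(H)) delta = -J^T r
    A = H + lam * np.diag(np.diag(H))
    delta = np.linalg.solve(A, -(J.T @ r))
    return theta + delta               # candidate coefficients for the next pass

If the trial step lowers the sum of squared residuals, lam is decreased and the step is accepted; otherwise lam is increased and the step is retried from the same theta.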