Constraining the value of a fit function

I am interested in applying a restraint to a fit function which is not an explicit constraint on the values of the fit coefficients. Instead, given some smooth curve ((X,Y) data), I would like to restrain the values of the fit function, f(X), with respect to that data. I think (?) this amounts to an implicit constraint on the fit coefficients since they would be limited to the set of coefficient values such that at each X the f(X) satisfies the constraint with respect to the corresponding Y.

For my specific case I would like to constrain the fit function to not exceed the data. This constraint would be something like: constrain the resulting fit coefficients of f(X) such that for each X, f(X) < Y. The purpose of this is that my data (heat capacity of a material vs temperature) has different contributions with different functional forms. I know the functional form for one contribution which dominates at high temperature and I would like to fit the data with that function so that I can subtract it (i.e. subtract the resulting fit function evaluated at each temperature) and reveal other (smaller) contributions which dominate at low temperature. Since a negative heat capacity is unphysical, I would like to enforce that the value of the fit function at each temperature is always <= the measured heat capacity (Y value) at that temperature.

It seems like to do this the fit function needs to have access to both the X and Y values of my data so that it can compare f(X) to Y after each cycle to see if the current fit coefficients result in f(X) > Y for any (X,Y) pair in the data. I am not sure how to implement this in a fit function since usually these functions only take "w", the fit coefficient wave, and "x", the value of the independent variable at some point.

Is something like this possible in Igor? 

 

Even though you can look up the data from within your fit function (DisplayHelpTopic "WAVE"), this doesn't sound to me like something you should try to do with curve fitting. Any kind of abrupt penalty for moving into positive residuals will result in a discontinuous, or a least non-smooth, chi-square surface, which is not going to make Levenberg-Marquardt happy.

Maybe take a look at some asymmetric least squares smoothing algorithms for inspiration.

You could try to trick the fitting algorithm by fitting wave0 = 1 with a function that calculates, say, 1 + 10^(f(x)-data(x)). This will discourage, but not necessarily prevent positive residuals. Even if it works, I'm not sure how meaningful such a fit is.

 

 

 

Gotta agree with Tony. If I have understood what you are describing, it is the very definition of statistical bias. Least-squares curve fitting is designed to pass a model through the middle of a data set that is assumed to be model + symmetric errors.

There may be another way to do what you want. Perhaps if you were to post an Igor experiment file containing your data, and a graph with some notations on it pointing out the features that you are trying to pick out, we would be able to help you find some method to analyze it.

I force my model f(x) to range of physically meaningful values routinely.  You can reference any needed global values (waves or variables) from within the fit function and enforce result value by simple if comparison; just keep in mind that if you use regular fitting, this is per point, if you use all-at-once fitting, this is on resulting wave. Fit functions in Igor can be surprisingly complicated and do complex stuff...  They can call any other routines and global stuff, as long as one is careful what changes where and how. 

In my case this seems to work just fine. 

I see two parts to your question.

* You appear to want to put more emphasis on one region of data (the high temperatures) than another during the curve fitting. Perhaps you should consider whether you can set appropriate (objectively designed) weighting factors during the curve fit rather than setting an absolute design limit.

* You do not want a theoretical model to exceed experiment values. This opens a few potentially false premises. First, the validity of this proposed approach presumes that your experimental data are perfectly accurate, meaning that they are God's absolute truth about the value with absolutely no ambiguity. Secondly, the validity of this proposed approach presumes that your experimental data are perfectly precise, meaning not only are they true, but also you were a perfectly true measurement system when you observed them. Alternatively, no one is ever permitted to exceed your experimental values ever, and you are reporting your values with an infinite number of significant digits. Perhaps you should consider that the relative magnitudes of negative deviations between experimental values and your theoretical model are not to be avoided but rather embraced because they indicate a deeper level of confidence that you can report in your model and/or your experimental precision.

 

 

Hi all,

Thank you very much for your insightful comments. 

I understand that there are some legitimate statistical objections to biasing the fit in the way I have described, so let me just reiterate my justification.

I have measured the heat capacity (C(T), where T is temperature) of a material. The heat capacity in this case contains two significant terms: C(T) = C_L(T) + C_M(T). The C_L term is dominant at sufficiently high temperatures and goes monotonically to 0 as T --> 0. C_M is dominant at lower temperatures and becomes insignificant at higher temperatures (it also goes to 0 as T --> 0). My goal is to isolate C_M.

I have a model for the C_L term: C_L(T) = E(T) + D(T), where E(T) and D(T) are the Einstein and Debye models for the heat capacity. I do not have a model for C_M. I seek to extract C_M by doing the subtraction C_M(T) = C(T) - C_L(T). I obtain C_L(T) by fitting C(T) to the aforementioned model in a restricted range at high temperature where I know C_M is ~ 0 and thus C ~ C_L. Since heat capacity is positive definite (by virtue of its relationship to entropy) it is unphysical for my model of C_L to exceed C at any point since that would imply C_M < 0. 

jjweimer makes a good point about assumptions concerning accuracy and precision. If I was purely interested in the values of the parameters from my model for C_L than there would be no reason to restrict the residuals to be positive definite. However, in this case, the model for C_L is understood to be an approximation which in turn yields an approximation of C_M. It seems reasonable in this case to constrain the model for C_L to physical values (as well as constraining the parameters within the model to physical values) even if it is introducing statistical bias. Suffice to say that no one in my field would take my resulting C_M as the absolute truth, but it does give qualitative information and some quantitative information (which is taken with a grain of salt). 

I have attached an experiment file which has the data I have been working on recently and implements my fit for C_L ("debyePlusEinstein)). I tried to modify this function ("debyePlusEinsteinRestrict") to prevent the model from ever exceeding the data in the fit range. I have attached two plots which show the data (black squares), the fit (solid red line), the extrapolation of this fit to low temperature (broken red line) and the result of subtracting the extrapolated fit from the data (green circles) i.e. my estimation of C_M. One plot shows these with the normal fit function and one with the restricted fit function. My attempt at restricting the fit did not appear to work since C_M goes negative at high T in both cases. 

I will note that in this particular case the model for C_L works quite well, and only exceeds the data by a very minor amount (a few tenths of a percent). So, this data is not really a great example. However I would still like to know how to do such a restriction for future analysis of other data. 

Actually, your data are remarkably clean. You may actually have a chance that this comes out well :)

So a curve fit assumes a model something like

f(xi; p) + ei

where f(xi;p) is some ideal model at independent variable values xi for some set of parameters p. The ei are values of measurement error for each data point. Those are presumed to be random values drawn from some probability distribution. When you do a curve fit, you are looking for a set of p that is most likely to be the "true" values of p. To the extent that is so, the fit residuals are an estimate of the ei.

So the residuals have nothing to do with whether or not negative values are physically reasonable. The have only to do with the distribution of measurement errors. (Well, there may also be systematic errors introduced by a model that isn't quite correct...)

Any extrapolation is suspect, but since you are fitting over a large range and extrapolating over a small range, you may get away with it, especially as your residuals are clearly quite small. But I decline to offer tech support for trying to figure out the extrapolation error bars!

Your *model* might be constrained to not go negative, or something like that. But your residuals probably shouldn't be so constrained. What is the source of measurement errors? One source is the presence of your C_M unknown contribution. Over the fit range is it negligible compared to your model values? I see that peak at the far left- I guess that is the bulk of C_M? If C_M is much smaller than measurement errors over the fit range, then it seems safe to ignore it. Then your residuals are almost entirely measurement errors, and your estimate of C_M is "correct" to within measurement and extrapolation errors.

OK, now I've looked at your experiment file...

Zooming in on the residuals (the green dots, right?) it's clear they aren't random. They have a curvy look that is common when something like an exponential curve almost but doesn't quite describe the physics. I also see that the residuals get smaller as the data values get smaller, suggesting that you might want to do a weighted fit. How you estimate weights is another discussion that can be quite lengthy!

Do you have any notion of the shape of C_M? Is it your expectation that your Einstein-Debye model is accurate for the fit range?

I am about at the limit of my ability to say anything about this because I don't know the physics involved.

> However, in this case, the model for C_L is understood to be an approximation which in turn yields an approximation of C_M. It seems reasonable in this case to constrain the model for C_L to physical values (as well as constraining the parameters within the model to physical values) even if it is introducing statistical bias.

Reading the first sentence tells me that C_L is allowed to yield analytical values that are equally probable to be either below or above C(T) at any given T both because C(T) has statistical randomness (measurement uncertainty) and because C_L is not *an exact theory that can never be violated over all portions of the data*. To the latter point, your fit to C_L could just as well be over-emphasizing the higher temperature portion of the data, and thereby causing C(T) - C_L to be biased to be less than zero *simply due to how the approximation involved in the equation is biased as a function of temperature*.

Also, you have the Debye and Einstein temperatures as fitting coefficients. Are they not known for the material or at least a base of the material?

Indeed, if the Debye temperature is truly near 800 K for the material (based on your fitting result), do you have enough temperature range in your heat capacity to trust that you are able to represent the Debye contribution well enough? Or vice-versa and/or for the Einstein model.

In this regard, I must second the question by @johnweeks about the physics. When you are wanting to remove the high temperature deviation, perhaps one approach would be to apply weightings that remove or emphasize the Debye contribution as temperature increases, allowing the Einstein model to dominate more or less appropriately.

In the meantime, I suggest looking up how to use Static Constants in functions to allow you to replace the multiple uses of R = ... in your three functions. A few other cleanups in the code would help make the fitting process run faster and perhaps allow the flexibility to include a weighting factor.