User defined fit function does not find dependencies

Trying to fit breakpoints is very hard. By "breakpoints" I mean your b and x1 coefficients that try to find the appropriate places to "break" up or down.

The problem is with derivatives- the fit depends on partial derivatives of chi-square with respect to each of the coefficients. Mathematically, you function has delta function derivatives for b and x1. The curve fit approximates the derivatives numerically by perturbing the coefficient value a bit and seeing how much difference that perturbation makes to the chi-square value. The fit uses the gradient in chi-square to figure out where to go next in trying solutions.

When I say "a bit", that means by default 1e-10 times a coefficient value if the coefficient wave is double precision, or 1e-6 for single precision (never use a single-precision coefficient wave). So you see, if you have an x1-b/2 that falls in the middle between data points, and perturb that by 1e-6, it makes no change because the values only change as the breakpoint crosses a data point. A zero derivative results in a singular matrix error and the message you report in the history saying that the fit doesn't depend on b and x1; in fact, in the range seen by the fit, it does not depend on those coefficients.

The solution, which sometimes works for cases like this, is to provide and "epsilon" wave that specifies the differencing perturbation. I used this:

make/D/O eps={1e-6, 2, 1e-2, 2}

I also tried it with 1 instead of 2, and that worked as well. Here is the history report:

•FuncFit square W_coef ydata /X=xdata /D /E=eps 
  FitProgressDialog allocating a dialogFitFunction instance
  Fit converged properly
  fit_ydata= square(W_coef,x)
  W_coef={660.81,1.5502,1.9964e+05,99.976}
  V_chisq= 2.95678e+11;V_npnts= 65;V_numNaNs= 0;V_numINFs= 0;
  V_startRow= 0;V_endRow= 64;
  W_sigma={1.13e+04,0.13,2.72e+04,0.109}
  Coefficient values ± one standard deviation
    y0  =660.81 ± 1.13e+04
    b   =1.5502 ± 0.13
    A   =1.9964e+05 ± 2.72e+04
    x1  =99.976 ± 0.109

The value of y0+A is well below the peak. I think that's because it is balancing the skirts of your square wave against the top, and it finds a level about halfway between.

My guess is that your data is some sort of response function to a square-wave input. Is that right? If so, the appropriate model may be a square wave convolved with some response function, perhaps a delayed Gaussian would work, or some sort of double-exponential peak with a fast rise and slow fall. Such a function would model the tails of your data. Writing the fitting function to do that is not trivial. I wonder if the wonderful folks on this forum have done something you might borrow.

Log in or register to post comments

April 16, 2020 at 12:44 pm - Permalink

johnweeks

By the way, here's your experiment file with my work in it.

Attachments fitfunction_exampleJW.pxp (21.79 KB)

Log in or register to post comments

April 16, 2020 at 12:50 pm - Permalink

jjweimer

Depending on the need, I might break this into two curve fits. First, find the maximum in the data. Then, fit the rising portion (data increasing to maximum) with a sigmoidal function. Then, fit the falling portion (data after the maximum) with that same sigmoidal function.

Alternatively, you could do better when you would either "hard code" the values of the baseline step change and the peak position or hold them constant (fit constraints) during the fitting.

Finally, you will likely have an advantage at some point later when you convert your ydata versus xdata approach instead to using a scaled wave. Your xdata wave is monotonic as far as I can tell.

Log in or register to post comments

April 16, 2020 at 04:02 pm - Permalink

KurtB

Alternatively, you could fit to a function that is smooth but approximates to your top-hat function. For example, one could use a super-gaussian function such as this:

Function SuperGauss(w,x) : FitFunc
    Wave w
    Variable x
 
    //CurveFitDialog/ These comments were created by the Curve Fitting dialog. Altering them will
    //CurveFitDialog/ make the function less convenient to work with in the Curve Fitting dialog.
    //CurveFitDialog/ Equation:
    //CurveFitDialog/ f(x) = y0 + A * exp ( - ( ( ( (x - x0) / width) ^ 2) ^ P ) )
    //CurveFitDialog/ End of Equation
    //CurveFitDialog/ Independent Variables 1
    //CurveFitDialog/ x
    //CurveFitDialog/ Coefficients 5
    //CurveFitDialog/ w[0] = y0
    //CurveFitDialog/ w[1] = A
    //CurveFitDialog/ w[2] = x0
    //CurveFitDialog/ w[3] = width
    //CurveFitDialog/ w[3] = P
 
    return w[0] + w[1] * exp ( - ( ( ( (x - w[2]) / w[3]) ^ 2 ) ^ w[4] ) )
End

If you hold the parameter P to 1 you just have a standard gaussian function. Higher values of P make the shape more like the top-hat function. You could allow it to vary or hold it at a large value (e.g. 10 or 50).

I cast the super gaussian function above to be based on the built-in gaussian so one could run a standard gaussian fit to get a good set of starting values for the super gaussian.

Log in or register to post comments

April 17, 2020 at 01:51 am - Permalink

johnweeks

That superGaussian looks pretty good, though it sort of makes it obvious that the data are asymmetric.

One thing we need to know: what is the goal of the fit? What are you trying to learn? And what sort of system is this? What sort of model is appropriate?

If you're trying to estimate the width of the feature, it's possible that FindLevel or FindPeak might be what you want.

Log in or register to post comments

April 17, 2020 at 10:56 am - Permalink

bibesy

Thank you both for your help!

Ok so it doesn`t seem to be a syntax problem, but the fit itself. I got a better understanding of what is going on now.
Yes it is true, it kinda is the response to a square input. But it is not clear what the response function is.

I am basically looking for two parameters:
- The x-position of the middle of the square wave that was the input.
- The area under the curve, or the width of the curve.
I agree this determines which fit to use.
For the x-position, the square fit with epsilons works well, for the area not so much. The flat-top gaussian a) works out b) is a good representation of the data when only looking for area and position. It doesn`t quite get the start and tailing, but the parameters I need are fine. I would prefer to get the tailing right, but it is not so important.

Attachments fitfuncexample2.png (26.33 KB)

Log in or register to post comments

May 5, 2020 at 10:52 am - Permalink

jjweimer

I remain confused.

You say that the curve that you are fitting is a measured RESPONSE to a square-wave input. The starting point of any input to a system must be BEFORE the response curve of the system, not within the envelope of the response curve. You will not get the right answer for the middle position of the square wave that was input when you do not have the starting point of the input at the right location.
You say that you prefer to get the tailing right, but then you say that it is not so important. As it is, your fitting function "prefers" to get the height and width of the RESPONSE curve "right". You will not get the tailing function right when it is convolved as (hidden by) a symmetrical Gaussian function.

Can you explain again exactly what you are trying to do from an analytical or system-analysis perspective? You have a system that is given a square wave input. You measure the output function but (apparently) have no clue about the exact shape of that input square wave. Are you trying to guess-backwards to deconstruct an accurate measure of (square-wave) input wave that gave the measured system response? Or are you trying to model the output function of the system accurately regardless of the exact (and unknown) shape of the square wave input?

Log in or register to post comments

May 5, 2020 at 01:24 pm - Permalink