lognormal fitting

I have tried the email list, but I do not know what happened and it is not working for me, so I have come back to the forum.

The problem: I have an XY data set that looks like a nice lognormal distribution. However, when I fit, I get this ridiculously high chi-square value, which does not mean anything because I do not have a weighting wave, nor will I have one.

The question, which is a general one for fitting: how can I evaluate goodness of fit without a weighting wave? Is there something like a Kolmogorov test that I can do (or anything else)? How? I mean, the XY data set comes from two different waves, and cannot be compared to the fitted function, which is just a single wave. If I try to interpolate the XY set to a single wave, I get meaningless waves. What am I doing wrong?

cheers

P
pjfd wrote:
The problem: I have an XY data set that looks like a nice lognormal distribution. However, when I fit, I get this ridiculously high chi-square value, which does not mean anything because I do not have a weighting wave, nor will I have one.
In the absence of a weighting wave, the value of chi-square reported by Igor is simply the sum of the squared residuals. So if you have lots of points, or if the residuals are large, you will get a large chi-square.
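To make that concrete, here is a minimal sketch in Python (not Igor code) of the unweighted case. The function name and the numbers are made up for illustration; the only point is that the value is a plain sum of squared residuals, so it grows with both the number of points and the scale of the data.

```python
def chi_square_unweighted(y_data, y_model):
    """Sum of squared residuals: what chi-square reduces to without weights."""
    residuals = [yd - ym for yd, ym in zip(y_data, y_model)]
    return sum(r * r for r in residuals)

# Hypothetical data and model values; every residual is +0.5 or -0.5.
y_data = [1.0, 2.5, 4.0, 2.0]
y_model = [1.5, 2.0, 3.5, 2.5]
chisq = chi_square_unweighted(y_data, y_model)

# Same fit quality, but with the Y units scaled by 10:
# the residuals scale by 10, so chi-square scales by 100.
chisq_scaled = chi_square_unweighted(
    [10 * y for y in y_data], [10 * y for y in y_model]
)
print(chisq, chisq_scaled)
```

So a "ridiculously high" value may only reflect the Y units or the number of points, not a bad fit.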
Quote:
The question, which is a general one for fitting: how can I evaluate goodness of fit without a weighting wave? Is there something like a Kolmogorov test that I can do (or anything else)? How?
Well, according to Numerical Recipes, if you don't have a weighting wave (which gives the expected distribution of residuals) you can't really assess goodness of fit, because you can't tell if the residuals are unexpectedly large (which would happen when you fit a model that doesn't represent the underlying data). Igor reports the estimated errors for the fit coefficients; is that useful to you? You can also compute a reduced chi-square, V_chisq/(n-m) where n is the number of points in the data set, and m is the number of fit coefficients. That will at least give you an idea of the size of the residuals.
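As a rough illustration of the reduced chi-square arithmetic, here is a short Python sketch (not Igor code). The values for V_chisq, n, and m are invented for the example, and the helper name is hypothetical. For an unweighted fit, the square root of the result is an estimate of the typical residual size, in the same units as the Y data.

```python
import math

def reduced_chi_square(chisq, n_points, n_coefs):
    """chi-square divided by the degrees of freedom, n - m."""
    dof = n_points - n_coefs
    if dof <= 0:
        raise ValueError("need more data points than fit coefficients")
    return chisq / dof

# Hypothetical numbers: 1000 points, 4 fit coefficients, V_chisq = 498.
v_chisq = 498.0
n = 1000
m = 4
red = reduced_chi_square(v_chisq, n, m)  # 498 / 996

# For an unweighted fit, this estimates the standard deviation of the residuals.
rms_residual = math.sqrt(red)
print(red, rms_residual)
```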
Quote:
I mean, the XY data set comes from two different waves, and cannot be compared to the fitted function, which is just a single wave.
You can get a wave with a model value for each point in the input data. Using the Curve Fit dialog, select the Output Options tab. Select _New Wave_ in the Destination menu. Fill in a name for the wave. The command generated will put model values into each destination point corresponding to the X value for that point.
Quote:
If I try to interpolate the XY set to a single wave, I get meaningless waves. What am I doing wrong?
That's hard to say without knowing more about your data. But if the X values are not sorted, interpolate will give pretty weird results. Also, if you have NaN's (Not a Number, or a blank cell) in your data, you will get local patches within the interpolated data that are NaN's. I believe the cubic spline interpolator (Interpolate... at the bottom of the Analysis menu) will fail completely with NaN's under some circumstances.
John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Quote:

But if the X values are not sorted, interpolate will give pretty weird results. Also, if you have NaN's (Not a Number, or a blank cell) in your data, you will get local patches within the interpolated data that are NaN's.


The Interpolate2 operation (Analysis->Interpolate) automatically sorts input data and removes NaNs before doing the interpolation.
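For anyone who wants to see what "sort and remove NaNs first" amounts to, here is a small Python sketch. It is not how Interpolate2 is implemented; the helper names are hypothetical and the interpolation is plain piecewise-linear, just to show why cleaning the XY pairs first gives sensible results.

```python
import bisect
import math

def clean_xy(x, y):
    """Drop pairs where either value is NaN, then sort the pairs by x."""
    pairs = [(xi, yi) for xi, yi in zip(x, y)
             if not (math.isnan(xi) or math.isnan(yi))]
    pairs.sort(key=lambda p: p[0])
    return [p[0] for p in pairs], [p[1] for p in pairs]

def interp_linear(xs, ys, x_new):
    """Piecewise-linear interpolation on sorted xs; clamps outside the range."""
    if x_new <= xs[0]:
        return ys[0]
    if x_new >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x_new)  # xs[i-1] <= x_new < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x_new - x0) / (x1 - x0)

# Hypothetical unsorted data with one NaN in x.
x = [3.0, 1.0, float("nan"), 2.0]
y = [30.0, 10.0, 99.0, 20.0]
xs, ys = clean_xy(x, y)
print(xs, ys)
print(interp_linear(xs, ys, 1.5), interp_linear(xs, ys, 2.5))
```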

Quote:

I believe the cubic spline interpolator (Interpolate... at the bottom of the Analysis menu) will fail completely with NaN's under some circumstances.


Only in the "X Coords From Dest Wave" mode when the destination is an XY pair and the X destination wave contains NaNs. This is very rare. The next release of Interpolate2 will tolerate the NaNs in this situation.
johnweeks wrote:
You can get a wave with a model value for each point in the input data. Using the Curve Fit dialog, select the Output Options tab. Select _New Wave_ in the Destination menu. Fill in a name for the wave. The command generated will put model values into each destination point corresponding to the X value for that point.


Then plot the Y wave of model values against the Y wave of data values, which is called a Q-Q plot (for Quantile-Quantile). The closer this line is to a straight line with slope=1, the better the fit. That gives you a visual assessment of fit. A Kolmogorov-Smirnov test (StatsKSTest) of the two waves (model values and data values) will assess this statistically.
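To show what a two-sample Kolmogorov-Smirnov comparison measures, here is a minimal Python sketch of the statistic itself: the largest gap between the two empirical cumulative distributions. This is only an illustration with made-up numbers and a hypothetical function name. It does not reproduce StatsKSTest's options or output, and it does not compute a critical value or p-value.

```python
import bisect

def ks_statistic(a, b):
    """Largest absolute difference between the empirical CDFs of a and b."""
    a_sorted = sorted(a)
    b_sorted = sorted(b)
    d = 0.0
    # The gap between two step functions can only change at observed values.
    for v in a_sorted + b_sorted:
        fa = bisect.bisect_right(a_sorted, v) / len(a_sorted)
        fb = bisect.bisect_right(b_sorted, v) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d

# Hypothetical data values and two candidate sets of model values.
data_values = [1.0, 2.0, 3.0, 4.0]
good_model = [1.0, 2.0, 3.0, 4.0]
shifted_model = [3.0, 4.0, 5.0, 6.0]

d_good = ks_statistic(data_values, good_model)
d_shifted = ks_statistic(data_values, shifted_model)
print(d_good, d_shifted)
```

A statistic near 0 means the two sets of values are distributed alike; values approaching 1 mean they barely overlap.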
Hi guys,
Thanks for the help. The new wave and the Kolmogorov-Smirnov test solved the problem. This forum is very helpful.
I am still troubled by the weighting function. How can I get one? Actually, I do not even know what the residuals are.
The reduced chisqr looks OK, but how do I get the degrees of freedom? It is not going to be N-2 if my data set is over 1000 points.
P
pjfd wrote:
I am still troubled by the weighting function. How can I get one?
A weighting wave contains one point for each Y data point. It holds the standard deviation of the estimated measurement errors for each point. These are often the same for every point. You might get this information by doing multiple samples for each point and computing the average (Y data) and standard deviation (weight) of the multiple samples. You might know it from a knowledge of the measurement process, and there are other ways to get it. Or you might not have that information.
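As a sketch of the repeated-measurement approach, the Python below (not Igor code) builds the Y data and the weights from made-up repeat measurements, then uses the weights in the usual weighted chi-square, the sum of ((y - model) / sigma)^2. All names and numbers are hypothetical.

```python
import math
import statistics

# Hypothetical repeated measurements: one inner list per X position.
repeats = [
    [9.0, 10.0, 11.0],
    [18.0, 20.0, 22.0],
]

# Y data is the mean of each group; the weight is its sample standard deviation.
means = [statistics.mean(group) for group in repeats]
sigmas = [statistics.stdev(group) for group in repeats]

# Hypothetical model values at the same X positions.
model = [11.0, 18.0]

# Weighted chi-square: each residual is measured in units of its own sigma.
chisq_weighted = sum(
    ((y - f) / s) ** 2 for y, f, s in zip(means, model, sigmas)
)
print(means, sigmas, chisq_weighted)
```

With weights like these, each term is of order 1 when the model is adequate, which is what makes the weighted chi-square interpretable.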
Quote:
Actually, I do not even know what the residuals are.
Residuals are the difference between the fitted model and the Y data. It's what's left over that wasn't accounted for by the model.
Quote:

The reduced chisqr looks OK, but how do I get the degrees of freedom? It is not going to be N-2 if my data set is over 1000 points.
No, it will be N-4 because the built-in LogNormal fitting function has four coefficients.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com