Fitted area - uncertainty seems too high

Hi,

I posted a while ago on MultiPeakFitting 2.0's error report and have now come back to working with it.

I'm fitting two lognormal curves to bimodal data. You'll see that the left-hand distribution was fitted only on its right-hand side because of tailing; I only care about the area of the right-hand peak. The fit works out wonderfully: the right-hand peak is clearly visible, and MPF 2.0 does a great job on it...

However, the reported area for the right-hand peak is 200 ± 9.06e+05. This just doesn't seem reasonable to me. By inspection, I would expect something like 200 ± 100 for an uncertainty representative of the noise.

How might one objectively address this fit to get an estimate of the peak area?

Best,
j
It's hard to say without trying it myself. One possible issue could be floating-point truncation. Try scaling your Y values to make the right peak height similar in magnitude to the width. You should be able to re-scale the area and uncertainty in the area simply by multiplying by your scaling factor.
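
A minimal sketch of the rescaling idea, written in Python rather than Igor. The peak function, the grid, and the trapezoid-rule area are illustrative assumptions and are not Multipeak Fit's own definitions. The point is only that area is linear in Y, so an area (and its uncertainty) obtained on scaled data maps back through the inverse of the factor that was applied.

```python
import math

def lognormal_peak(x, height, position, width):
    """Hypothetical log-normal peak shape; not Multipeak Fit's exact definition."""
    return height * math.exp(-(math.log(x / position) / width) ** 2)

def trapezoid_area(xs, ys):
    """Numerical area under sampled data by the trapezoid rule."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

# Illustrative data: x runs up to 5e-3 and the peak height is about 5e6.
xs = [1e-5 + i * (5e-3 - 1e-5) / 500 for i in range(501)]
ys = [lognormal_peak(x, 5e6, 2e-3, 0.3) for x in xs]

scale = 1e-9                                  # brings the height down to about 5e-3
ys_scaled = [y * scale for y in ys]

area_scaled = trapezoid_area(xs, ys_scaled)   # stands in for the area a fit would report
area_original = area_scaled / scale           # undo the scaling
# An uncertainty reported on the scaled data is undone the same way:
#   sigma_original = sigma_scaled / scale

print(area_scaled, area_original, trapezoid_area(xs, ys))
```

The relative uncertainty (sigma divided by area) is unchanged by the mapping, so any improvement seen after rescaling comes from the numerics of the fit and not from the conversion itself.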

Another slight possibility is that the tail of the left peak is somehow contributing to uncertainty. Try fitting the entire left peak (which I see is actually recorded, just not fitted) and see what happens.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Hi John,

Thanks! Both of your comments were right on target!

The peak truncation I was doing (which is because the LHS and RHS of my peak are asymmetric, so the actual fit follows the data much more closely after truncation) increased the relative error from 0.08% to 58%.

But even more important was the magnitude of my y-values! Since my x-axis is max 5e-03, I scaled the y-axis to a maximum of 5e-03 and got 0.08% relative error! With 5e+06 as posted (partial 1st-peak fit), I had 993,788% error.

At 5e+00 -> 0.35%, at 5e+03 -> 5.09%, at 5e+06 -> 328.889%. Going the other way, I got 5e-06 -> 0.09% but 5e-09 -> 0.07%. Now my axis maxima are as far apart as they started, but it seems OK if y-max < x-max? Or just <<1?

This is pretty strange to me and I am puzzled about what to take from this. Should my y-values in a fit always be <= my x-values? Does floating-point truncation only affect large and not small numbers? Is this just part of the calculations MPF 2 is doing or is this a general lesson for Igor fits?

edit: I've made a plot of this behaviour:
uncertaintyRSD.png
In general, you should worry if the numbers involved in a fit differ by orders of magnitude. Strictly speaking, the problem arises when you add or, especially, subtract floating-point numbers of greatly differing magnitude. A double-precision number has about 16 digits of precision. So if you add, for instance, 1234567.89123456 and 0.000000123456789123456, only the first two or three digits of the small number have any significance, and a slightly smaller number might not change the value of the larger number at all!

That is an extreme example, but smaller losses in precision can be important in curve fitting as the truncation errors build up during repeated iterations.

These sorts of problems are greatly increased when exponentials are involved.
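
The addition example above can be checked directly. This is a small Python sketch (not Igor code) that adds the two numbers, subtracts the large one again, and compares what comes back with what went in. `math.ulp` needs Python 3.9 or later.

```python
import math

big = 1234567.89123456
small = 0.000000123456789123456

total = big + small
recovered = total - big            # what survives of `small` after the addition
rel_error = abs(recovered - small) / small

print(f"spacing of doubles near big: {math.ulp(big):.3e}")
print(f"small going in : {small:.15e}")
print(f"small recovered: {recovered:.15e}")
print(f"relative error : {rel_error:.1e}")

# A term smaller than half the spacing disappears entirely.
tiny = 1e-11
print(big + tiny == big)
```

Comparing the two printed values of the small number shows where the digits stop agreeing. The last line shows the limiting case, in which the sum is bit-for-bit identical to the large number.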

It's possible that the code for Multipeak Fit 2 should be doing the scaling for you. I have it in mind to investigate that possibility at some unspecified point in the future :)

The same sorts of considerations apply if you have a fit that involves fit coefficients that are different by orders of magnitude. That can be pretty difficult to deal with by scaling, though, if the fit function is nonlinear.
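
One plausible way to see why the scale matters, offered as a sketch and not as a description of what Multipeak Fit does internally: least-squares uncertainties come from inverting a matrix built from the derivatives of the fit function with respect to each coefficient. The Python snippet below assumes a single Gaussian (not the log-normal shape from the thread) with illustrative center and width values, and computes the condition number of that 2x2 matrix for the height and width coefficients at several peak heights.

```python
import math

def gram_condition(height, center=2.5e-3, width=5e-4, n=201, x_max=5e-3):
    """Condition number of J^T J for a Gaussian's (height, width) derivatives."""
    a = b = c = 0.0
    for i in range(n):
        x = x_max * i / (n - 1)
        u = (x - center) / width
        g = math.exp(-u * u)
        d_height = g                                  # df/d(height)
        d_width = height * g * 2.0 * u * u / width    # df/d(width)
        a += d_height * d_height
        b += d_height * d_width
        c += d_width * d_width
    # Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]].
    lam_max = 0.5 * (a + c) + math.hypot(0.5 * (a - c), b)
    lam_min = (a * c - b * b) / lam_max
    return lam_max / lam_min

for height in (5e-3, 5e0, 5e3, 5e6):
    print(f"height {height:8.0e}: cond(J^T J) ~ {gram_condition(height):.2e}")
```

In this toy setup the condition number grows with the square of the height. Once it approaches the reciprocal of double-precision epsilon (roughly 1e16), the matrix is numerically singular and the derived uncertainties stop being meaningful.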

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Maybe an easy tweak to MPF 2.0 is to warn the user when the values differ greatly, until you have time to actually work in a normalization?
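
As a rough illustration of what such a warning might look like, here is a Python sketch. The function names and the three-decade threshold are hypothetical and are not part of Multipeak Fit or Igor.

```python
import math

def magnitude_gap(x_values, y_values):
    """Orders of magnitude between the largest |y| and the largest |x|."""
    x_max = max(abs(v) for v in x_values)
    y_max = max(abs(v) for v in y_values)
    if x_max == 0 or y_max == 0:
        return math.inf
    return abs(math.log10(y_max / x_max))

def scale_warning(x_values, y_values, max_gap=3.0):
    """Return a warning string, or None when the scales are comparable.

    max_gap (in decades) is an arbitrary illustrative threshold.
    """
    gap = magnitude_gap(x_values, y_values)
    if gap > max_gap:
        return (f"x and y differ by about {gap:.0f} orders of magnitude; "
                "consider rescaling before fitting")
    return None

print(scale_warning([0.0, 5e-3], [0.0, 5e6]))    # large gap: warning text
print(scale_warning([0.0, 5e-3], [0.0, 5e-3]))   # comparable scales: None
```

The check is symmetric, so it also flags y-values much smaller than the x-values. The numbers reported earlier in this thread suggest that only the large-y case hurt the uncertainty, so a real implementation might choose to be less conservative.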

Thanks for the help, this has been very enlightening! Had I been working near the "moderately bad" region of this effect, I may never have realized.

j
Just a follow up for others' benefit.

I have found that fitting peaks in Igor 6.2 with very high y-wave to x-wave ratios (e.g. y being 10^20 larger) yields the error

"Multi-peak Fit failed:

singular matrix or other numeric error"

This is consistent with John's answer above, but if you had not run into this before you might not think of this, especially since a normal CurveFit does still work.

The solution is to upgrade to Igor 6.3 (tested for 32-bit Windows version).

edit: this applies to the fitting *failure*, NOT to the original issue of the fit area uncertainty being too high -- that's still there.
I realize this topic may be a bit dead, but I want to express some frustration with Multipeak Fit 2 as well. I am attempting to fit x-ray diffraction data that has to be viewed on a log scale to see the nuances (easily done in Igor by changing the graph properties), but every attempt to fit this type of data with Multipeak Fit is failing for me. I'm now wondering how commercial packages like Jade are able to accomplish this, or whether there's a way around the problem in Multipeak Fit 2. I need to fit a Voigt peak to the main substrate diffraction peak as well as smaller peaks due to the film, which are generally Gaussian in nature. I am going to try a piecewise approach and see how I get on, but I think that this problem (and my lack of Igor knowledge) will prevent me from using Igor for this type of fitting. Manual peak positioning is a very nice feature, and very useful when you suspect you have one or two major peaks due to the material being analyzed.

Wishing everyone well, and I wish I could handle these files in multipeak.
-Allen
Hi John!

Well, this is a bit crazy for data possibly, as the range is so high- I'm hoping to learn a bit about what you suggest from this! I have other friends in the field who are also intrigued by the multipeak2 fit routines- they're quite nice! [assuming they can be gotten to work with a range of data- already seen it work with one set that isn't the same as this]

I'm attaching my pxp. Please be aware that I'm no expert here: I have two tables with a column each, I pull an x and a y into the multipeak routine, then work from there. The data is imported from HDF5 and originates in MATLAB from my work on the Philips XRDML files. I'm a lot more comfortable in MATLAB (Igor for MATLAB dummies anywhere? ;) ), so I tend to do my work there, except this multipeak fit was too good to pass up a trial in Igor. Knowing more about Igor may spill over into AFM analysis etc. as well, so I don't mind learning a little bit here, but I also have to get my paper writing done. ;)

Thanks so much for taking a moment to check out the data. I'm a bit concerned that rescaling the data will make the coefficients of fit etc., difficult to use- as the relationship with changes in x or y may be non-trivial. Working on the log of the data also doesn't appear to be simple, as the peak types would have to change drastically to be useful back in real-world intensity values. [i.e., the log of a gaussian, log of Voigt etc.]

Thanks for your time and thoughts!!
-Allen

ps- I noticed that there may not have been a way to scrub identifying info in the .pxp file before upload- interesting file- a mix of binary and xml/plist type stuff. Only thing there is hard-drive name etc., as far as I can tell. :)
ah3216recslit004.pxp
Allen-

I see one big peak in your data. Multipeak Fit 2 fits that one peak pretty well with a Voigt peak shape. You say "as well as smaller peaks due to film which are generally gaussian in nature". Please tell me where they are.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Oh sorry, John! I don't believe the data is proprietary, thanks for asking, though! I'm mostly just concerned with info that could be mined by robots on the web etc. :D No worries about either on this one, actually, I should have asked if there was a method for posting properly.

Thank you!
-Allen
Hi John! If you plot the fit with a log axis on y, then you will see the other peaks. I fear this is the difficulty no one talks about with XRD files- it's frequent that this type of spread would be seen in materials in xray diffraction. (wide range of strong and weak peaks)

-A

Ah, yes- I see them now. And you're right- to give sufficient weight to the small peaks relative to the really big one would require something like a log transformation. And you are further correct in thinking that Multipeak Fit isn't set up to do such a thing.
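
To make the weighting argument concrete, here is a Python sketch on synthetic data. The peak heights, positions, and the flat background are invented for illustration and are not taken from the attached file. It compares two deliberately wrong models: one that omits the small peak, and one that gets the big peak's height wrong by 1%. Each is scored by the sum of squared residuals on a linear scale and on a log scale.

```python
import math

def gauss(x, center):
    return math.exp(-(x - center) ** 2)

xs = [-10.0 + 0.05 * i for i in range(401)]
# Synthetic pattern: background 1, big peak 1e6 at x=0, small peak 1e2 at x=5.
data = [1.0 + 1e6 * gauss(x, 0.0) + 1e2 * gauss(x, 5.0) for x in xs]

# Model A leaves the small peak out entirely.
model_a = [1.0 + 1e6 * gauss(x, 0.0) for x in xs]
# Model B has both peaks but gets the big peak's height wrong by 1%.
model_b = [1.0 + 1.01e6 * gauss(x, 0.0) + 1e2 * gauss(x, 5.0) for x in xs]

def ssr_linear(model):
    """Sum of squared residuals on the raw intensity scale."""
    return sum((d - m) ** 2 for d, m in zip(data, model))

def ssr_log(model):
    """Sum of squared residuals after taking log10 of data and model."""
    return sum((math.log10(d) - math.log10(m)) ** 2 for d, m in zip(data, model))

ratio_linear = ssr_linear(model_a) / ssr_linear(model_b)
ratio_log = ssr_log(model_a) / ssr_log(model_b)
print(f"missing small peak vs 1% big-peak error, linear scale: {ratio_linear:.1e}")
print(f"missing small peak vs 1% big-peak error, log scale   : {ratio_log:.1e}")
```

On the linear scale, dropping the small peak costs far less than a 1% error in the big peak, so an unweighted fit has little incentive to get the small peak right. On the log scale the ordering reverses.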

It's too bad- I'll have to think about how to accommodate it. Do you know what other packages do?

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Unfortunately, John, I don't know of any other packages that do at the present time. I highly suspect that any software written specifically for x-ray powder diffraction analysis has to do something similar (my brain is telling me "powderx" for Windows 95 is one of them), and I think there's a MATLAB suite that may address this, mtex or something, but I haven't dug very far yet. It's my first foray into doing this type of fitting manually, without dedicated software. The industry standard for powder diffraction is JADE. It's a few thousand or so. [I have access to it, and will most likely revert to it due to time constraints.] It does seem to be a tricky thing to figure out, unfortunately.

I really appreciate you taking the time to comment on this as I initially thought this would be easy until I dug a bit deeper.

My very best to you, and love the program BTW! I will likely be using it for other data in the near future! Thank you for your kindness and time!
-Allen

I have not seen the file or data but have two suggestions.

* Fit the big peak. Subtract the fit from the raw data. Does this reset the scale so you can see the smaller peaks now? If so, fit the smaller peaks.

* Fit the big peak. Over the whole range of the big peak, set the values to NaN. Does this reset the scale so you can see the smaller peaks now? If so, fit the smaller peaks.
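
Both suggestions can be sketched in a few lines of Python on synthetic data. The peak parameters, the assumption that the big-peak fit is already available, and the choice of masking plus or minus four widths are all illustrative.

```python
import math

def gauss(x, center, width):
    return math.exp(-((x - center) / width) ** 2)

xs = [-10.0 + 0.05 * i for i in range(401)]
data = [1.0 + 1e6 * gauss(x, 0.0, 1.0) + 1e2 * gauss(x, 5.0, 1.0) for x in xs]

# Stand-in for the fitted big peak (assumed already obtained from a fit).
big_center, big_width, big_height = 0.0, 1.0, 1e6
big_fit = [big_height * gauss(x, big_center, big_width) for x in xs]

# Option 1: subtract the big-peak fit from the raw data.
residual = [d - f for d, f in zip(data, big_fit)]

# Option 2: blank the big peak's range with NaN (4 widths either side, arbitrary).
masked = [math.nan if abs(x - big_center) < 4.0 * big_width else d
          for x, d in zip(xs, data)]

visible = [v for v in masked if not math.isnan(v)]
print(max(data), max(residual), max(visible))
```

Here the stand-in fit matches the big peak exactly, so the subtraction leaves a clean residual. With a real fit, some leftover structure near the big peak is to be expected, and the masking option avoids that at the cost of discarding any small peaks hidden under the big one.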

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville