Curve Fit Data Mask Problems

Hi All,
I'm attempting to fit some data to the equation for Blackbody Radiation. The equation is long and drawn out, but I'll paste it here with a link explaining it anyway.

(((2*pi*(299792458^2)*6.62606957E-34)/((x*(1E-9))^5))*(1/(e^(((6.62606957E-34)*299792458)/((x*(1E-9))*(1.3806488E-23)*T))-1)))

(Just for the record, you have no idea how long it took me to get the parentheses right for that equation...)
http://hyperphysics.phy-astr.gsu.edu/hbase/bbrc.html

T is the temperature in kelvin and x is the wavelength in nanometers; the factor 1E-9 in the equation converts x to meters.
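For readability, the same expression can be written in conventional notation. This is only a transcription of the line above; the symbols M_lambda, lambda, h, c and k_B are introduced here and do not appear in the original post.

```latex
M_\lambda(\lambda, T) \;=\; \frac{2\pi h c^{2}}{\lambda^{5}}\,
\frac{1}{\exp\!\left(\dfrac{h c}{\lambda k_B T}\right) - 1},
\qquad \lambda = x \times 10^{-9}\ \mathrm{m}
% h   = 6.62606957e-34 J s   (Planck constant, value used in the post)
% c   = 299792458 m/s        (speed of light)
% k_B = 1.3806488e-23 J/K    (Boltzmann constant)
```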

The x axis of the graph is in nanometers, which is why you see x*1E-9 in the equation. Equivalently, I could have dropped the 1E-9 factor and given the x axis a very small range in meters, but doing that makes the fitting program run... slowly. Either way it works well, and the traces match what's on the hyperphysics page.

Anyway, I've made a procedure that reproduces these graphs using the above equation at any temperature you wish. I'll attach the procedure. It'll show up under the macro menu called "BBRadiation", or you could just type in BBRadiation() to run the function itself, both methods are acceptable. The function is not required for the fitting program to work. The function just makes pretty graphs.
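For anyone who wants to check the curve outside Igor, here is a minimal Python sketch of the same expression. It is not the attached BBRadiation procedure; the function names bb_exitance and peak_wavelength_nm are made up for this example, and the overflow guard is an addition that is not in the original formula.

```python
import math

H = 6.62606957e-34    # Planck constant, J*s (value used in the post)
C = 299792458.0       # speed of light, m/s
K_B = 1.3806488e-23   # Boltzmann constant, J/K


def bb_exitance(x_nm, temp_k):
    """Blackbody spectral exitance for wavelength x_nm (nanometers) and temp_k (kelvin)."""
    lam = x_nm * 1e-9                      # nanometers -> meters
    u = H * C / (lam * K_B * temp_k)       # exponent h*c / (lambda*k*T)
    if u > 700.0:                          # exp(u) would overflow; the value is effectively zero
        return 0.0
    return (2.0 * math.pi * C**2 * H / lam**5) / math.expm1(u)


def peak_wavelength_nm(temp_k, lo=100, hi=20000):
    """Wavelength (on a 1 nm grid) where the curve is largest."""
    return max(range(lo, hi + 1), key=lambda x: bb_exitance(x, temp_k))


# Sanity check against Wien's displacement law (peak near 2.898e6 / T nanometers).
print(peak_wavelength_nm(3000.0))
```

At 940 K the same check puts the peak near 3080 nm, so a window around 550 nm sits far out on the short-wavelength tail of the curve.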

Anyway, I do Raman spectroscopy at high temperatures, and I was trying to fit some of my Raman data with this equation so I can figure out A) what the temperature of the surface actually is and B) the detection limit for my detector.

I use a laser wavelength that overlaps with blackbody radiation at the temperature I'm using, so I see BB radiation in my Raman spectra. If I used the UV laser in the lab, I wouldn't see any BB radiation, and if I used a laser with a longer wavelength, I'd see much more.

Reasons aside, I've attached some data in an Igor file. If you run curve fit, select the "BBRad" fit, then "From Target", then "User cursors", and type in 950, .1, 500 for the guesses, the fit should converge properly.

That's great. Now, if you choose "Select Points for Mask" from the Graph->Packages menu, create polygons around the two peaks between the cursors in the data, type in "NaN" for "inside" and "1" for "outside", and run the fit again using the newly created data mask, it won't work. It'll say "40 iterations without convergence". The result is the same for any variation of what I type for "inside" and "outside" the polygons.
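To make the mask idea concrete, here is a small Python sketch of what a mask wave does to the data, assuming the usual Igor convention that a mask value of zero or NaN drops the point. It simplifies the polygon selection to plain wavelength ranges, and the ranges and function names are hypothetical, chosen only to bracket peaks near 550 nm and 561 nm.

```python
import math


def build_mask(x_values, excluded_ranges):
    """Return 1.0 for points to keep and NaN for points inside any (lo, hi) range."""
    mask = []
    for x in x_values:
        inside = any(lo <= x <= hi for lo, hi in excluded_ranges)
        mask.append(math.nan if inside else 1.0)
    return mask


def apply_mask(x_values, y_values, mask):
    """Keep only the points whose mask entry is neither zero nor NaN."""
    kept = [(x, y) for x, y, m in zip(x_values, y_values, mask)
            if m != 0 and not math.isnan(m)]
    return [x for x, _ in kept], [y for _, y in kept]


# Illustrative data: 540..570 nm in 1 nm steps, with two excluded windows.
xs = list(range(540, 571))
ys = [float(x) for x in xs]                  # placeholder y values
mask = build_mask(xs, [(549, 551), (558, 564)])
xs_fit, ys_fit = apply_mask(xs, ys, mask)
print(len(xs), "points before masking,", len(xs_fit), "after")
```

Masking only removes points. The fit then has less data to constrain the same number of coefficients.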

Am I doing something wrong here? Why would excluding that data make the fit not work? Is there a way to increase the number of iterations? The 40 iterations run instantly, so it wouldn't hurt me at all to give it 100 or so. Then again, the fit may be oscillating between two points and never converge, but I don't know how to check that.
BBRadiation.pxp BBRadiation.ipf
First, you might check out the Planck distribution procedure at this link ...

http://www.igorexchange.com/project/PlanckDistributionDemo

Secondly, did you apply the mask when you did the fit the second time? I had no problems when I did.

Somewhere I have a procedure that fits Planck distributions to emission spectra. I might see if I can find it to share. In my experience, you may want to run the baseline out a bit further to the left, and if you can get more data to the right, you would have some improvement too.

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
Thank you for the link and advice. Unfortunately, our setup does not allow us to acquire any data further to the "right". The link looks promising, I'll have to check it out.

I'm not quite sure what you mean about running the fit after selecting the data mask. I did what I had written above: created the data mask wave, then ran the fit and selected that wave as the mask wave for the fit. I'll go back and try it again, we'll see what happens. (I DID select the mask wave in the curve fit dialogue, and I also physically looked at the mask wave, ensuring it was created correctly.)

EDIT: Regardless of what I do, no matter how small I make the polygons, the fit never converges. I think the wide peak at 561 nm is the problem; it's too wide. If I cut off some of the peak but not all of it, I bet the fit would converge. (It converges just fine if I don't bother with the wide peak and only exclude the sharp peak at 550 nm. Mind you, the results are almost the same as if I hadn't bothered with a data mask at all, only a degree of difference; however, that spectrum was chosen specifically for its lack of peaks.)

When I ran the fit originally, it gave me results that were "correct", as in they matched the data obtained from a thermocouple placed a mm or so away from the surface. In the data above, the results say 939 K (666 C) with a stdev of 15 or so when the thermocouple reported 675 C. The actual temperature falls within the range output from the fit, and honestly I expect the actual temperature to be a bit lower than what the thermocouple reports, IR imaging confirms this. Unfortunately we don't always have access to the IR camera (actually we almost never have access to it).

LATE EDIT: A bit off topic, but is there a way to "save" curve fits so that they always appear in the curve fit drop down box? It's really annoying having to recreate the fit every time I make a new file.
I found the experiment file. It is standardized for the emission spectra that I had at the time. It could likely be modified for your use. Basically, it takes an emission spectrum, gives you the option to cut out regions (using the cursors rather than the polygon tool), fits the result to the Planck distribution, and shows the coefficients + residual wave. A figure that illustrates representative results is attached from the publication “Temperatures from Spectroscopic Studies of Hot Gas and Flame Fronts Observed During Blowback”, J. J. Weimer and I. L. Singer, IEEE Transactions on Plasma Science 39(1 part 1) (January 2011) 174. A figure that shows the fitting window is also attached. Let me know off-line from this forum if you would be interested in the procedure file.

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
reepingk wrote:

I'm not quite sure what you mean about running the fit after selecting the data mask. I did what I had written above, created the data mask wave, then ran the fit and selecting that wave as the mask wave for the fit. I'll go back and try it again, we'll see what happens. (I DID select the mask wave in the curve fit dialogue, and I also physically looked at the mask wave, ensuring it was created correctly.) ...


My result is attached.

reepingk wrote:
LATE EDIT: A bit off topic, but is there a way to "save" curve fits so that they always appear in the curve fit drop down box? It's really annoying having to recreate the fit every time I make a new file.)


I think my procedure file has a Save button (and an Autosave checkbox) for this reason.

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
Ah, your results are very similar to what Igor was giving me when it told me "40 iterations without convergence". Notice the HUGE standard deviation values, as in values greater than the fitted value itself (e.g. 902 K +- 1.66E4). Not exactly good results, whereas when it was converging it'd give me 940 K +- 15. Interesting nonetheless. I'll have to play with it some more. Thank you for your help.

reepingk wrote:
... Notice the HUGE standard deviation values, as in values greater than the data point itself. (Aka 902K +- 1.66E4) Not exactly good results. ...


I think you have two problems. First, you still have small peaks in the baseline curve. Secondly, you have only a small portion of the Planck distribution, and it is just about the least important part in sensitivity to temperature relative to the other two parameters. In my experience, the best fits for temperature were ones where I could cover at least some portion of the curve with its inflection point in going from low to high energy. Also, I always did much better in fits for temperature, even with the best of data, by setting at least one of the two other parameters to be bounded fairly tightly, if not held constant. So, you might do better to fix the values of a or b in your fit equation as applied for your system, and then allow only T to vary. In essence, the convergence is broad around all three parameters, and, to get lower uncertainties, you need either more data or tight constraints on one or two of the parameters based on a priori information.
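As a rough illustration of holding the other two parameters and letting only T vary, here is a Python sketch. The three-parameter form a*Planck(x, T) + b is an assumption; the actual BBRad fit function is not shown in this thread, and the names model and fit_temperature are hypothetical. The data are synthetic and noise-free, generated at 940 K, and the golden-section search stands in for Igor's own fitting algorithm.

```python
import math

H, C, K_B = 6.62606957e-34, 299792458.0, 1.3806488e-23


def bb_exitance(x_nm, temp_k):
    """Blackbody spectral exitance; x_nm in nanometers, temp_k in kelvin."""
    lam = x_nm * 1e-9
    return (2.0 * math.pi * C**2 * H / lam**5) / math.expm1(H * C / (lam * K_B * temp_k))


def model(x_nm, temp_k, a, b):
    """Hypothetical fit form: scale a times the Planck curve plus offset b."""
    return a * bb_exitance(x_nm, temp_k) + b


def fit_temperature(xs, ys, a, b, t_lo=300.0, t_hi=3000.0, tol=1e-6):
    """Golden-section search for the T that minimises the squared error, with a and b held."""
    def sse(t):
        return sum((model(x, t, a, b) - y) ** 2 for x, y in zip(xs, ys))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = t_lo, t_hi
    while hi - lo > tol:
        m1 = hi - inv_phi * (hi - lo)
        m2 = lo + inv_phi * (hi - lo)
        if sse(m1) < sse(m2):
            hi = m2          # minimum lies in [lo, m2]
        else:
            lo = m1          # minimum lies in [m1, hi]
    return 0.5 * (lo + hi)


xs = [540.0 + 0.5 * i for i in range(61)]           # 540..570 nm
ys = [model(x, 940.0, 0.1, 500.0) for x in xs]      # synthetic data at T = 940 K
t_fit = fit_temperature(xs, ys, a=0.1, b=500.0)
print(round(t_fit, 3))
```

With a and b held, the error is a function of T alone and has a single minimum, so even a narrow window pins the temperature down. In Igor the equivalent is to hold coefficients in the Coefficients tab of the Curve Fitting dialog.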

Also, do you know the transmission function of your system, and has it been removed before you fit the curve? Doing a fit to an uncalibrated curve (i.e. one that still has the instrument transmission function in it) can make an enormous difference in the accuracy of the final temperature.

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
When you see 40 iterations without convergence, you can tell Igor to allow more iterations by creating a global variable V_FitMaxIters and setting it to a large number. On the command line:

Variable V_FitMaxIters=1000 // a ridiculously large number

To read more about this and other variables that affect fitting:

DisplayHelpTopic "Special Variables for Curve Fitting"

The need for many iterations, plus large reported errors on the fit result are signs of what regression geeks call "identifiability" problems. It means that the data don't adequately constrain the fit. Here is my standard answer that I send when I get questions about this via tech support e-mail:
---------------------------------------
The large error range combined with needing a large number of iterations for convergence, are strongly suggestive of "identifiability" problems. That is, two or more of the fit coefficients trade off in a way that makes it nearly impossible to solve for the values of both at once. They are correlated in a way that if you adjust one, you can find a value of the other that makes a fit that is nearly as good. Identifiability problems are generally accompanied by large estimated errors on the fit coefficients.

When the correlation is too strong, the fit doesn't know where to go; it will wander around in a coefficient space where a broad range of points all seem about as good. The usual result is apparent convergence but with large estimated values in W_sigma, or a singular matrix error. The error estimates are based on the curvature of the chi-square surface around the solution point. A flat-bottomed chi-square surface, such as results from having many solutions that are nearly as good, will result in large errors.

You can diagnose identifiability problems after a successful fit by looking at the correlation matrix. To learn how to get it from an Igor fit execute this command on the command line:

DisplayHelpTopic "Correlation matrix"

It requires that you use the /M=2 flag with FuncFit. You can set that flag on the Output Options tab of the Curve Fitting dialog by checking the Covariance Matrix checkbox. In the correlation matrix, problems are indicated by off-diagonal values close to 1 or -1. As a rough guide to "close": 0.9 is usually OK but could be problematic, 0.99 is poor, and 0.999 is catastrophic.
---------------------------------------
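The step from covariance matrix to correlation matrix is just a normalisation by the diagonal. Here is a short Python sketch of that calculation, not Igor code. The covariance values are invented for illustration, and the 0.9 threshold simply echoes the rule of thumb above.

```python
import math


def correlation_from_covariance(cov):
    """corr[i][j] = cov[i][j] / sqrt(cov[i][i] * cov[j][j])."""
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]


def flag_strong_correlations(corr, threshold=0.9):
    """List (i, j, value) for off-diagonal pairs with |correlation| >= threshold."""
    n = len(corr)
    return [(i, j, corr[i][j])
            for i in range(n) for j in range(i + 1, n)
            if abs(corr[i][j]) >= threshold]


# Made-up 3x3 covariance matrix for three fit coefficients (illustration only).
cov = [[4.0, 5.94, 0.0],
       [5.94, 9.0, 0.3],
       [0.0, 0.3, 1.0]]
corr = correlation_from_covariance(cov)
for i, j, value in flag_strong_correlations(corr):
    print("coefficients", i, "and", j, "are strongly correlated:", round(value, 3))
```

Any pair that gets flagged is a candidate for holding one of the two coefficients fixed.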

I haven't actually tried your fit, but from what I read in Jeff Weimer's responses it sounds like you have not included a critical portion of the potential data range, where sensitivity to temperature is high. For instance, if the fit function has an inflection point, and the position of the inflection point is controlled by some fit coefficient, if you don't include the inflection point in your measurements then that coefficient won't be well constrained. Then you get into the situation described above where a wide range of solutions give results with very nearly the same chi-square.

This likely is why your masked fits don't work. Possibly including a peak in the data makes the fit think there's an inflection point in the data where there really isn't one. Removing the area around that peak reveals the fundamental problem with your fit and data range.
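The effect of the data range can be estimated without running a fit. The Python sketch below assumes a simplified two-parameter model, a*Planck(x, T) with equal weights and no offset, which is not necessarily the BBRad function. It computes the correlation between a and T implied by the model's sensitivities, once for a narrow 540-570 nm window and once for a wide 500-10000 nm range. The function names and the value a = 0.1 are hypothetical.

```python
import math

H, C, K_B = 6.62606957e-34, 299792458.0, 1.3806488e-23


def bb_exitance(x_nm, temp_k):
    """Blackbody spectral exitance; x_nm in nanometers, temp_k in kelvin."""
    lam = x_nm * 1e-9
    return (2.0 * math.pi * C**2 * H / lam**5) / math.expm1(H * C / (lam * K_B * temp_k))


def scale_temperature_correlation(xs, temp_k, a, rel_step=1e-4):
    """Correlation between a and T for the model a * bb_exitance(x, T), equal weights."""
    dt = rel_step * temp_k
    g_a = [bb_exitance(x, temp_k) for x in xs]                      # d(model)/da
    g_t = [a * (bb_exitance(x, temp_k + dt) - bb_exitance(x, temp_k - dt)) / (2.0 * dt)
           for x in xs]                                             # d(model)/dT, central difference
    s_aa = sum(g * g for g in g_a)
    s_tt = sum(g * g for g in g_t)
    s_at = sum(p * q for p, q in zip(g_a, g_t))
    # Inverting the 2x2 matrix J^T J gives correlation = -S_aT / sqrt(S_aa * S_TT).
    return -s_at / math.sqrt(s_aa * s_tt)


narrow_xs = [540.0 + 0.5 * i for i in range(61)]      # 540..570 nm
wide_xs = [500.0 + 10.0 * i for i in range(951)]      # 500..10000 nm, includes the peak
narrow = scale_temperature_correlation(narrow_xs, 940.0, 0.1)
wide = scale_temperature_correlation(wide_xs, 940.0, 0.1)
print("narrow window:", narrow)
print("wide window:  ", wide)
```

Under these assumptions the narrow window gives a correlation extremely close to -1, well into the "catastrophic" range, while the wide range that includes the peak is noticeably less degenerate.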

Quote:
LATE EDIT: A bit off topic, but is there a way to "save" curve fits so that they always appear in the curve fit drop down box? It's really annoying having to recreate the fit every time I make a new file.)


Move the code from the Procedure window to a new procedure window (Windows->New->Procedure Window). Then choose File->Save Procedure As, navigate to the Igor User Files folder, and save the procedure in the Igor Procedures folder. You can find the User Files folder by selecting Help->Show Igor Pro User Files. Now when you restart Igor, that fit function will always be available. We need a way to make this easier...

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com