Some questions about Global Fit package

I am wondering whether it is necessary to declare the fitting function Threadsafe in order to take advantage of multithreading on a multiprocessor machine, like this:
Threadsafe function function_fitting () : FitFunc
...
end
It is, but Global Fit wraps your fitting function in its own fitting function, and the wrapper is not declared Threadsafe. It shouldn't be a problem to declare your function Threadsafe, but you won't get a benefit from Global Fit.

... some time passes ....

Well, I actually tried it here using the Global Fit Demo, and you can't declare your fit function Threadsafe. I'm not quite sure why.

It does work, however, if you change Global Fit 2.ipf to declare the wrapper functions threadsafe; then you can have a threadsafe fit function. To do that you have to put the Threadsafe keyword in front of the functions GFFitFuncTemplate, GFFitAllAtOnceTemplate, NewGlblFitFunc and NewGlblFitFuncAllAtOnce. And then, of course, non-threadsafe fit functions *won't* work.

In order to make this work in a way that would support both threadsafe and non-threadsafe fit functions I would have to modify the FunctionInfo built-in function to include the threadsafeness of the function. This would probably be a good idea, but I don't have time right at the moment. I will keep it in mind, though.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
What I would probably do is calculate each subdataset in a separate Igor thread (taken from a thread pool) inside the fit function wrapper, rather than making the global fit wrapper threadsafe. I'm betting that would offer the most speedup (most subdatasets are approximately the same length).
(Plus, the gencurvefit XOP can't thread fit functions, because CallFunction is not threadsafe, so I wouldn't get any speedup if the entire wrapper were made threadsafe.)
Thanks for the replies.

I have another question about Global Fit. I am wondering whether the only way to turn off the progress window is to set the global variable V_fitOptions = 4 under root:Packages:NewGlobalFit:.
I tried it, but Igor gave the warning that both GFFitFuncTemplate and GFFitAllAtOnceTemplate contain the DoAlert operation, which is not yet available in ThreadSafe functions.
Sorry, I was talking more to John when I was suggesting threading.

John, do you prefer to make NewGlblFitFuncAllAtOnce threadsafe, or the loop inside it threaded? I prefer the 2nd option.

The 2nd option would require more work, but not excessively so.
YHLien wrote:
I tried it, but Igor gave the warning that both GFFitFuncTemplate and GFFitAllAtOnceTemplate contain the DoAlert operation, which is not yet available in ThreadSafe functions.

I apologize. When I tried it myself, I knew that DoAlert is not threadsafe and commented out those lines without thinking much about it. I should have mentioned that it was necessary to do that.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
YHLien wrote:
I am wondering whether the only way to turn off the progress window is to set the global variable V_fitOptions = 4 under root:Packages:NewGlobalFit:.

V_fitOptions will not affect the Global Fit progress window. It is just a graph window made by the procedures and has nothing to do with the regular Curve Fit Progress window. In the Global Fit control panel, you will find a checkbox, "Fit Progress Graph". That checkbox corresponds to the option NewGFOptionFIT_GRAPH that can be added to the DoNewGlobalFit function input Options.

The code that services that checkbox looks like this:
    ControlInfo/W=NewGlobalFitPanel#NewGF_GlobalControlArea NewGF_FitProgressGraphCheckBox
    if (V_value)
        curveFitOptions += NewGFOptionFIT_GRAPH
    endif
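
When DoNewGlobalFit is called from your own code rather than from the control panel, the same idea applies in reverse: leave NewGFOptionFIT_GRAPH out of the Options value. A minimal sketch, assuming the option constants are bit flags that are summed (as the `+=` above suggests) and that Global Fit 2 is loaded. The helper name RemoveFitGraphOption is hypothetical and is not part of the package:

```igor
#include <Global Fit 2>

// Hypothetical helper: returns the options value with the progress-graph
// flag removed, leaving any other flags untouched.
// Assumes the NewGFOption constants are additive bit flags.
Function RemoveFitGraphOption(options)
	Variable options

	if (options & NewGFOptionFIT_GRAPH)	// is the flag currently set?
		options -= NewGFOptionFIT_GRAPH
	endif
	return options
End
```

The returned value would then be passed as the Options input of DoNewGlobalFit.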


John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
andyfaff wrote:
John, do you prefer to make NewGlblFitFuncAllAtOnce threadsafe, or the loop inside it threaded? I prefer the 2nd option.

The 2nd option would require more work, but not excessively so.

It's hard to say which would give a better boost to performance without trying it. Probably it would depend on the specific problem. Threading the guts of the fit function wrappers would give you more control over how it's done.

If you want to try it, go ahead!

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
johnweeks wrote:

If you want to try it, go ahead!


I don't think I want to go there just yet, but I don't think it would take too much effort. I'd only have a bash if it were eventually to find its way into the main version.
johnweeks wrote:
YHLien wrote:
I am wondering whether the only way to turn off the progress window is to set the global variable V_fitOptions = 4 under root:Packages:NewGlobalFit:.

V_fitOptions will not affect the Global Fit progress window. It is just a graph window made by the procedures and has nothing to do with the regular Curve Fit Progress window. In the Global Fit control panel, you will find a checkbox, "Fit Progress Graph". That checkbox corresponds to the option NewGFOptionFIT_GRAPH that can be added to the DoNewGlobalFit function input Options.


I noticed there was a data folder named "NewGlobalFit" within root:Packages: and it seemed to store some settings of Global Fit.

I set the global variable V_fitOptions=4, as one does for an ordinary curve fit, and the progress window was suppressed. I am wondering whether this might have any side effects on the fitting.
I am also wondering what I should do to speed up the global fitting.

Now two functions "Func_peaks_ch1" and "Func_peaks_ch2"
Function Func_peaks_ch1(Coefs, wave_y, wave_x) : FitFunc

    Wave Coefs, wave_y, wave_x

    wave_y = Coefs[0] + Coefs[14]*Func_profile(Coefs[1], Coefs[2], Coefs[3], Coefs[13], wave_x) + Func_profile(Coefs[4], Coefs[5], Coefs[6], Coefs[13], wave_x)
    wave_y += Coefs[14]*Func_profile(Coefs[7], Coefs[8], Coefs[9], Coefs[13], wave_x) + Func_profile(Coefs[10], Coefs[11], Coefs[12], Coefs[13], wave_x)

End

Function Func_peaks_ch2(Coefs, wave_y, wave_x) : FitFunc

    Wave Coefs, wave_y, wave_x

    wave_y = Coefs[0] + Func_profile(Coefs[1], Coefs[2], Coefs[3], Coefs[13], wave_x) + Coefs[14]*Func_profile(Coefs[4], Coefs[5], Coefs[6], Coefs[13], wave_x)
    wave_y += Func_profile(Coefs[7], Coefs[8], Coefs[9], Coefs[13], wave_x) + Coefs[14]*Func_profile(Coefs[10], Coefs[11], Coefs[12], Coefs[13], wave_x)

End

Function Func_profile(Height, Width, Center, Tau, t)

    Variable Height, Width, Center, Tau, t
   
    return -1.25331413732 * Height * Width / Tau * exp(0.5*(-2*Tau*t + 2*Center*Tau + Width^2) / Tau^2) * (-1 + erf(0.70710678119 * (-t*Tau + Tau*Center + Width^2) / (Width * Tau)))

End

are used to fit two sets of data individually, i.e., function 1 for data 1 and function 2 for data 2. Coefs[1..12] are shared in my case, and there are 18 fit parameters in total. Data 1 and data 2 contain 1000 points each. The global fit currently takes about 3 sec to finish.

However, I might have more than 30,000 sets of data per day to fit, and the fitting speed is really too slow for me.

Any comment or suggestion for optimization/speedup is deeply appreciated.
Have you already played around with the accuracy parameter of erf?
At least the documentation says that the speed depends significantly on the desired accuracy.

And of course precomputing (calculating width^2 only once) should also help.

PS: The fastest solution would be writing an XOP but this is also the most difficult/time-consuming one.
andyfaff wrote:
I don't think I want to go there just yet, but I don't think it would take too much effort. I'd only have a bash if it were eventually to find its way into the main version.

I know your work. If you write it, it's hard to see why I wouldn't release it. I would be happy to give credit in the procedure file and help files, too!

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
YHLien wrote:
I set the global variable V_fitOptions=4, as one does for an ordinary curve fit, and the progress window was suppressed. I am wondering whether this might have any side effects on the fitting.

That V_fitOptions suppresses the regular curve fit progress window during a global fit. Since Global Fit uses FuncFit to do the actual fitting, it has to suppress that window. To suppress the progress graph that Global Fit makes, you need to set the Options input appropriately, as I mentioned previously.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
YHLien wrote:
are used to fit two sets of data individually, i.e., function 1 for data 1 and function 2 for data 2. Coefs[1..12] are shared in my case, and there are 18 fit parameters in total. Data 1 and data 2 contain 1000 points each. The global fit currently takes about 3 sec to finish.

As Thomas Braun says, pre-compute anything you can, like Width^2, Center*Tau. If you can eliminate the Func_profile function it might help. Since your wave assignments are simple, you could use Multithread for them. It would require that Func_profile be Threadsafe, but that shouldn't be a problem. You can use Multithread inside a non-threadsafe function.
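
A sketch of how these suggestions might be applied to the functions posted above. It assumes the same coefficient layout as the original code; only Func_profile and the channel-1 function are shown, and the local variable names are illustrative. The algebra is unchanged, with the repeated sub-expressions computed once per call:

```igor
// Same formula as the original Func_profile, with Width^2 and the shared
// term Tau*(Center - t) computed once. Declared Threadsafe so that it can
// be called from a Multithread assignment.
Threadsafe Function Func_profile(Height, Width, Center, Tau, t)
	Variable Height, Width, Center, Tau, t

	Variable widthSq = Width^2
	Variable tauDelta = Tau*(Center - t)	// equals -t*Tau + Tau*Center
	Variable scaleFactor = -1.25331413732 * Height * Width / Tau

	return scaleFactor * exp(0.5*(2*tauDelta + widthSq) / Tau^2) * (-1 + erf(0.70710678119 * (tauDelta + widthSq) / (Width * Tau)))
End

// The fit function itself does not need to be Threadsafe to use Multithread.
Function Func_peaks_ch1(Coefs, wave_y, wave_x) : FitFunc
	Wave Coefs, wave_y, wave_x

	Variable tauVal = Coefs[13]
	Variable ampRatio = Coefs[14]

	// All four peaks in one assignment, so the wave is traversed only once.
	Multithread wave_y = Coefs[0] + ampRatio*Func_profile(Coefs[1], Coefs[2], Coefs[3], tauVal, wave_x[p]) + Func_profile(Coefs[4], Coefs[5], Coefs[6], tauVal, wave_x[p]) + ampRatio*Func_profile(Coefs[7], Coefs[8], Coefs[9], tauVal, wave_x[p]) + Func_profile(Coefs[10], Coefs[11], Coefs[12], tauVal, wave_x[p])
End
```

Func_peaks_ch2 would change in the same way, with ampRatio moved to the second and fourth peaks. Whether this is faster for 1000-point data sets has to be measured.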
Quote:
However, I might have more than 30,000 sets of data per day to fit, and the fitting speed is really too slow for me.

Really 30,000 data sets per day? At 3 seconds per data set, that's 25 hours. That is a problem, given the limits on the number of hours in a day :) Thomas Braun is also correct in saying that an XOP is likely the best way to get significant speed-up.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Just to see if there was any benefit to making globalfit threadsafe, I implemented the above suggestion in my copy of globalfit 2. It appears that only functions that internally use multithreading will benefit from such a modification. Is there any way to speed up the globalfit in general by parallel processing? I would really appreciate such a possibility as I am now using computers with 4-6 cores in my work and only see one processor core light up during fits. It seems like a lot of the labor that globalfit does (correct me if I'm wrong) could be done in parallel, for a much higher throughput.

I could be a bit off-base here, one reason being that I am using the genetic algorithm version of globalfit 2, which is a part of the motofit package, but it's not terribly different.

mtaylor wrote:
Just to see if there was any benefit to making globalfit threadsafe, I implemented the above suggestion in my copy of globalfit 2. It appears that only functions that internally use multithreading will benefit from such a modification. Is there any way to speed up the globalfit in general by parallel processing? I would really appreciate such a possibility as I am now using computers with 4-6 cores in my work and only see one processor core light up during fits. It seems like a lot of the labor that globalfit does (correct me if I'm wrong) could be done in parallel, for a much higher throughput.

The obvious thing would be for the wrapper fit functions to parcel out each data set to a thread. That would help only if the data sets are fairly large (on the order of thousands of points, depending on the computational load of the fit function). Using Multithread is a good way to get a speed-up if you can write your fit function so that the bulk of the computation happens in a wave assignment.
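
A minimal sketch of parcelling data sets out to threads, for illustration only. It is not the Global Fit wrapper code: EvalDataSet is a hypothetical stand-in for a real Threadsafe fit function (here a straight line), and the number of data sets is fixed at two to keep the example short:

```igor
// Hypothetical stand-in for a Threadsafe fit function: a straight line.
ThreadSafe Function EvalDataSet(coefs, yw, xw)
	Wave coefs, yw, xw

	yw = coefs[0] + coefs[1]*xw[p]
	return 0
End

// Evaluates two data sets at the same time, one thread each.
Function EvalTwoDataSets(coefs1, y1, x1, coefs2, y2, x2)
	Wave coefs1, y1, x1, coefs2, y2, x2

	Variable tgID = ThreadGroupCreate(2)
	ThreadStart tgID, 0, EvalDataSet(coefs1, y1, x1)
	ThreadStart tgID, 1, EvalDataSet(coefs2, y2, x2)

	// ThreadGroupWait returns 0 once every thread in the group has finished.
	Variable stillRunning
	do
		stillRunning = ThreadGroupWait(tgID, 100)
	while (stillRunning != 0)

	Variable releaseStatus = ThreadGroupRelease(tgID)
	return releaseStatus
End
```

Note that this version creates and releases the thread group on every call, so the set-up cost is paid at each fit iteration. For small data sets that cost can outweigh the gain.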
Quote:
I could be a bit off-base here, one reason being that I am using the genetic algorithm version of globalfit 2, which is a part of the motofit package, but it's not terribly different.

That is quite a bit different: MotoFit and the genetic fit algorithm replace the entire fitting engine with the genetic fit. If MotoFit's genetic fit is threadsafe, though, it would respond pretty similarly to the same sort of techniques that would help with the regular Global Fit.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
The version that I wrote only has bolt-ons to the WM version, viz. I added a button to allow the user to simulate what happens with the current coefficients before the fit. Also, when one presses the fit button, a popup asks for a choice between the normal FuncFit (the WM code is unchanged) and gencurvefit (bolted-on code). Thus, if the global fit wrapper were to multithread under FuncFit, you would see a parallel speedup. However, if one selects the gencurvefit option you wouldn't see the speedup, because the XOP can't call the global fit wrapper (using CallFunction) from an XOP-created thread, as CallFunction is not threadsafe.
However, if one were to thread inside the wrapper you would see a speedup.
BTW, there is a library called libgencurvefit (I wrote it for Gencurvefit) that WM could use if they wanted to include genetic optimisation inside Igor.
Thanks for the replies Andy and John. Because of the structure of my functions, I suppose it's a single-thread life for me for the near term.

Cheers!
Thanks for the suggestions from John, Andy and Thomas.

I tried to combine the techniques you suggested (pre-computing, Multithread and so on), and I improved the execution time from about 2.8 sec to 1.1 sec. I also noticed that the average CPU load increased from 55% to 90% on my Core 2 Duo MacBook. However, when I moved the code to an 8-core i7 PC, I found that only 3 cores were half-loaded and the other cores stayed at very low CPU usage. Any suggestions for squeezing out the rest of the CPU power?
YHLien wrote:

I tried to combine the techniques you suggested (pre-computing, Multithread and so on), and I improved the execution time from about 2.8 sec to 1.1 sec. I also noticed that the average CPU load increased from 55% to 90% on my Core 2 Duo MacBook. However, when I moved the code to an 8-core i7 PC, I found that only 3 cores were half-loaded and the other cores stayed at very low CPU usage. Any suggestions for squeezing out the rest of the CPU power?


Don't be too hasty to make more threads. Remember that there is overhead in creating each thread - it takes time to set each thread up. This means when the number of points is low, then the overhead of creating the thread dominates. When you have a large number of points the calculation is the dominant proportion, the overhead for thread creation is minimal. Please see http://www.igorexchange.com/node/1518 for a demonstration. In your case your fitfunction is simple and won't take long to compute. Thus, I would think that only 2 or 3 (max) threads would provide optimal speed.
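
One way to check where the break-even point lies on a given machine is to time the same wave assignment with and without Multithread. A minimal sketch; the function name, the wave name and the erf-based test expression are all illustrative and are not taken from the linked demonstration:

```igor
// Times one wave assignment serially and with Multithread, and prints both.
Function CompareAssignmentTiming(numPoints)
	Variable numPoints

	Variable timerRef, serialMicroSec, threadedMicroSec

	Make/O/D/N=(numPoints) timingTestWave

	timerRef = StartMSTimer
	timingTestWave = erf(p/numPoints)
	serialMicroSec = StopMSTimer(timerRef)

	timerRef = StartMSTimer
	Multithread timingTestWave = erf(p/numPoints)
	threadedMicroSec = StopMSTimer(timerRef)

	Printf "%d points: serial %g us, Multithread %g us\r", numPoints, serialMicroSec, threadedMicroSec
	KillWaves timingTestWave
End
```

Running it from the command line for a small and a large wave, for example CompareAssignmentTiming(1000) and CompareAssignmentTiming(1e6), shows how the balance changes with the number of points.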
My idea was that one creates a thread pool before the fit starts. This reduces the threading overhead. One would create N threads in the pool, where N is the number of datasets. On entering the fit function wrapper there would be a reference to the thread pool. The wrapper would then divvy up the calculation and post each dataset to an individual thread in the pool (using a free data folder). The thread would do the calculation. Once all the threads in the pool have finished, you clean up and return from the wrapper. Once the fit has finished, the thread pool is terminated.

1.1 secs doesn't sound too onerous (esp if you're using Gencurvefit), but if you need to churn thro' a lot of data I understand your concern. Just feel sorry for those of us whose fits take ~30mins with globalfit (for which threading would be a boon).
YHLien wrote:
I tried to combine the techniques you suggested (pre-computing, Multithread and so on), and I improved the execution time from about 2.8 sec to 1.1 sec. I also noticed that the average CPU load increased from 55% to 90% on my Core 2 Duo MacBook. However, when I moved the code to an 8-core i7 PC, I found that only 3 cores were half-loaded and the other cores stayed at very low CPU usage. Any suggestions for squeezing out the rest of the CPU power?

At this point I can't add anything to what we have said without actually having your code and sample data sets. Furthermore, Andy's latest comments in response to this post are very good ones. His recommendations for threading the global fit wrapper functions are excellent, but involve pretty advanced Igor programming.

If you would like, package up your code and data sets and send it to support@wavemetrics.com. I will try to make the time to look it over. I would test it on my 8-core Mac Pro. With hyperthreading, Igor thinks I have 16 cores :)

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
Since Andy has updated gencurvefit to be threadsafe, I decided to try a little optimization on my own all-at-once functions and report back... Note that I am no longer using the global_fit dialog, but calling gencurvefit directly.

Speedup was found even for VERY simple functions, such as the one below:

Threadsafe Function ZrandlesW(w,complexcat,logfreq): fitfunc
    wave complexcat, w, logfreq //declare wave names.  Don't edit this
    variable/g midpointt
    variable/g/c ki //pull in complex constant
    variable Rs, Rp, Cdl,sigmaW //declare variable names; you can edit this
    wave/c complexer
   
    Rs=w[0] //assign your variable names to the solution wave.  Edit this.
    Cdl=w[1]
    Rp=w[2]
    sigmaW=w[3]

    Multithread complexer[0,midpointt-1] = Rs+1/(1/(Rp+sigmaW*((logfreq)^(-1/2)-ki*(logfreq)^(-1/2)))+1/(ki/-(Cdl*logfreq)))
    complexcat[0,midpointt-1]=real(complexer[p])
    complexcat[midpointt, ]=imag(complexer[p-midpointt])
End


I am seeing a 25% speedup in terms of the whole curve fit process, but a 200-300% increase in CPU resources (via Windows Task Manager). I am using a 6-core hyperthreaded i7 processor. My data set is 142 points. As a bonus, I am getting a little error message at the end of the fit, indicating that the wave is out of range... I have concluded that for my functions there is more utility in calling multiple fits in parallel than in parallelizing the function (although this may change with a more complicated function and certainly will change with much larger data sets).
Any speed up of Gencurvefit is not a result of it being threadsafe. The only benefit to having it threadsafe is that you can call it from different Igor threads (lots of fits at the same time). There aren't several function evaluations at the same time.
andyfaff wrote:
Any speed up of Gencurvefit is not a result of it being threadsafe. The only benefit to having it threadsafe is that you can call it from different Igor threads (lots of fits at the same time). There aren't several function evaluations at the same time.

Perhaps he was referring to the use of Multithread in his fit function. But that doesn't require making the whole function threadsafe.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com