Upgrades to various XOPs - easyHttp, Abeles, XMLutils, Gencurvefit and SOCKIT

The following changes have been made to several XOPs:

easyHttp: (URL manipulation: http, https, sftp, scp, ftp, ldap, dict, POST/GET)
upgraded to use toolkit 6
now threadsafe (multiple downloads at once)
64 bit windows version

Abeles:
upgraded to use toolkit 6
now threadsafe
64 bit windows version

XMLutils (XML file handling)
upgraded to use toolkit 6
(soon to have 64 bit windows version)

Gencurvefit (genetic optimisation)
upgraded to use toolkit 6
now threadsafe (separate fits from different Igor threads)
64 bit windows version

SOCKIT (TCP client)
upgraded to use toolkit 6
SOCKITstringtowave now threadsafe, can specify /DEST and /FREE flags.
SOCKITwavetostring now threadsafe
several of the operations/functions in this XOP are already threadsafe.
64 bit windows version
Brilliant as ever.


One little note: I am getting a startup caution in both the 32-bit and 64-bit version of igor that says the following:

Igor Pro wants you to know...

You passed kOperationIsThreadSafe flag to RegisterOperation but did not set the thread safe bit in the command category: GenCurveFit.

It doesn't appear to affect the operation of any of my code that relies on gencurvefit, although I would like to do multi-threaded work in the near future.
Fixed.

Bear in mind that only the operation is threadsafe, so you can call it from IGOR threads. Each individual operation will not finish any faster.
Hi Andy, thanks for the great work.
I have two questions:
1. gencurvefit seems to force an autoupdate every time it is executed. Normally I use "pauseupdate" and "silent 1" to speed up fit operations on thousands of data. This does not work with gencurvefit. Every time it is executed it updates all graphs
and also causes repeated updates in the Data Browser window. I already use the /N and /Q options but without success. OS: Windows 7 64-bit - I only tested your 32-bit XOP though.
2. Is there a way of implementing Maximum Likelihood estimates with a user-defined cost function? I am no expert but have already tried to implement LS as mentioned in your help file, and it does not work. Since one cannot use the option /MINF=my_costfunc to pass variables and waves with the function call, I don't know how to implement it correctly. Here is my code so far for LS (which does not work, although the implementation without /MINF and with /METH=0 (LS) does work):

Gencurvefit/N=1/Q/X=bval /METH=0 /MINF=my_costfunction /STGY=0 Double_exp,Sig,coefs_gen,"00000",limits_gen

Function My_costfunction(coefs, y_obs, y_calc, s_obs)
Wave coefs, y_obs, y_calc, s_obs

wave coefs_gen=coefs_gen
wave Sig=Sig
wave bval

y_calc=double_exp(coefs_gen,bval)

coefs=coefs_gen

make/D/N=(numpnts(y_obs)) /free diff
diff=((y_obs-y_calc)/s_obs)^2
return sum(diff)
END

Cheers,
Boogie
Dear Boogie, thanks for the feedback.

bmaedler wrote:

I have two questions:
1. gencurvefit seems to force an autoupdate every time it is executed. Normally I use "pauseupdate" and "silent 1" to speed up fit operations on thousands of data. This does not work with gencurvefit. Every time it is executed it updates all graphs
and also causes repeated updates in the Data Browser window. I already use the /N and /Q options but without success. OS: Windows 7 64-bit - I only tested your 32-bit XOP though.


All the /Q flag does is prevent stuff being printed in the history.
I've just had a look at the /N flag. From inspection of the code the behaviour should be that no windows should update whilst in the minimisation loop (the majority of the time). However, during setup and cleanup I do ask for windows to be updated to provide a final view of the fit, and coefficients, etc. PauseUpdate and silent 1 do not have any effect in user functions, only macros.

bmaedler wrote:

2. Is there a way of implementing Maximum Likelihood estimates with a userdefined cost function. I am no expert but tried already to implement LS as mentioned in your help file but it does not work. Since one cannot use the option /MINF=my_costfunc to pass variables and waves with the function call, I don't know how to implement it correctly. Here is my code so far for LS (which does not work - the implementation without /MINF and METH=0 (LS).


You should not calculate the fit function in the cost function, only the cost (the "energy"). coefs holds the coefficients, y_obs is the dependent variable, y_calc is the value returned by the fit function and s_obs is the standard deviation. Here is the correct cost function for Chi2:

Function My_costfunction(coefs, y_obs, y_calc, s_obs)
Wave coefs, y_obs, y_calc, s_obs

duplicate/free y_obs, diff
multithread diff = ((diff - y_calc) / s_obs)^2
return sum(diff)
end
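For the Maximum Likelihood part of the question, the same four-wave signature can carry a different statistic. The following is only a sketch under assumptions that are not from this thread: it assumes the data are Poisson-distributed counts, that y_calc stays positive, and it uses the hypothetical name My_ML_costfunction. It returns the Poisson negative log-likelihood with the constant terms dropped, and would be selected with /MINF=My_ML_costfunction in the same way as the Chi2 version.

```igor
Function My_ML_costfunction(coefs, y_obs, y_calc, s_obs)
	Wave coefs, y_obs, y_calc, s_obs

	// ASSUMPTION: y_obs are Poisson counts and y_calc > 0 everywhere.
	// s_obs is not used by this statistic.
	duplicate/free y_obs, nll
	// Poisson negative log-likelihood per point, constant ln(y_obs!) term dropped
	nll = y_calc - y_obs * ln(y_calc)
	return sum(nll)
End
```

As with Chi2, the XOP supplies all four waves; the function only has to return a single number that gets smaller as the fit improves.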
Hi Andy,
thanks for the reply. I have attached a minimal version of my code running gencurvefit and you will see that it updates the graphs every single execution. Just start the macro MC(100,1000)

As for the costfunction problem. I understand that its purpose is calculating the "energy" - my problem is how do I use a userdefined cost function while calling from gencurvefit? Presumably one cannot pass waves with "my_costfunction".
In the example below, how would I pass the observables let's say my created wave "Sig" to "y_obs" and "coefs_gen" to "coefs" when calling gencurvefit? Did I make myself clear?

Macro Call_fit()
...
...
Gencurvefit/N=1/Q/X=bval /METH=0 /MINF=My_costfunction /STGY=0 Double_exp,Sig,coefs_gen,"00000",limits_gen
END

Function My_costfunction(coefs, y_obs, y_calc, s_obs)
Wave coefs, y_obs, y_calc, s_obs
duplicate/free y_obs, diff
multithread diff = ((diff - y_calc) / s_obs)^2
return sum(diff)
end


Cheers, Boogie

gencurvefit_simu.pxp
bmaedler wrote:
Hi Andy,
thanks for the reply. I have attached a minimal version of my code running gencurvefit and you will see that it updates the graphs every single execution. Just start the macro MC(100,1000)

As for the costfunction problem. I understand that its purpose is calculating the "energy" - my problem is how do I use a userdefined cost function while calling from gencurvefit? Presumably one cannot pass waves with "my_costfunction".
In the example below, how would I pass the observables let's say my created wave "Sig" to "y_obs" and "coefs_gen" to "coefs" when calling gencurvefit? Did I make myself clear?


Dear Boogie,
I've examined the experiment and it's not the gencurvefit XOP that is updating the graphs in it. The graph updates occur when you change the values after the curve fitting operation.
Also for future work you should try to write all your code in functions, rather than macros. They are compiled and go a lot faster.

I think I understand where your misconception lies with respect to the cost function. You don't need to pass the observables, nor the coefficients. The Gencurvefit XOP is responsible for supplying the coefs, y_obs, y_calc and s_obs to this function. You do not need to source them. As the coefficients evolve the XOP has to supply changing values. These changing values are supplied by the XOP to the costfunction. A similar explanation applies to the other waves.

Thanks Andy for the reply, but I have to disagree. It is not the assignment of waves after the gencurvefit call that updates the windows; preventing that is the purpose of using "PauseUpdate" and "Silent 1" in macros. I know what John Weeks says about using macros, but in my experience I have not seen any speed difference from using macros instead of functions (not if one omits all updates). But aside from this, I have attached a small modification to the example experiment that now also contains the IGOR built-in CurveFit for a double exponential. You will find that if you comment your gencurvefit out, it will only execute CurveFit and NOT update any graphs or the Data Browser window. So it must be the way you programmed this XOP.
I mean, in the end it is no big deal, but I would be interested in how much faster your XOP would run if all updates were prevented (in particular the flashing of the Data Browser window concerns me; I don't know what it could possibly update there every single time).

Thanks for your patience.

Cheers,
Burkhard.
Function test()

    Variable i, dummy=0, etime
    Variable tref = StartMSTimer
    for (i = 0; i < 1e5; i += 1)
        dummy += 1
    endfor
    etime = StopMSTimer(tref)
    print "Function: ", etime
end

Proc testp()
    Silent 1
    PauseUpdate
   
    Variable i=0, dummy=0, etime
    Variable tref = StartMSTimer
    do
        dummy += 1
        i += 1
    while(i < 1e5)
    etime = StopMSTimer(tref)
    print "Proc: ", etime
end

•test()
Function: 10505.4
•testp()
Proc: 6.96227e+06

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
bmaedler wrote:

I mean in the end it is no big deal but I would be interested in how much faster your XOP would run with preventing all updates (particular the flashing of the data browser window concerns me. I don't know what it could possibly update there every single time.


The data browser updates itself when it needs to, and that is typically because a wave has changed. The GenCurvefit XOP has nothing to do with this (other than possibly actually changing the wave).

In some cases closing the data browser before doing analysis that can be slow will result in a performance improvement. You can close the data browser using ModifyBrowser close. If you do this from a function, you'll need to use Execute "ModifyBrowser close"

You can execute DisplayHelpTopic "Comparing Macros And Functions" for a comparison of functions and macros. There are reasons that we (WaveMetrics) recommend that you use functions.
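The advice above about closing the Data Browser from a function can be sketched as follows. The function name is hypothetical, and the /Q and /Z flags on Execute are an addition not mentioned in the thread: they are meant to keep the command out of the history and to avoid aborting if the command reports an error (for example, if the browser is not open, which is assumed here rather than tested).

```igor
Function RunSlowAnalysis()
	// ModifyBrowser cannot be called directly from a function, so go through Execute.
	// /Q: do not echo the command; /Z: do not abort on error (assumed case: browser already closed)
	Execute/Q/Z "ModifyBrowser close"

	// The long-running analysis loop would go here.
End
```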
Thanks John and Adam for your comments.

John, I can see that loops that do not really calculate anything run much faster when they are programmed as a function
compared to a macro. For my purpose, which involves time-consuming calculations at every loop step, it did not matter, but I will respect your recommendation. The annoying part of functions is sometimes the access to external waves, which need to be re-declared inside each function.

Now, can any of you explain why, when you execute the example I attached, CurveFit does not cause any update of graphs and the Data Browser window while gencurvefit does? The calculations of the waves and variables involved are exactly the same in both scenarios.
I don't want to close all graphs every time I am using gencurve since I want to investigate the influence of noise, its distribution and the choice of start values on the performance of both fitting routines for real-time demonstration purposes.
Well, of course if you have a very time-consuming analysis that spends all its time in built-in code (or XOP code) then the overhead of whatever code it appears in, whether macro or function, will not be a large percentage. But that is a special case.

As for the inconvenience of accessing global objects (waves, global variables) from functions, it is the price for compile-time checking. It also provides a bit of the type-safety of C or C++. For errors that can be caught by the compiler, you get *all* your code checked by the compiler. In a macro only the lines that actually execute are checked for correct syntax. I've seen macros that have had bad syntax for years because the bad commands were in seldom-used code.
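As a small illustration of that price, a function has to declare a local reference to each global object before using it. The sketch below uses hypothetical globals (root:Sig and root:gScale are made-up names) and checks that the runtime lookups succeeded before using them.

```igor
Function ScaledSum()
	// Runtime lookups of global objects; both names are hypothetical
	Wave/Z Sig = root:Sig
	NVAR/Z gScale = root:gScale

	if (!WaveExists(Sig) || !NVAR_Exists(gScale))
		print "Sig or gScale not found"
		return NaN
	endif

	// From here on the compiler knows Sig is a wave and gScale a numeric variable
	return gScale * sum(Sig)
End
```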

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
John,
I've looked at the example experiment that Boogie posted. I understand what he means now. As the loop runs he is extracting fitted parameters and putting them in waves that are displayed in a graph. If he is not using Gencurvefit the loop runs just the Curvefit operation, but the displayed waves don't update until the end. When Gencurvefit runs the displayed waves update every loop (the Data Browser doesn't seem to flicker for me, I6.23+Snow Lion).
He doesn't have the /N flag for Curvefit.

Is this observed behaviour due to the fact that I call WaveHandleModified on modified waves as I exit the XOP? Or can Curvefit pick up on the fact that the macro isn't supposed to be updating waves? There are 3 graphs in his experiment, only one of the graphs contains a wave that I modify, but all 3 appear to update in the loop.

I thought it was good citizenship to mark the waves that you've modified as updated.

A.
Hi Andy & John,
thanks Andy for clarifying my problem.

I have attached a new version in which I converted the macros to functions, and yes, there is indeed a performance gain (x2) for CurveFit - thanks John.

I have now also included a timing measurement for GenCurveFit, to compare the speed with all graphs and the Data Browser displayed against the speed without them.
When I close all windows GenCurveFit runs almost 1.5x as fast, so if I could prevent it from updating the displayed waves at every single step of the loop, performance would improve. The only question: how can I do this without closing all my graph windows?

Performance comparison for 1000 noise realizations:
with graphs and data browser update: 75 s
with all windows closed: 50.6 s
keep only the data browser open: 62.4 s

This is still much slower than executing CurveFit only: 0.46 s,
but I guess this is due to the more complex implementation of the genetic algorithm?

Thanks again guys for helping out here and sorry to be such a pain ;-)
Boogie
Genetic optimisation is inherently much slower than Levenberg-Marquardt. This is because it requires hundreds, if not thousands or tens of thousands, of function evaluations (depending on complexity). The Levenberg-Marquardt method only requires tens of function evaluations because it uses gradient approaches. Even then, it can sometimes require fewer if analytic derivatives of the fit function are known (which I am guessing they are for a double exponential). This easily accounts for the time difference.

In your situation each fit is going quite quickly. I think you were doing 1000 fits, which equates to 0.075 s per fit. In comparison, I regularly do fits which take over 10 minutes each. However, Levenberg-Marquardt is next to useless in those cases, so the time penalty is worth it.

If you really do need big speedups for this, and you have time to burn, then you should realise that Gencurvefit is threadsafe. Thus, if you have a quad core machine you could do 4 different fits simultaneously. This will require you to learn how to use threads. In such a situation you could cut the time down to 19s for a quad core machine (75/4). But this requires a large amount of dedication. (I have code that does this).
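The threaded approach described above could look roughly like the sketch below. It is not the code referred to in the post; FitWorker and RunFitsInThreads are hypothetical names, the data and coefficient waves are assumed to be collected in wave-reference waves, and the fit function Double_exp is assumed to have been declared ThreadSafe. The Gencurvefit flags are the ones already used in this thread. It relies on the standard Igor thread group calls (ThreadGroupCreate, ThreadStart, ThreadGroupWait, ThreadGroupRelease).

```igor
// Hypothetical worker: one independent fit per call.
ThreadSafe Function FitWorker(yw, xw, coefs, limits)
	Wave yw, xw, coefs, limits
	// ASSUMPTION: Double_exp is itself declared ThreadSafe
	Gencurvefit/N=1/Q/X=xw Double_exp, yw, coefs, "00000", limits
	return 0
End

Function RunFitsInThreads(yList, xw, coefList, limits)
	Wave/WAVE yList, coefList	// one data wave and one coefficient wave per fit
	Wave xw, limits

	Variable nFits = numpnts(yList)
	Variable nThreads = ThreadProcessorCount
	Variable tg = ThreadGroupCreate(nThreads)
	Variable i, j, running, dummy

	for (i = 0; i < nFits; i += nThreads)
		// start one batch of fits, one per thread
		for (j = 0; j < nThreads && (i + j) < nFits; j += 1)
			Wave yw = yList[i + j]
			Wave cw = coefList[i + j]
			ThreadStart tg, j, FitWorker(yw, xw, cw, limits)
		endfor
		// wait until every thread in the batch has finished
		do
			running = ThreadGroupWait(tg, 100)
		while (running != 0)
	endfor

	dummy = ThreadGroupRelease(tg)
End
```

Each fit still takes as long as before; the gain comes only from running several independent fits at the same time.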


I have just realised that you are implementing your fitting function as a point-by-point fitfunc, rather than as an all-at-once fitfunc. In the former case Gencurvefit has to call IGOR to calculate the fitfunc point by point. This adds a lot of overhead. If you implement it as an all-at-once function you will get appreciable speedups. I don't think you will see a massive speedup for this experiment, because you only have 10 points. If you had 100 points it would be a different outcome.
The following is code for an all-at-once function:
Function liner(w, yy, xx):fitfunc
Wave w, yy,xx
yy = w[0] + w[1] * xx
End
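Applied to a double exponential, the same all-at-once pattern might look like the sketch below. The parameter layout is an assumption (offset, then amplitude and decay rate for each term), chosen only because the coefficient wave in this thread has five entries; the actual Double_exp in the posted experiment may be defined differently, and the name Double_exp_AAO is hypothetical.

```igor
Function Double_exp_AAO(w, yy, xx) : FitFunc
	Wave w, yy, xx
	// ASSUMED layout: w[0] offset, w[1] and w[2] amplitude and rate of term 1,
	// w[3] and w[4] amplitude and rate of term 2
	yy = w[0] + w[1] * exp(-w[2] * xx) + w[3] * exp(-w[4] * xx)
End
```

The whole yy wave is filled in a single wave assignment, so the fit function is called once per evaluation instead of once per data point.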
I actually have a (maybe stupid) question about gencurvefit. Without having looked too much into the functionality, I wonder if one could join forces with multipeakfit. I have two peaks to fit which are kind of fiddly and dependent on starting values, so a solution which converges more reliably sounds really nice. I am thinking along the lines of dirty hacking the fitting function of multipeakfit to replace funcfit with gencurvefit, but maybe it wouldn't be so easy. So... would it be possible?
Quote:
Is this observed behaviour due to the fact that I call WaveHandleModified on modified waves as I exit the XOP?


No. Marking the wave as modified does not cause an update.

Calling XOPCommand or any of its cousins (XOPCommand2, XOPCommand3, XOPSilentCommand) from an XOP does cause an update and there is no way to prevent it.
hrodstein wrote:


Calling XOPCommand or any of its cousins (XOPCommand2, XOPCommand3, XOPSilentCommand) from an XOP does cause an update and there is no way to prevent it.


That would do it then. I tried to make gencurvefit as similar to curvefit as possible. Curvefit appends the fit to the top graph if the data is displayed there. Gencurvefit does the same. In order to do this I need to ask IGOR if the data is displayed in the top graph, and if it is, append it. The only way to do this is by using XOPSilentCommand. There are potentially 4 of those calls during a single call to the XOP. Each of those will cause an update of the windows.

Since each of those XOPSilentCommand calls can only be made from the main thread you won't get those continuous updates if you do the fitting in a threaded manner. This will achieve what Boogie wants, and would be NPROCESSORS times faster for the entire function (assuming that each fit is independent).
chozo wrote:
I actually have a (maybe stupid) question about gencurvefit. Without having looked too much into the functionality, I wonder if one could join forces with multipeakfit. I have two peaks to fit which are kind of fiddly and dependent on starting values, so a solution which converges more reliably sounds really nice. I am thinking along the lines of dirty hacking the fitting function of multipeakfit to replace funcfit with gencurvefit, but maybe it wouldn't be so easy. So... would it be possible?


I tried to make Gencurvefit as compatible with funcfit as possible (although it doesn't do a variety of things, such as the addition of multiple fit functions). This enabled me to amend the IGOR "Global Fit 2" procedure to offer gencurvefit as an option for the fitting process, instead of just Funcfit. This is available from the gencurvefit package. It works well. However, I consider this quite fragile from a development point of view. It was quite a lot of work (aka dirty hacking) and it is susceptible to changes that WM make to the original procedure. Every time a change is made I have to try to make similar changes to the amended version.

Such a process could undoubtedly be applied to multipeakfit. I cannot see a reason why it wouldn't work. However, I don't have the time to do it. To make such a change really robust WM would have to incorporate it themselves. But in order for them to do that there would be several issues to resolve concerning how the gencurvefit XOP is distributed, its output, etc.

In an ideal fantasy world multipeakfit and global fit 2 could present a programming interface to allow people to use their own fitting methodology. Such an interface would present a standardised way of how a curvefitting function should appear to the outside. Then a whole range of possibilities would present themselves. This approach would probably require an immense amount of work, but one can dream.
Quote:
The only way to do this is by using XOPSilentCommand. There are potentially 4 of those calls during a single call to the XOP. Each of those will cause an update of the windows.

You may be able to prevent the update using the PauseUpdate callback. I think it should work but I'm not sure. You would then call ResumeUpdate to restore the original update state.
Ok, so there is an OSX version of Gencurvefit available now that doesn't have that updating behaviour. It was a fix, and what was happening with the updating windows was not the correct behaviour expected when the /N flag is specified. You will find that the whole thing completes in approx 2/3 the time now.

I don't know when a windows version will be available.

A.
Thanks for the answer. As the fit function in multipeakfit seems manageable I will try it when I have more time again.
I had a look at the multipeak fitting code.

The biggest issues are that:
1) Funcfit can fit "Sums of Fit Functions" (look at that entry in the Funcfit help). This is where the overall fit curve is created by summing the output of a list of fit functions, each of which has an associated coefficient wave. Gencurvefit can't do that. You would have to write a wrapper function that could unravel that and present a single fit function to Gencurvefit.
Writing this wrapper function is complicated because you need to recreate the built-in curve fitting functions and may have to deal with a mixture of all-at-once and normal fit functions.

2) Gencurvefit needs to be supplied with a wave containing the lower and upper limits for the parameters. You would need to find some way of obtaining that. The geneticoptimsation.ipf file has a function for doing that; it may be suitable.
Hi Andy - me again.
After the "update" /N=1 problem is fixed I am coming back to my earlier question about how to implement user defined cost functions.

andyfaff wrote:
bmaedler wrote:
Hi Andy,
As for the costfunction problem. I understand that its purpose is calculating the "energy" - my problem is how do I use a userdefined cost function while calling from gencurvefit? Presumably one cannot pass waves with "my_costfunction".
In the example below, how would I pass the observables let's say my created wave "Sig" to "y_obs" and "coefs_gen" to "coefs" when calling gencurvefit? Did I make myself clear?


Dear Boogie, ...

I think I understand where your misconception lies with respect to the cost function. You don't need to pass the observables, nor the coefficients. The Gencurvefit XOP is responsible for supplying the coefs, y_obs, y_calc and s_obs to this function. You do not need to source them. As the coefficients evolve the XOP has to supply changing values. These changing values are supplied by the XOP to the costfunction. A similar explanation applies to the other waves.


I have tried to start with LS just the way you mentioned it but I won't get any sensible results. So would I call the user defined cost function with GenCurveFit like this?

Function Fit_LS_MyCostfunction()
Wave coefs_gen, Sig, bval, limits_gen	// look up the waves in the current data folder
Variable V_FitError=0
coefs_gen={0,1000,0.001,100,0.1}
Gencurvefit/N=0/Q/X=bval /MINF=My_costfunction Double_exp,Sig,coefs_gen,"00000",limits_gen
END

Function My_costfunction(coefs, y_obs, y_calc, s_obs)
Wave coefs, y_obs, y_calc, s_obs

duplicate/free y_obs, diff
multithread diff=((diff-y_calc)/s_obs)^2
return sum(diff)
END


This does not work, as you can see in the attached example. You said that I don't need to pass the essential waves to My_costfunction since the XOP supplies them automatically. But something is still not correct here. Any help would be appreciated.

Boogie
Boogie,
I did find a bug. It's now fixed. The user defined costfunction only worked for all-at-once fit functions. This is how I normally write fit functions. It should now work for your experiment.

A.
Hi Andy,
It works - splendid. Thank you very much for all your efforts, especially the always enormously quick response times. I will read up on the all-at-once function implementation. I read about it in some posts but never understood what they meant :-(
Cheers to Down Under - Boogie.
BM,
thank you for persisting with your questions, I wouldn't have paid as much attention otherwise.

A.

Hi.  I found your XMLUtils very helpful in a project.  But I forgot that there was a reason to avoid Catalina.  Now when I start Igor, I get a warning that XMLUtils isn't allowed.  Do you have any plans to get it compatible (i.e. notarized) ?  I'm self-conscious about asking, because my own XOPs are probably all still 32 bit.

Yes, it's a lot of effort. If you compile yourself you may find that they work without notarisation, but I would suggest that you follow Howard's suggestion first.

Well yes, notarization is initially some work. But once you've set it up, it's just another step in the build process. I just did that for our JSON XOP.