SSA in IGOR

Hello,

I'm trying to find a way to perform singular spectrum analysis in IGOR. I have a large number (thousands) of time series to analyze, so speed is pretty important. The SSA code that I've written seems to work on some example cases, but it's significantly slowing down my analysis when I try to apply it for analyzing many time series. I was wondering if there is a built-in function for SSA or if anyone has ideas about how I can make this code run faster?


function ssa(timeseries,timeseries_rc,wlength,nrc)
wave timeseries,timeseries_rc // input: timeseries; output: reconstructed timeseries_rc
variable wlength // specific window length for embedding dimension
variable nrc // number of components to use in reconstruction
variable N, mm
mm=0
N=numpnts(timeseries)

MatrixOP/o matY=zeroMat(N-wlength+1,wlength,4)
make/o/n=(N-wlength+1) currwindow

do
currwindow =timeseries[p+mm]
matY[][mm]=currwindow[p]
mm+=1
while(mm<(wlength))

// calculate covariance matrix (trajectory approach)
MatrixOp/o Cemb=(matY^t)x(matY)/(N-wlength+1)

// Calculate eigenvalues and eigenvectors
MatrixEigenV/o/Sym/Evec Cemb
//MatrixSchur Cemb
Wave W_eigenvalues,M_eigenvectors

//Calculate principal components PC
MatrixOp/o pc=(matY)x(M_eigenvectors)

// Calculate reconstructed components RC
timeseries_rc=0.0
MatrixOP/o RC=zeroMat(N,wlength,4)
variable nn=1
string rcname
mm=0
make/o/n=(N) rcvals
do
MatrixOp/o pccol=col(PC,mm)
MatrixOp/o rhocol=col(M_eigenvectors,mm)
MatrixOp/o buf=(pccol)x(rhocol^t)

MatrixOp/o bufr=reverseRows(buf)
nn=1
do
MatrixOp/o bufdiag=getDiag(bufr,-(N-wlength+1)+nn)
RC[nn-1][mm]=mean(bufdiag)
nn+=1
while(nn rcname="rc"+num2str(mm)
rcvals=RC[p][mm]
duplicate/o rcvals,$rcname
if (mm>(wlength-nrc-1))
timeseries_rc+=rcvals
endif

mm+=1
while(mm

end
I don't know anything about singular spectrum analysis, but I might still be able to give you a few hints to speed up your code.

First of all check each section of your code to figure out what is slowing you down. You can do this using
StartMSTimer/StopMSTimer()
:

Variable t=StartMSTimer
(code subsection)
Print(StopMSTimer(t))


My guess is that your first while loop is very slow:

do
	currwindow =timeseries[p+mm]
	matY[][mm]=currwindow[p]	
	mm+=1
while(mm<(wlength))


To me it looks like your are combining 1D subsections of timeseries to generate a 2D matrix matY. You could probably speed this up using
Duplicate
and
Imagetransform putCol
. Something along the lines of:

do
	Duplicate/O/R=[mm, mm+wlength] timeseries, currwindow
        ImageTransform /D=currwindow/G=(mm) putCol matY                 //	My syntax might be slightly wrong, but it's something like that
	mm+=1
while(mm<(wlength))


Using wave assignments with p, q and r are usually very slow. Whenever you have the option to use
MatrixOP, FastOP or Imagetransform
do that instead

Another option would be to use
MatrixOP WaveMap
and skip the while loop alltogether. Especially if you can use the same map for multiple time series this could be quite fast. You should also look into
MultiThread MyWave[]=xxxx
and
ThreadSafe
functions
You can multithread your function like shown below, but make sure you are using free waves or waves with unique names inside your function or it will not be threadsafe.


ThreadSafe Function MyFunction(InputWave)
Wave InputWave
....
End

Function RunAll()
    
        //	Counts the number of waves in the folder
        DFREF Folder=xxxx
	Variable n=CountObjectsDFR(Folder, 1)

	//	Creates a list of all waves in the folder
	Make/FREE/O/WAVE/N=(n) AllWaves=WaveRefIndexedDFR(Folder, p)

       //      Runs MyFunction for each wave in the folder. Using a dummy wave like this is often easier than creating a threadgroup
       Make/FREE/O/N=(n) DummyWave
       MultiThread DummyWave[]=MyFunction(AllWaves[p])
End
I've found that the Igor function profiler is pretty useful for finding bottlenecks in the code, especially for longer functions. If you add
#include <FunctionProfiling>

to the procedure window and then open the FunctionProfiling ipf, there are instructions for running the profiler. There's more info here: http://www.igorexchange.com/project/FuncProfiling
I managed to speed up the second do-while loop in the code from ~0.15 s to 0.07 s (which seemed to be the slowest part) by replacing one of the do while loops with a MultiThread function, but I'm now I'm getting an error "While executing MatrixOP, the following error occurred: Bad MatrixOPs token" even though the code seems to be working. Is there any reason why I can't use a MatrixOp call within the Threadsafe function?


function ssa_multi(timeseries,timeseries_rc,wlength,nrc)
wave timeseries,timeseries_rc	// input: timeseries; output: reconstructed timeseries_rc
variable wlength	// specific window length for embedding dimension
variable nrc 		// number of components to use in reconstruction
variable N
variable mm
mm=0
N=numpnts(timeseries)

MatrixOP/o matY=zeroMat(N-wlength+1,wlength,4)
make/o/n=(N-wlength+1) currwindow

do
	currwindow =timeseries[p+mm]
	matY[][mm]=currwindow[p]	
	mm+=1
while(mm<(wlength))

// calculate covariance matrix (trajectory approach)
MatrixOp/o Cemb=(matY^t)x(matY)/(N-wlength+1)

// Calculate eigenvalues and eigenvectors
MatrixEigenV/o/Sym/Evec Cemb
//MatrixSchur Cemb
Wave W_eigenvalues,M_eigenvectors

//Calculate principal components PC
MatrixOp/o pc=(matY)x(M_eigenvectors)

// Calculate reconstructed components RC
timeseries_rc=0.0
mm=0
make/o/n=(N) rcvals
variable ts=startmstimer

do
	
	MatrixOp/o/FREE bufr=reverseRows((col(PC,mm))x(col(M_eigenvectors,mm)^t))
	Make/o/FREE/n=(N) bufd,axis=p
		
	MultiThread bufd=reconstruct_multi(bufr,N,wlength,axis[p])
	rcvals=bufd[N-p]
	if (mm>(wlength-nrc-1))
		timeseries_rc+=rcvals
	endif

	mm+=1
while(mm
You have to make bufdiag a free wave:
MatrixOp/o/FREE bufdiag=getDiag(bufr,-(N-wlength+1)+nn)
. Otherwise the parallel instances of reconstruct_multi will all work on the same bufdiag wave and overwrite each other.

You might consider multithreading your parent function ssa_multi instead. That should be more effective than only multithreading a subsection of it.