SSA in IGOR

Hello,

I'm trying to find a way to perform singular spectrum analysis in IGOR. I have a large number (thousands) of time series to analyze, so speed is pretty important. The SSA code that I've written seems to work on some example cases, but it's significantly slowing down my analysis when I try to apply it for analyzing many time series. I was wondering if there is a built-in function for SSA or if anyone has ideas about how I can make this code run faster?


function ssa(timeseries,timeseries_rc,wlength,nrc)
wave timeseries,timeseries_rc // input: timeseries; output: reconstructed timeseries_rc
variable wlength // specific window length for embedding dimension
variable nrc // number of components to use in reconstruction
variable N, mm
mm=0
N=numpnts(timeseries)

MatrixOP/o matY=zeroMat(N-wlength+1,wlength,4)
make/o/n=(N-wlength+1) currwindow

do
currwindow =timeseries[p+mm]
matY[][mm]=currwindow[p]
mm+=1
while(mm<(wlength))

// calculate covariance matrix (trajectory approach)
MatrixOp/o Cemb=(matY^t)x(matY)/(N-wlength+1)

// Calculate eigenvalues and eigenvectors
MatrixEigenV/o/Sym/Evec Cemb
//MatrixSchur Cemb
Wave W_eigenvalues,M_eigenvectors

//Calculate principal components PC
MatrixOp/o pc=(matY)x(M_eigenvectors)

// Calculate reconstructed components RC
timeseries_rc=0.0
MatrixOP/o RC=zeroMat(N,wlength,4)
variable nn=1
string rcname
mm=0
make/o/n=(N) rcvals
do
MatrixOp/o pccol=col(PC,mm)
MatrixOp/o rhocol=col(M_eigenvectors,mm)
MatrixOp/o buf=(pccol)x(rhocol^t)

MatrixOp/o bufr=reverseRows(buf)
nn=1
do
MatrixOp/o bufdiag=getDiag(bufr,-(N-wlength+1)+nn)
RC[nn-1][mm]=mean(bufdiag)
nn+=1
while(nn rcname="rc"+num2str(mm)
rcvals=RC[p][mm]
duplicate/o rcvals,$rcname
if (mm>(wlength-nrc-1))
timeseries_rc+=rcvals
endif

mm+=1
while(mm

end
I don't know anything about singular spectrum analysis, but I might still be able to give you a few hints to speed up your code.

First of all check each section of your code to figure out what is slowing you down. You can do this using StartMSTimer/StopMSTimer():
Variable t=StartMSTimer
(code subsection)
Print(StopMSTimer(t))


My guess is that your first while loop is very slow:
do
    currwindow =timeseries[p+mm]
    matY[][mm]=currwindow[p]   
    mm+=1
while(mm<(wlength))


To me it looks like your are combining 1D subsections of timeseries to generate a 2D matrix matY. You could probably speed this up using Duplicate and Imagetransform putCol. Something along the lines of:
do
    Duplicate/O/R=[mm, mm+wlength] timeseries, currwindow
        ImageTransform /D=currwindow/G=(mm) putCol matY                 //  My syntax might be slightly wrong, but it's something like that
    mm+=1
while(mm<(wlength))


Using wave assignments with p, q and r are usually very slow. Whenever you have the option to use MatrixOP, FastOP or Imagetransform do that instead

Another option would be to use MatrixOP WaveMap and skip the while loop alltogether. Especially if you can use the same map for multiple time series this could be quite fast. You should also look into MultiThread MyWave[]=xxxx and ThreadSafe functions
You can multithread your function like shown below, but make sure you are using free waves or waves with unique names inside your function or it will not be threadsafe.

ThreadSafe Function MyFunction(InputWave)
Wave InputWave
....
End

Function RunAll()
   
        //  Counts the number of waves in the folder
        DFREF Folder=xxxx
    Variable n=CountObjectsDFR(Folder, 1)

    //  Creates a list of all waves in the folder
    Make/FREE/O/WAVE/N=(n) AllWaves=WaveRefIndexedDFR(Folder, p)

       //      Runs MyFunction for each wave in the folder. Using a dummy wave like this is often easier than creating a threadgroup
       Make/FREE/O/N=(n) DummyWave
       MultiThread DummyWave[]=MyFunction(AllWaves[p])
End
I've found that the Igor function profiler is pretty useful for finding bottlenecks in the code, especially for longer functions. If you add
#include <FunctionProfiling>
to the procedure window and then open the FunctionProfiling ipf, there are instructions for running the profiler. There's more info here: http://www.igorexchange.com/project/FuncProfiling
I managed to speed up the second do-while loop in the code from ~0.15 s to 0.07 s (which seemed to be the slowest part) by replacing one of the do while loops with a MultiThread function, but I'm now I'm getting an error "While executing MatrixOP, the following error occurred: Bad MatrixOPs token" even though the code seems to be working. Is there any reason why I can't use a MatrixOp call within the Threadsafe function?

function ssa_multi(timeseries,timeseries_rc,wlength,nrc)
wave timeseries,timeseries_rc   // input: timeseries; output: reconstructed timeseries_rc
variable wlength    // specific window length for embedding dimension
variable nrc        // number of components to use in reconstruction
variable N
variable mm
mm=0
N=numpnts(timeseries)

MatrixOP/o matY=zeroMat(N-wlength+1,wlength,4)
make/o/n=(N-wlength+1) currwindow

do
    currwindow =timeseries[p+mm]
    matY[][mm]=currwindow[p]   
    mm+=1
while(mm<(wlength))

// calculate covariance matrix (trajectory approach)
MatrixOp/o Cemb=(matY^t)x(matY)/(N-wlength+1)

// Calculate eigenvalues and eigenvectors
MatrixEigenV/o/Sym/Evec Cemb
//MatrixSchur Cemb
Wave W_eigenvalues,M_eigenvectors

//Calculate principal components PC
MatrixOp/o pc=(matY)x(M_eigenvectors)

// Calculate reconstructed components RC
timeseries_rc=0.0
mm=0
make/o/n=(N) rcvals
variable ts=startmstimer

do
   
    MatrixOp/o/FREE bufr=reverseRows((col(PC,mm))x(col(M_eigenvectors,mm)^t))
    Make/o/FREE/n=(N) bufd,axis=p
       
    MultiThread bufd=reconstruct_multi(bufr,N,wlength,axis[p])
    rcvals=bufd[N-p]
    if (mm>(wlength-nrc-1))
        timeseries_rc+=rcvals
    endif

    mm+=1
while(mm<wlength)
print(stopmstimer(ts))
end
threadsafe function reconstruct_multi(bufr,N,wlength,nn)
// speed up the time series reconstruction
    wave bufr
    variable N,wlength,nn
    MatrixOp/o bufdiag=getDiag(bufr,-(N-wlength+1)+nn)
    variable meandiag
    meandiag=mean(bufdiag)
   
    return meandiag
end
<pre><code class="language-igor">
You have to make bufdiag a free wave: MatrixOp/o/FREE bufdiag=getDiag(bufr,-(N-wlength+1)+nn). Otherwise the parallel instances of reconstruct_multi will all work on the same bufdiag wave and overwrite each other.

You might consider multithreading your parent function ssa_multi instead. That should be more effective than only multithreading a subsection of it.