Independent component analysis

Hello,

i would like to have Indpendent Component Analysis in the next version of Igor. I know the "simpleICA" procedure by AG - it works quite well for smaller datasets but it would be nice to have a faster alternative which knows some additional parameters (e.g. choice of contrast function or maybe even a choice for different algorithms like fastICA or Infomax etc). AG informed me that ICA is planned for IP7 but i wanted to state that there is really at least one user who would like this feature :-)
Currently i am trying temporal ICA on highly overdetermined datasets (6400 detectors and approx. 2-50 sources, the length of the recordings is 1000 to 3000 datapoints) and i would love to see the results earlier. If it matters: I need the independent components, the mixing matrix (often called A) and the unmixing matrix (W).

Thank you,

Klaus
Hello Klaus,

ICA has already been implemented in IP7 using the fastICA algorithm.

A.G.
WaveMetrics, Inc.
I feel compelled to point out that IP7 is Igor Pro 7, which is still under development (that is, not released yet).

--Jim Prouty
Software Engineer, WaveMetrics, Inc.
Given the time that has passed, I would like to echo Klaus's point. SimpleICA is nice, but if you need to run ICA on large datasets, it is just not an option at all.

Given that IP7 still appears to be over a year away, is there any hope of an XOP or something along those lines becoming available ahead of that?

Cheers, Chris
Hi Chris,

its nice to see this topic revived after such a long time. This motivates me to report the "solution" that I have been using since my last post. First I checked (or persuaded colleagues to check) the alternatives in Phyton, R and Matlab. We may not have always tested the most efficient implementations but it turned out that none of these is really fast: For my datasets (approx. 6400 x 2000) it took always more than an hour on my vintage laptop. Compared to that, the simpleICA procedure from AG was not so bad (approx. 90min). When I tried to find out what takes such a long time, I found that it was the PCA / SVD that is carried out before the actual ICA. While I was searching for a way to speed up this step I stumbled accross the irlba Package for R ("implicitly restarted Lanczos bidiagonalization", http://illposed.net/irlba.html). This is doing a partial SVD on large matrices very fast. There are also other algorithms for the computation of partial SVD or partial PCA but this helped already so much that I did not try anything else.
So here is my "mixed" solution in Igor and R: I load and precondition my data in Igor (some filtering, replacing NaNs and constant data traces). Igor then writes the data to disk as an .ibw file and calls R to read it (with the help of the IgoR package: http://cran.r-project.org/web/packages/IgorR/). R computes a partial SVD, writes the data back to disk (also as .ibw), Igor reads it and computes the ICA with a slightly modified version of the simpleICA procedure. This is all automatic and takes less than two minutes whith a typical dataset! As my programming skills are limited there is still potential for improvement. But the gain in speed is already large enough for me.
Maybe this helps someone (or motivates someone to write a partial SVD procedure for Igor :-)

Cheers,

Klaus
Klaus wrote:

Maybe this helps someone (or motivates someone to write a partial SVD procedure for Igor :-)
Klaus


The simpleICA procedure and the built-in ICA are not optimized for large sparse matrices. Indeed one might expect O(N^3) in computing the associated PCA so shortcuts may be useful when appropriate. We will be looking at implementing built-in support for partial PCA. In the meantime I recommend using the various flags for MatrixSVD that limit the calculation to the smallest input dimension.

A.G.
WaveMetrics, Inc.
For all those interested in partial SVD here is a procedure that illustrates one method for calculating a limited number of singular values.
#pragma rtGlobals=3     // Use modern global access method and strict wave access.

// The following function implements the Power method for partial SVD as described
// in the Appendix of J.C. Nash and S.Shlien "Simple Algorithms for the Partial Singular Value
// Decomposition", The Comp. J. (30) No. 3 1987.
// Inputs:
// matA -- a real-valued matrix of dimensions (rows,cols)
// numRequestedSingularValues -- positive number that is smaller than Min(rows,cols)
// iterations -- the maximum number of iterations for calculating one singular value.
// Outputs:
// sValues: real wave containing numRequestedSingularValues points ordered from largest to
//          smallest singular values.
// uMat:    the partial matrix U
// vMat     the partial matrix V
// The complete SVD is in the form matA=uMat x sMat x vMat^t
// where sMat is a diagonal matrix containing the entries in the wave sValues.

Constant kDelta=1e-6
constant kEps=1e-14
constant kTol=1e-15

Function partialSVD(matA,numRequestedSingularValues,iterations)
    Wave matA
    Variable numRequestedSingularValues,iterations
   
    Variable rows=DimSize(matA,0)
    Variable cols=DimSize(matA,1)
   
    Make/O/N=(numRequestedSingularValues) sValues=nan
    Make/O/N=(rows,numRequestedSingularValues) uMat=nan
    Make/O/N=(cols,numRequestedSingularValues) vMat=nan

    Make/D/Free/N=(cols) pVec,qVec,mv,rVec
    Duplicate/O/FREE matA,matAloc
   
    Variable iter
    Variable values,singularValue
    for(values=0;values<numRequestedSingularValues;values+=1)
        if(values>0 && mv[0]>10*kEps)
            MatrixOP/O/FREE pVec=pVec-rVec
        else
            pVec=0.5*(1+enoise(1))
        endif
        MatrixOP/O pVec=Normalize(pVec)
        for(iter=0;iter<iterations;iter+=1)
            MatrixOP/FREE qVec=Normalize(matAloc x pVec)
            MatrixOP/FREE rVec=(matAloc^t) x qVec
            MatrixOP/Free sVec=sqrt(rVec^t x rVec)
            MatrixOP/FREE rVec=Normalize(rVec)
            MatrixOP/FREE mv=MaxVal(abs(rVec-pVec))
            if(mv[0]<kDelta)
                break
            else
                Duplicate/O/FREE rVec,pVec          // use for next iteration
            endif
        endfor
       
        if(iter==iterations)
            Print "Failed to converge"
        endif
        singularValue=sVec[0]
        sValues[values]=singularValue
        if( (singularValue/(sValues[0]+kTol))<kTol)
            Print "Null matrix?"
            break
        endif
       
        uMat[][values]=qVec[p]
        vMat[][values]=rVec[p]
        // deflation:
        MatrixOP/O/FREE matAloc=matAloc-singularValue*(qVec x rVec^t)
    endfor
End