# Maximum matrix sizes for MatrixSVD and PCA

I'm trying to get familiar with the PCA operation, the output and its meaning, ultimately with the goal of applying it to large data sets.
I was wondering if there is a clear restriction on matrix sizes, e.g.

Make/O/N=(1000,1000) M_in=trunc(abs(enoise(100)))
MatrixSVD M_in
PCA/SCMT M_in

would execute, whereas a matrix with the same number of points but n(rows) >> n(cols), such as

Make/O/N=(100000,10) M_in=trunc(abs(enoise(100)))

would produce "out of memory" errors. Are there ways to avoid this (on Mac OS)?
In 32-bit Igor you are clearly running out of memory. You need to consider the dimensions of the internal matrix that is generated from your input to PCA. The two commands
MatrixSVD M_in
PCA/SCMT M_in

are not equivalent because the PCA operation still prepares an input "D" matrix for you unless you specify the /U flag.
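
As a rough illustration of why the shape of the input matters, the sketch below estimates the memory needed by a square double-precision matrix. It assumes that the internal matrix scales with the number of rows squared, which is an inference from the failing example rather than something stated in the documentation quoted here. `SquareMatrixGiB` is a hypothetical helper, not a built-in function.

```igor
// Hypothetical helper (not part of Igor): size in GiB of an n x n
// double-precision matrix, assuming 8 bytes per element.
Function SquareMatrixGiB(n)
    Variable n      // number of rows (and columns) of the square matrix
    return n * n * 8 / 2^30
End
```

Under that assumption, `Print SquareMatrixGiB(1000)` gives roughly 0.0075 GiB, while `Print SquareMatrixGiB(100000)` gives roughly 74.5 GiB, far more than the 4 GiB that a 32-bit address space can hold.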

If you are interested in PCA you might also be interested in IP7 built-in ICA.

A.G.
WaveMetrics, Inc.
Hello A.G.

Igor wrote:
In 32-bit Igor you are clearly running out of memory. You need to consider the dimensions of the internal matrix that is generated from your input to PCA. The two commands
MatrixSVD M_in
PCA/SCMT M_in

are not equivalent because the PCA operation still prepares an input "D" matrix for you unless you specify the /U flag.

Yes, I'm aware of that. Initially I tried PCA "by hand" to better understand Igor's PCA output. This is when I found that some matrices were too large for MatrixSVD. Those sizes also failed with PCA, I guess because it uses the same SVD code.

Igor wrote:

If you are interested in PCA you might also be interested in IP7 built-in ICA.

I am VERY interested in IP7 ;-)

Do you have any recommendation if my matrices for PCA would be as large as above?

EDIT:
it seems that the /LEIV flag avoids out-of-memory issues, although I'm not sure (yet) about the consequences...
ChrLie wrote:

I am VERY interested in IP7 ;-)

In that case I recommend that you contact support for information about the technology preview program.
Quote:
Do you have any recommendation if my matrices for PCA would be as large as above?

The problem, as you know, is that the covariance matrix gets to be too large for a 32-bit address space. In some situations you do not need to work with the covariance matrix. Use MatrixOP to condition your input and then use /U.
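
A minimal sketch of that suggestion might look like the lines below. The choice of conditioning (subtracting each column's mean with MatrixOP's `subtractMean`) and the wave name `M_cond` are assumptions for illustration only. The appropriate conditioning depends on your data, and the PCA help should be checked for exactly what /U expects as input.

```igor
// Sketch only: column-mean subtraction is an assumed choice of conditioning.
Make/O/N=(100000,10) M_in = trunc(abs(enoise(100)))
MatrixOP/O M_cond = subtractMean(M_in, 1)    // option 1: subtract each column's mean
PCA/U/SCMT M_cond                            // /U: use the conditioned input as supplied
```

Whether this avoids the out-of-memory error for a given matrix shape is something to verify on your own data.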

Quote:
EDIT:
it seems that the /LEIV flag avoids out-of-memory issues, although I'm not sure (yet) about the consequences...

This flag helps, but only AFTER the operation creates the covariance matrix. In most situations there are fewer components than there are rows in your data, so there is no point in spending time computing eigenvalues that correspond to noise. This is where it makes sense to limit the range of eigenvalues with the /LEIV flag. This reminds me that in IP7 you have a new option to compute a partial SVD.

A.G.
WaveMetrics, Inc.