Visualizing a correlation matrix using color

Hello, 

I am currently using "scatter plot matrix.ipf" to visualise the correlation between a large number of parameters (4 x 8). I would like to present this data so that each plot is replaced by its r^2 value (or a color representing its r^2 value), rather than by the scatter plot itself. I have searched the IGOR help files but have not come across code for this. 

Is there a prepared .ipf available for this in IGOR?

Thank you all in advance (again!)

 

As for the color representation, the NewImage operation (e.g., "NewImage <correlation matrix wave name>") would work. Then go to the menu Image -> Modify Image Appearance to adjust or change the mapping from matrix values to colors.

Thank you for the quick reply.

Is there a way to generate this correlation matrix from the scatter plot matrix.ipf? With the scatter plot matrix it is possible to perform a fit on each pair of variables, but the fits appear only on the plot itself and do not appear to be saved in the data browser. 

 

I see. I'm not sure that the ipf calculates the r^2 value.

The built-in function StatsCorrelation will compute Pearson r values, and it can generate a matrix of them. One (severe) downside is that it cannot handle NaN values, so it will not work on the examples in the package without preprocessing.

In the case of NaNs, it's probably best to loop through all the pairs of waves with custom-written code, removing the rows where either input wave has a NaN, calling StatsCorrelation on each cleaned pair, and filling in the matrix yourself. Perhaps there is another package out there that does this.
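
The loop described above could look roughly like the following. This is only a sketch, not code from the Scatter Plot Matrix package: the function name PairwisePearson and the output wave name M_PearsonNaN are made up for illustration, and it assumes the data sits in the columns of one 2D wave. For each pair of columns it keeps only the rows where both values are normal numbers, then calls StatsCorrelation on the cleaned pair.

```igor
Function PairwisePearson(data)
	Wave data	// 2D wave, one parameter per column; may contain NaN

	Variable nRows = DimSize(data, 0)
	Variable nCols = DimSize(data, 1)

	// Hypothetical output name; cells stay NaN if too few valid rows remain
	Make/O/D/N=(nCols, nCols) M_PearsonNaN = NaN

	Variable i, j, k, n
	for (i = 0; i < nCols; i += 1)
		for (j = i; j < nCols; j += 1)
			Make/D/FREE/N=(nRows) xTmp
			Make/D/FREE/N=(nRows) yTmp
			n = 0
			for (k = 0; k < nRows; k += 1)
				// numtype returns 0 only for normal numbers (not NaN or Inf)
				if (numtype(data[k][i]) == 0 && numtype(data[k][j]) == 0)
					xTmp[n] = data[k][i]
					yTmp[n] = data[k][j]
					n += 1
				endif
			endfor
			if (n > 2)
				Redimension/N=(n) xTmp, yTmp
				M_PearsonNaN[i][j] = StatsCorrelation(xTmp, yTmp)
				M_PearsonNaN[j][i] = M_PearsonNaN[i][j]	// matrix is symmetric
			endif
		endfor
	endfor
End
```

Calling PairwisePearson(yourDataWave) and then NewImage M_PearsonNaN should give the same kind of image as the NaN-free route. Note that each cell may then be based on a different number of rows.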

Thanks for getting back to me.

 

I don't see (from the IGOR help information) how the Stats Correlation function can generate a matrix of the correlation values. 

I have generated a matrix (with all NaNs removed, attached) to see if there is a way in IGOR to generate this correlation matrix.

I also tried to use matrixop/o R2= Correlate(dataforcorrelationmatrix,2,4)

I am not quite sure how to correctly apply this function. From the IGOR help file I get "correlate(w1,w2,opt)"

The result is a second matrix (R2) with the same dimensions as dataforcorrelationmatrix but with no information in it. 

Thanks in advance for your suggestions

 

Data4correlation.txt


Here's an example of how to generate a matrix of r values with StatsCorrelation. See the documentation for StatsCorrelation for more details about the matrix that it creates. 

Make/O/N=(50,5) ex
ex[][0] = p                   // reference column
ex[][1] = -p                  // anticorrelated with column 0
ex[][2] = floor(p/10)*10      // somewhat correlated
ex[][3] = floor(p/3)*3        // somewhat more correlated
ex[][4] = -floor(p/10)*10     // somewhat anticorrelated
Display/K=1 ex[][0], ex[][1], ex[][2], ex[][3], ex[][4]
Variable dummy = StatsCorrelation(ex)  // this is a function, but for a single 2D wave input the return value doesn't matter; the r values are stored in M_Pearson
NewImage/K=1 M_Pearson        // image of the matrix of r values from StatsCorrelation
Edit/K=1 M_Pearson.ld         // table of the same values

If you arrange your data into columns in a 2D wave and pass it to statsCorrelation as in this code (and without NaNs), it will put the matrix of r values in the wave M_Pearson. 

The correlate operation you mentioned does not return Pearson r values; it's for cross-correlation of (for example) one time-varying signal with another.

I have to break in here: I think the original request was for r^2, the coefficient of determination that you get from a line fit, not the cross-correlation r.

You might get the raw data for your matrix from statsLinearRegression, but you will have to generate a list of the XY pairs that consists of all possible pairings of the waves involved.

Scatter Plot Matrix does not have an option to generate such a matrix.

I was only suggesting the use of statsCorrelation because I thought r^2, the coefficient of determination, could be computed as the square of the Pearson r provided by statsCorrelation.
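
For a straight-line fit of one variable against another, r^2 is indeed the square of Pearson's r, so a matrix of r^2 values can be derived from M_Pearson directly. A minimal sketch, assuming M_Pearson already exists from the StatsCorrelation call above; the name M_R2 and the choice of the Grays color table are just for illustration:

```igor
Duplicate/O M_Pearson, M_R2          // hypothetical output wave name
M_R2 = M_Pearson[p][q]^2             // element-wise square of each r value
NewImage/K=1 M_R2
ModifyImage M_R2 ctab={0,1,Grays,0}  // map the fixed range 0..1 onto the color table
```

Fixing the color range at 0 to 1 keeps the colors comparable between data sets, since r^2 cannot fall outside that range.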

Hi GSB (John Weeks), 

This code is perfect. The Pearson's r is also great. 

I have one last question. I would like to <appendimage M_pearson vs {textwave, *}> to label each of the parameters in the correlation matrix, rather than just by row/column number. However IGOR is looking for a numerical wave. Is there a way to label the axis with a text wave? 

Thanks again for all the help :)! 


I am not sure that AppendImage can take a text wave. One option is to use the ModifyGraph userticks keyword.

A simple way to get started with the userticks option is to go to the Graph menu and choose Modify Axis. Choose the axis of interest, and then in the drop-down menu choose "User Ticks from Waves" instead of "Auto Ticks". Then, on the right, click the "New From Auto Ticks" button. Click the Do It button, and then edit the value and label waves that are generated so that they show the text you want where you want it. Note that deleting rows will remove those tick marks.

I often like to use this approach along with ModifyGraph tkLblRot (also in the label options tab of the modify axis menu). 
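
The same thing can be done from the command line. The lines below are a sketch under a few assumptions: the image was made with NewImage (which uses the left and top axes), the matrix is 5 x 5 with default wave scaling so that cell centers sit at 0, 1, 2, ..., and the wave names paramNames and tickPos and the label text are placeholders.

```igor
Make/O/T/N=5 paramNames = {"A", "B", "C", "D", "E"}   // placeholder labels
Make/O/N=5 tickPos = p                                 // tick at the center of each cell
ModifyGraph userticks(top)={tickPos, paramNames}
ModifyGraph userticks(left)={tickPos, paramNames}
ModifyGraph tkLblRot(top)=90                           // rotate the top labels so they do not overlap
```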

Hello,

I have an additional request regarding this representation of a correlation matrix as an image. Is it possible to include the numerical value that corresponds to the colour (the Pearson correlation value) in the image? 

Thanking you in advance for your support. 

Best wishes

Thank you for your rapid response. However, I would like to place each value in the color square itself (e.g., superposing the numerical matrix onto the image). I would like to be able to show the exact value, which is difficult to do with the standard colour scale. 

 

One method might be to prepare an XY pair of waves with the positions of the cells (the mod() function and floor() are useful here) and then a third wave with the cell values. Use text markers with the third wave setting the markers.
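
A rough sketch of that text-marker approach follows. The function name LabelCells and the wave names cellX, cellY and cellTxt are invented for this example, and it assumes the matrix is already displayed with NewImage in the top graph window with default wave scaling. Matrix rows run along the horizontal axis of an image, so the row index is used as X and the column index as Y.

```igor
Function LabelCells(m)
	Wave m	// matrix already displayed with NewImage in the top graph

	Variable nR = DimSize(m, 0)
	Variable nC = DimSize(m, 1)

	Make/O/N=(nR*nC) cellX, cellY       // cell center positions
	Make/O/T/N=(nR*nC) cellTxt          // text shown in each cell
	cellX = mod(p, nR)                  // row index, plotted horizontally
	cellY = floor(p / nR)               // column index, plotted vertically

	Variable i
	String s
	for (i = 0; i < nR*nC; i += 1)
		sprintf s, "%.2f", m[cellX[i]][cellY[i]]
		cellTxt[i] = s
	endfor

	// NewImage uses the left and top axes, so append against the same pair
	AppendToGraph/T cellY vs cellX
	ModifyGraph mode(cellY)=3           // markers only
	ModifyGraph textMarker(cellY)={cellTxt, "default", 0, 0, 5, 0, 0}   // centered text
End
```

For example, LabelCells(M_Pearson) run right after NewImage M_Pearson. The "%.2f" format controls how many decimals are shown.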

Another possibility would be to put a small annotation on each cell. You're going to want to write code to do that...