# Removing row of matrix that contains specific value in first column

arnold.downey

Hello. I'm stuck on an issue that seems trivial, but I must be missing some simple mistake. I want to write a function that takes a 2D wave and a specific value as input and deletes all rows of the 2D wave that contain this value in the first column.

For example take the test matrix below:

I want to remove the rows containing 8.129 in the first column, so rows 2 and 4. So I have written a function:

    #pragma TextEncoding = "UTF-8"
    #pragma rtGlobals=3
    //(there are other functions in the procedure that use these specs, including here in case it makes a difference)

    Function RemoveRowsContainingValue(Data_Matrix, Val)
        Wave Data_Matrix
        Variable Val
        Variable i

        For(i=0; i<(DimSize(Data_Matrix, 0)); i+=1)
            If(Data_Matrix[i][0] == Val)
                DeletePoints/M=0 i, 1, Data_Matrix
            EndIf
        EndFor
    End

When I run RemoveRowsContainingValue(PM_Data_Test_Matrix, 8.129) though, it does not remove the rows. What is it I'm missing here?

Thanks,

Hi,

It could be an issue with precision. The value shown in the table may be truncated and so not exactly equal to the value you pass in. That said, I just tested your function on a table where I entered exactly 8.129, and it worked as intended.

Also, I would work backwards through your matrix, because once you delete a row, the indices of all later rows shift down by one. As an example, I put 8.129 into the first two rows and only the first row was deleted. The first time through the loop, i = 0, so the match is caught and the row is deleted. The second 8.129 is now in row 0 because the initial one was deleted, but i has moved on to 1, so the loop never looks at row 0, which was formerly row 1 before the first deletion.

Andy

December 8, 2022 at 10:12 am - Permalink

You are comparing two floating point numbers for equality.

Don't do that; use a comparison within a small delta.

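    Function RemoveRowsContainingValue(Data_Matrix, Val)
        Wave Data_Matrix
        Variable Val
        Variable i
        Variable Epsilon = abs(Val)*1e-5    // relative tolerance; abs() keeps it positive for negative Val

        // Iterate from the last row up so that deleting a row does not shift rows still to be checked
        For(i=DimSize(Data_Matrix, 0)-1; i>=0; i-=1)
            //If(Data_Matrix[i][0] == Val)
            If( abs(Data_Matrix[i][0] - Val) < Epsilon)
                DeletePoints/M=0 i, 1, Data_Matrix
            EndIf
        EndFor
    End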
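
To see why the exact comparison can fail even when the table displays 8.129, here is a minimal sketch. It assumes the matrix is single precision (the default for Make without /D), which the thread does not confirm. DemoFloatCompare and the wave names are made up for illustration.

```igor
Function DemoFloatCompare()
	Variable val = 8.129                    // variables in functions are double precision

	Make/FREE/N=1 wSingle = 8.129           // assumption: single precision, the Make default
	Make/FREE/D/N=1 wDouble = 8.129         // /D makes a double-precision wave

	// 8.129 has no exact binary representation, so the single-precision
	// value is rounded differently from the double-precision one.
	Print wSingle[0] == val                         // expected to print 0
	Print wDouble[0] == val                         // expected to print 1
	Print abs(wSingle[0] - val) < abs(val)*1e-5     // expected to print 1
End
```

If the first Print shows 0 for your data, the tolerance test is needed; alternatively, creating the wave with /D would make the stored value match a double-precision argument typed with the same digits.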

December 8, 2022 at 10:18 am - Permalink

In reply to Hi, It could be an issue… by hegedus

I did already verify that the values in the matrix weren't just being truncated, so this wasn't the issue. But thank you for the suggestion to work backwards, this is smart for removing rows in a loop, so I have implemented this.

December 8, 2022 at 10:34 am - Permalink

In reply to You are comparing two… by JimProuty

Thank you for this, JimProuty. I didn't realize this kind of comparison is problematic for floating points, so I implemented your approach and the function now works as expected!

December 8, 2022 at 10:36 am - Permalink

This function will simply set the row to NaN.

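    // The Function line is reconstructed; SetMatchingRowsToNaN is a placeholder name.
    Function SetMatchingRowsToNaN(Data_Matrix, value, [epsilon])
        wave Data_Matrix
        variable value, epsilon
        variable ic, tmp

        if (ParamIsDefault(epsilon))
            epsilon = 0.1    // default relative tolerance (10% of value)
        endif

        for(ic=0; ic<(DimSize(Data_Matrix, 0)); ic+=1)
            tmp = abs(Data_Matrix[ic][0] - value)
            // in an assignment to row ic, p equals ic and q runs over the columns
            Data_Matrix[ic][] = (tmp < epsilon*abs(value)) ? NaN : Data_Matrix[p][q]
        endfor

        return 0
    end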

I had hoped to write this in a collapsed notation that avoids the for-endfor loop. I could also imagine a ZapNaNs option that is multi-dimension aware and would then remove the offending NaN rows.
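
One possible loop-free form is sketched below; it is an illustration, not a tested replacement for the function above. The function name and the absolute tolerance parameter tol are assumptions. The first column is copied before the assignment because the assignment works in place: once column 0 of a matching row has been overwritten with NaN, a test against Data_Matrix[p][0] would no longer match when the remaining columns are processed.

```igor
Function SetMatchingRowsToNaN_NoLoop(Data_Matrix, value, tol)
	Wave Data_Matrix
	Variable value      // value to look for in the first column
	Variable tol        // absolute tolerance (hypothetical parameter)

	// Keep an untouched copy of column 0 to test against
	MatrixOP/FREE col0 = col(Data_Matrix, 0)

	// Single wave assignment over all rows (p) and columns (q)
	Data_Matrix = (abs(col0[p] - value) < tol) ? NaN : Data_Matrix[p][q]
End
```

A call such as SetMatchingRowsToNaN_NoLoop(PM_Data_Test_Matrix, 8.129, 1e-4) would then blank every row whose first column lies within 1e-4 of 8.129.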

December 8, 2022 at 04:47 pm - Permalink

With a for loop it's better to iterate backwards through the rows:

    For(i=DimSize(Data_Matrix, 0)-1; i>=0; i-=1)
        // possibly delete ith row
    EndFor

otherwise you will skip a row when i increments. An additional advantage is that DimSize is calculated only once.

EDIT: I should read all the comments before posting :)

December 9, 2022 at 12:15 am - Permalink

Here is a slightly different approach to the main task of eliminating rows (or columns) according to some criterion. The criterion needs to be converted into an index wave with 1 for "keep" or 0 for "delete". This may be more efficient for very large 2D waves than using DeletePoints (not tested though).

Requires IP9; otherwise MatrixOP zapNaNs would need to be replaced by WaveTransform.

    // w2d is an n x m matrix
    // idx is either 1D with numPnts(idx) = n or 2D with 1 x m columns
    // idx has values of either 1 or 0, if 0 at p or q, rows or cols of w2d will be eliminated
    // if dl is specified as non-zero integer DimLabels are preserved
        int nRows = DimSize(w2d,0)
        int nCols = DimSize(w2d,1)
        int dim = DimSize(idx,1) != 0 ? 1 : 0
        int nPoints = DimSize(idx,dim)
        int nKeeps = sum(idx)
        int i

        Duplicate/FREE idx temp

        if(dim == 0)
            // eliminate rows
            if(nRows != nPoints)
                print "Incompatible dimensions"
                return 0
            endif
            MultiThread temp = temp[p] == 1 ? p : NaN
            MatrixOP/FREE temp = zapNans(temp)
            Make/FREE/N=(nKeeps, nCols) out
            MultiThread Out = w2d[temp[p]][q]
            if(!paramIsDefault(dl))
                CopyDimlabels/Cols=1 w2d, out
                for(i=0; i<nKeeps; i++)
                    SetDimlabel 0, i, $GetDimLabel(w2d, 0, temp[i]), out
                endfor
            endif
        else
            // eliminate columns
            if(nCols != nPoints)
                print "Incompatible dimensions"
                return 0
            endif
            MultiThread temp = temp[0][q] == 1 ? q : NaN
            MatrixOP/FREE temp = zapNans(temp)^t
            Make/FREE/N=(nRows, nKeeps) out
            MultiThread Out = w2d[p][temp[q]]
            if(!paramIsDefault(dl))
                CopyDimlabels/Rows=0 w2d, out
                for(i=0; i<nKeeps; i++)
                    SetDimlabel 1, i, $GetDimLabel(w2d, 1, temp[i]), out
                endfor
            endif
        endif

        Duplicate/O out, w2D
    end

December 9, 2022 at 02:48 am - Permalink

First, I'd like to assume that you know how to identify the rows that you want to eliminate. Next, it is useful to remember that it is more efficient to eliminate columns than rows so:

1. Transpose your input matrix (say inMat). The dimensions of the new matrix inMat^t are Nr by Nc.

2. Create a 1D wave w1d of Nc points set to 1 for cols that you want to keep and NaN for cols that you want to delete.

3. Execute MatrixOP scaleCols() to set the cols to be deleted to NaN

4. Execute MatrixOP zapNaNs() to remove the NaNs

5. Redimension to the new rows and cols

6. Transpose the final matrix.

All this can be done in one line of code:

`MatrixOP/O newMat=Redimension(zapNaNs(scaleCols((inMat^t),w1d)),numCols(inMat),sum(zapNaNs(w1d)))^t`
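
As a sketch of how the one-liner could be applied to the original question, the function below builds w1d from the first-column test and then runs the MatrixOP line unchanged. The function name and the absolute tolerance tol are assumptions made for illustration. It also assumes inMat contains no NaNs of its own, since zapNaNs would remove those as well and the Redimension would then no longer fit.

```igor
Function RemoveRowsViaMatrixOP(inMat, value, tol)
	Wave inMat
	Variable value      // value to look for in the first column
	Variable tol        // absolute tolerance (hypothetical parameter)

	// One point per row of inMat, i.e. per column of inMat^t:
	// 1 = keep, NaN = delete
	Make/FREE/D/N=(DimSize(inMat, 0)) w1d
	w1d = (abs(inMat[p][0] - value) < tol) ? NaN : 1

	// The one-liner from the post; the result is written to newMat
	// in the current data folder
	MatrixOP/O newMat = Redimension(zapNaNs(scaleCols((inMat^t), w1d)), numCols(inMat), sum(zapNaNs(w1d)))^t
End
```

Because the result goes to a separate wave, the input matrix is left untouched and can be compared against newMat afterwards.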

December 9, 2022 at 05:35 pm - Permalink

Hello A.G.,

thanks for the one-liner, I need to remove cols/rows quite often and your version is about 30-40% faster!

And I realised that I can use a 1-row wave where I assumed I need to provide a variable, here e.g. sum(w) in Redimension.

Still much to learn about MatrixOP!

December 13, 2022 at 11:33 pm - Permalink

Hello CharLie,

In IP10 you will have a MatrixOP removeCol() function.

I have not decided if it is worth the time to implement removeRow() or use the transpose operator.

A.G.

December 14, 2022 at 05:27 pm - Permalink