Removing row of matrix that contains specific value in first column

Hello. I'm stuck on an issue that seems trivial, so I must be overlooking some simple mistake. I want to write a function that takes a 2D wave and a specific value as input and deletes all rows of the 2D wave that contain this value in the first column.

For example take the test matrix below:

I want to remove the rows containing 8.129 in the first column, so rows 2 and 4. So I have written a function:

#pragma TextEncoding = "UTF-8"
#pragma rtGlobals=3  
//(there are other functions in the procedure that use these specs, including here in case it makes a difference)

Function RemoveRowsContainingValue(Data_Matrix, Val)

    Wave Data_Matrix
    Variable Val
   
    Variable i
   
    For(i=0; i<(DimSize(Data_Matrix, 0)); i+=1)
        If(Data_Matrix[i][0] == Val)
            DeletePoints/M=0 i, 1, Data_Matrix
        EndIf
    EndFor

End

When I run RemoveRowsContainingValue(PM_Data_Test_Matrix, 8.129) though, it does not remove the rows. What is it I'm missing here? 

Thanks,

Hi,

It could be an issue with precision. The value shown in the table may be truncated and not be precisely equal. That said, I just tested your function with a value of exactly 8.129 entered in the table, and it worked as intended.

Also, I would work backwards through your matrix, because once you delete a row, the indices of all later rows shift down by one. As an example, I put 8.129 into the first two rows and only the first row was deleted. The first time through the loop, i = 0, the test matches, and row 0 is deleted. The second 8.129 is now in row 0 (it was formerly row 1), but i has moved on to 1, so that row is never examined.

Andy

You are comparing two floating point numbers for equality.

Don't do that; use a comparison within a small delta.

Function RemoveRowsContainingValue(Data_Matrix, Val)

    Wave Data_Matrix
    Variable Val
   
    Variable i
    Variable Epsilon = abs(Val)*1e-5    // relative tolerance; abs() keeps it positive when Val is negative
   
    For(i=0; i<(DimSize(Data_Matrix, 0)); i+=1)
        //If(Data_Matrix[i][0] == Val)
        If( abs(Data_Matrix[i][0] - Val) < Epsilon)
            DeletePoints/M=0 i, 1, Data_Matrix
        EndIf
    EndFor

End
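
One way an exact comparison can fail even though the table shows 8.129: Make creates single-precision waves unless /D is given, while a Variable is always double precision. The sketch below illustrates the effect. Whether this applies to the wave in the original question is an assumption, and DemoPrecisionMismatch is a hypothetical name used only for illustration.

```igor
Function DemoPrecisionMismatch()
    // Make defaults to single precision (32-bit float); /D requests double precision
    Make/FREE/N=1 wSP = 8.129
    Make/FREE/D/N=1 wDP = 8.129

    Variable val = 8.129        // Variables are always double precision

    // 8.129 has no exact binary representation, so the SP and DP roundings differ
    Print wSP[0] == val         // prints 0
    Print wDP[0] == val         // prints 1

    // A relative tolerance absorbs the single-precision rounding error
    Print abs(wSP[0] - val) < abs(val)*1e-5    // prints 1
End
```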

 

In reply to hegedus:

I did already verify that the values in the matrix weren't just being truncated, so this wasn't the issue. But thank you for the suggestion to work backwards, this is smart for removing rows in a loop, so I have implemented this. 

This function will simply set the row to NaN.

Function ZapRowsContainingValue(Data_Matrix, value, [epsilon])
    wave Data_Matrix
    variable value, epsilon
   
    variable ic, tmp
   
    if (ParamIsDefault(epsilon))
        epsilon = 0.1
    endif
   
    for(ic=0; ic<(DimSize(Data_Matrix, 0)); ic+=1)
        tmp = abs(Data_Matrix[ic][0] - value)
        Data_Matrix[ic][] = (tmp < epsilon*abs(value)) ? NaN : Data_Matrix[p][q]    // abs() so a negative value still matches
    endfor
   
    return 0
end

I had hoped to write this in a collapsed notation that avoids the for-endfor loop. I could also imagine a ZapNaNs option that is multi-dimension aware and would then remove the offending NaN rows.
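
A loop-free variant seems possible if the matching rows are flagged before anything is overwritten. The sketch below is one way to do it, not a tested replacement for the function above; ZapRowsNoLoop, hit, and tol are hypothetical names, and tol is an absolute tolerance here. The separate mask is there because an in-place assignment reads the destination wave as it is being modified, so column 0 may already hold NaN by the time the other columns are evaluated.

```igor
Function ZapRowsNoLoop(Data_Matrix, value, tol)
    Wave Data_Matrix
    Variable value, tol     // tol is an absolute tolerance in this sketch

    // Flag matching rows first (1 = match, 0 = no match)
    Make/FREE/N=(DimSize(Data_Matrix, 0)) hit
    hit = abs(Data_Matrix[p][0] - value) < tol

    // Set every column of a flagged row to NaN in a single wave assignment
    Data_Matrix = hit[p] ? NaN : Data_Matrix[p][q]
End
```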

With a for loop it's better to iterate backwards through the rows:

    For(i=DimSize(Data_Matrix, 0)-1; i>=0; i--)    // last valid row index is DimSize-1
        // possibly delete ith row
    EndFor

otherwise the row after each deleted row is skipped, because it moves into index i just before i increments. An additional advantage is that DimSize is evaluated only once.
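
Putting the two suggestions from this thread together (tolerance comparison and backwards iteration), a complete function might look like the sketch below. RemoveRowsNearValue, relTol, and the default of 1e-5 are assumptions chosen for illustration. The comparison uses <= so that Val = 0 still matches rows that are exactly zero.

```igor
Function RemoveRowsNearValue(Data_Matrix, Val, [relTol])
    Wave Data_Matrix
    Variable Val, relTol

    if(ParamIsDefault(relTol))
        relTol = 1e-5                   // assumed default relative tolerance
    endif

    Variable tol = abs(Val)*relTol      // abs() keeps the tolerance non-negative
    Variable i, nRemoved = 0

    // Start at the last valid row index and walk towards row 0,
    // so a deletion never shifts a row that is still to be checked
    for(i = DimSize(Data_Matrix, 0) - 1; i >= 0; i -= 1)
        if(abs(Data_Matrix[i][0] - Val) <= tol)
            DeletePoints/M=0 i, 1, Data_Matrix
            nRemoved += 1
        endif
    endfor

    return nRemoved                     // number of rows deleted
End
```

From the command line this could be called as Print RemoveRowsNearValue(PM_Data_Test_Matrix, 8.129), which prints how many rows were removed.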

EDIT: I should read all the comments before posting :)

Here is a slightly different approach to the main task of eliminating rows (or columns) according to some criterion. The criterion needs to be converted into an index wave with 1 for "keep" and 0 for "delete". For very large 2D waves this may be more efficient than using DeletePoints (not tested, though).

Requires IP9; otherwise MatrixOP zapNaNs would need to be replaced by WaveTransform.

function KillRowsOrCols(wave w2d, wave idx [, int dl])
    // w2d is an n x m matrix
    // idx is either 1D with numPnts(idx) = n or 2D with 1 x m columns
    // idx has values of either 1 or 0, if 0 at p or q, rows or cols of w2d will be eliminated
    // if dl is specified as non-zero integer DimLabels are preserved
    int nRows = DimSize(w2d,0)
    int nCols = DimSize(w2d,1)
    int dim = DimSize(idx,1) != 0 ? 1 : 0
    int nPoints = DimSize(idx,dim)
    int nKeeps = sum(idx)
    int i
    Duplicate/FREE idx temp
   
    if(dim == 0)
        // eliminate rows
        if(nRows != nPoints)
            print "Incompatible dimensions"
            return 0
        endif
       
        MultiThread temp = temp[p] == 1 ? p : NaN
        MatrixOP/FREE temp = zapNans(temp)
        Make/FREE/N=(nKeeps, nCols) out
        MultiThread Out = w2d[temp[p]][q]
       
        if(!paramIsDefault(dl))
            CopyDimlabels/Cols=1 w2d, out
            for(i=0; i<nKeeps; i++)
                SetDimlabel 0, i, $GetDimLabel(w2d, 0, temp[i]), out
            endfor
        endif
       
    else
        // eliminate columns
        if(nCols != nPoints)
            print "Incompatible dimensions"
            return 0
        endif
        MultiThread temp = temp[0][q] == 1 ? q : NaN
        MatrixOP/FREE temp = zapNans(temp)^t
        Make/FREE/N=(nRows, nKeeps) out
        MultiThread Out = w2d[p][temp[q]]  
       
        if(!paramIsDefault(dl))
            CopyDimlabels/Rows=0 w2d, out
            for(i=0; i<nKeeps; i++)
                SetDimlabel 1, i, $GetDimLabel(w2d, 1, temp[i]), out
            endfor
        endif
    endif
    Duplicate/O out, w2D
end
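
For the original question, the index wave could be built from the first column and passed to KillRowsOrCols as in the sketch below. RemoveRowsViaIndexWave, keep, and tol are hypothetical names, and tol is an absolute tolerance.

```igor
Function RemoveRowsViaIndexWave(wave w2d, variable val, variable tol)
    // keep[p] = 0 for rows whose first column is within tol of val, 1 otherwise
    Make/FREE/N=(DimSize(w2d, 0)) keep
    keep = (abs(w2d[p][0] - val) < tol) ? 0 : 1

    // keep is 1D with one point per row, so rows are eliminated
    KillRowsOrCols(w2d, keep)
End
```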

 

First, I'd like to assume that you know how to identify the rows that you want to eliminate.  Next, it is useful to remember that it is more efficient to eliminate columns than rows so:

1. transpose your input matrix (say inMat).  The new matrix inMat^t dimensions are Nr by Nc.

2. Create a 1D wave w1d of Nc points set to 1 for cols that you want to keep and NaN for cols that you want to delete.

3. Execute MatrixOP scaleCols() to set the cols to be deleted to NaN

4. Execute MatrixOP zapNaNs() to remove the NaNs

5. Redimension to the new rows and cols

6. Transpose the final matrix.

All this can be done in one line of code:

MatrixOP/O newMat=Redimension(zapNaNs(scaleCols((inMat^t),w1d)),numCols(inMat),sum(zapNaNs(w1d)))^t
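
Applied to the original question, the steps above might be wrapped as in the following untested sketch. RemoveRowsMatrixOP and tol are hypothetical names; the MatrixOP line is the one-liner above, unchanged. Two things to keep in mind: newMat is created in the current data folder, and because zapNaNs removes every NaN, the approach presumably only works when inMat contains no NaNs of its own.

```igor
Function RemoveRowsMatrixOP(wave inMat, variable val, variable tol)
    // One point per row of inMat (= per column of inMat^t):
    // NaN marks rows to delete, 1 marks rows to keep
    Make/FREE/N=(DimSize(inMat, 0)) w1d
    w1d = (abs(inMat[p][0] - val) < tol) ? NaN : 1

    // Transpose, NaN-out unwanted columns, zap NaNs, restore shape, transpose back
    MatrixOP/O newMat=Redimension(zapNaNs(scaleCols((inMat^t),w1d)),numCols(inMat),sum(zapNaNs(w1d)))^t
End
```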

 

Hello A.G.,

thanks for the one-liner, I need to remove cols/rows quite often and your version is about 30-40% faster!

And I realised that I can use a 1-row wave where I assumed I need to provide a variable, here e.g. sum(w) in Redimension.

Still much to learn about MatrixOP!

Hello CharLie,

In IP10 you will have a MatrixOP removeCol() function.

I have not decided if it is worth the time to implement removeRow() or use the transpose operator.

A.G.