MatrixOP implementation questions

Hi everybody,

I'm doing a lenghty simulation at the moment and I'm stuck at the point where I need to improve calculation times. At the moment I'm still doing things like

wave[][] = wave1[p]*wave2[q]


but I'm just on my way changing things to

MatrixOP wave = wave1 x wave2^t
.

And here is my first problem: For calculations like above this is easy to understand, but I haven't found any way to implement something like

wave[][] = wave1[p]*wave2[q-p+offset]
wave[][] = (((p-q) > offset) | ((q-p)>offset)) ? 0 : wave[p][q]


I guess that I have to use

MatrixOP wavemap


but unfortunately I just don't get how to use it.

I would really appreciate some hints in solving the above problems. Additionally I need to convert

for (j = 0; j < numpnts; j += 1)
     output[j] = 0;
     for (i = j + offset - numpnts + 1; i < (j + offset); i += 1)
          output[j] += 1*real(-conj(inputB[j-i+offset])*conj(inputS[i][j]))+1*real(conj(inputA[j]))*magsqr(inputB[j-i+offset] );
     endfor
endfor


Otherwise my calculation needs ages. By the way, do you think of implementing the MatrixOP (and other) functions using the CUDA engine? I guess MatLab has some additional functions using CUDA.

Cheers,

B.S
Java wrote:

wave[][] = wave1[p]*wave2[q-p+offset]
wave[][] = (((p-q) > offset) | ((q-p)>offset)) ? 0 : wave[p][q]


MatrixOP was not designed to handle these type of expressions. Depending on the exact details you might formulate these expressions in terms of multi-threaded statements and get better performance.

As for:
Quote:


for (j = 0; j < numpnts; j += 1)
     output[j] = 0;
     for (i = j + offset - numpnts + 1; i < (j + offset); i += 1)
          output[j] += 1*real(-conj(inputB[j-i+offset])*conj(inputS[i][j]))+1*real(conj(inputA[j]))*magsqr(inputB[j-i+offset] );
     endfor
endfor



Approach this in three steps. First optimize the expressions, by removing unnecessary stuff as in:
Variable theSum
Variable jpo
for (j = 0; j < numpnts; j += 1)
     theSum= 0;
     jpo=j+offset
     for (i = jpo - numpnts + 1; i < jpo; i += 1)
          theSum+= real(-conj(inputB[jpo-i])*conj(inputS[i][j]))+real(conj(inputA[j]))*magsqr(inputB[jpo-i] );
     endfor
     output[j]=theSum
endfor


In the second step, precompute some of the functions using MatrixOP:

Variable theSum
Variable jpo
MatrixOP/O rcB=real(-conj(inputB))
MatrixOP/O cS=conj(inputS)
MatrixOP/O rcA=real(conj(inputA))
MatrixOP/O mB=magsqr(inputB)
for (j = 0; j < numpnts; j += 1)
     theSum= 0;
     jpo=j+offset
     for (i = jpo - numpnts + 1; i < jpo; i += 1)
          theSum+= rcB[jpo-i])*cs[i][j]+rcA[j]*mB[jpo-i];
     endfor
     output[j]=theSum
endfor


At this stage you could re-write the last expression as a function using something like:
// precompute these:
MatrixOP/O rcB=real(-conj(inputB))
MatrixOP/O cS=conj(inputS)
MatrixOP/O rcA=real(conj(inputA))
MatrixOP/O mB=magsqr(inputB)

MultiThread output=func1(x)   // assuming default scaling

...

ThreadSafe func1(j)
Variable j

NVAR offset=...
Variable theSum=0
Variable jpo=j+offset
Wave rcB,cS,rcA,mB

     for (i = jpo - numpnts + 1; i < jpo; i += 1)
          theSum+= rcB[jpo-i])*cs[i][j]+rcA[j]*mB[jpo-i];
     endfor
     
     return theSum
End


There are probably other tricks that one can use. I hope this helps give you some ideas. If you are still unhappy about performance feel free to send me at support@wavemetrics.com a more complete function that needs to be optimized.

A.G.
WaveMetrics, Inc.

Thanks for the reply and the helpful suggestions. But since I have to carry out more than one of these calculations I'm already doing this in a multithreaded way.

I dug deeper into the MatrixOp stuff and I can replace an expression like

wave[][] = inputB[q]*conj(inputB[q-p+offset])

with
MatrixOP outwave = wavemap(inputA x inputB^t,shiftwave)

with shiftwave being an appropriate indexing wave which I only have to create once. But unfortunately I have to use a complicated expression when using complex waves
MatrixOP mult = conj(inputA) x inputB^t
MatrixOP outwave = cmplx(wavemap(real(buffer),shiftwave),wavemap(imag(buffer),shiftwave))

because it looks like that wavemap cannot handle complex waves like one expects. Real and imaginary parts are completely mixed up.

Is this a known problem and can I somehow circumvent it?

Cheers,

B.S.
Hej A.G.,

thanks for the quick bugfix. You guys are really great. I already brought the calculation time down by a factor of about 2.5 :)

Today I stumbled over another issue. Is it possible to interpret a multiplication of a column vector with m entries with a m x n matrix in a way that the the first row of the matrix is multiplied by the first entry of the vector, the second row with the second entry of the vector and so on - and the same if you want to multiply a m x n matrix with a n entry row vector but now you multiply the columns of the matrix? I guess a function for addition in the same way is also very useful for some users.

Cheers,

B.S
Java wrote:

Today I stumbled over another issue. Is it possible to interpret a multiplication of a column vector with m entries with a m x n matrix in a way that the the first row of the matrix is multiplied by the first entry of the vector, the second row with the second entry of the vector and so on - and the same if you want to multiply a m x n matrix with a n entry row vector but now you multiply the columns of the matrix? I guess a function for addition in the same way is also very useful for some users.


I will just suggest that you try the following example and see where it takes you:

make/n=(10,10) ddd=p+10*q
make/n=10 sss=0.5+p
matrixop/o aa=scaleCols(ddd,sss)
edit aa


A.G.
WaveMetrics, Inc.
Hej A.G.,

scaleCols does the job for real waves. It looks that complex waves are not supported. Also trying the same for rows by assuming the name scalerows does not work. Just for the record, in my manual (Manual Revision: April 21, 2011 (6.22)), scaleCols is not mentioned. Is there an undocumented function for offsetting complex rows or columns?

Cheers,

B.S.
Java wrote:
Hej A.G.,

scaleCols does the job for real waves. It looks that complex waves are not supported. Also trying the same for rows by assuming the name scalerows does not work. Just for the record, in my manual (Manual Revision: April 21, 2011 (6.22)), scaleCols is not mentioned. Is there an undocumented function for offsetting complex rows or columns?


A quick Transpose (or ^t) would handle the difference between scaling rows and scaling columns.

This function is not in the manual because it has not been thoroughly tested for release. I'll see what I can do to extend this to rows but that will probably be for IP7.

AG