
function speed-up: extracting point numbers
Now I want to find the point/index numbers of AA where the values of BB occur. Finding means I duplicate AA and set a point to 1 if the number is within BB or 0 if not.
The function looks like:
function FindIt(aa, bb) wave aa, bb variable timer = startMStimer Make/O/N=(numPnts(aa)) cc=0 variable i for(i=0; i<numPnts(bb); i+=1) Multithread cc += aa == bb[i] ? 1 : 0 endfor Printf "Time required: %f seconds\r", (stopMStimer(timer)/1e6) end
However it is annoyingly slow. For the wave lengths given above it takes on my work computer about 25s.
I'm convinced there is a better way but I just don't see it at the moment!
Any ideas are highly appreciated!
At least your code does not extract all bb indizes of the contents of aa.
So I can only guess what you want to do, so here is my try:
Numbers from my machine are 8.7s for your solution and 2.4s for my try.
In case this is still too slow you would need to give more info about the situation. Is bb constant? If yes, you could try to sort bb and use a binary search[1].
Do you need the contents of aa afterwards? If no, you can just overwrite the contents of aa and skip creating cc.
[1]: http://www.igorexchange.com/node/2775
November 1, 2013 at 11:20 am - Permalink
thanks for your reply!
I have attached an experiment with real world data to make it more clear.
Wave AA and BB are sorted and are positive integers. Wave AA can contain multiple entries of e.g. BB[3] - I think this is why I can't use FindValue, because it will only find the first entry of BB[3] (?). If you execute FindIt(aa, bb) then e.g. cc[0] and cc[211] will be 1 because bb[0] and bb[1] are present in aa at these point numbers.
This is what I want to get out, or a wave which contains the actual point numbers, e.g. {0, 211, ...}
November 1, 2013 at 01:15 pm - Permalink
And as I see you are not interested in the index but just if zero I tried with
November 1, 2013 at 01:53 pm - Permalink
I can also use a wave with the point numbers as output and use something like (I still iterate over bb):
I need to check if this gives me correct data at the boundaries but I get a speedup compared to my initial version by a factor 10. This is very nice! Thanks!
November 1, 2013 at 02:58 pm - Permalink
Suppose you wrote the following expression
and executed this in a loop.
Note also that your loop should ideally use a fixed variable rather than calling a function. For N equal() calls I'd use something like
Hope this helps,
AG
November 1, 2013 at 04:36 pm - Permalink
Perhaps you can create the output wave at full size, initialize it to NaN, and then zap any NaNs at the end using WaveTransform zapNaNs.
November 1, 2013 at 06:39 pm - Permalink
At least your code does not extract all bb indizes of the contents of aa.
[/quote]
Thomas, I should have written this at the beginning: Note that WaveMin and WaveMax of aa and bb are identical. This is why iterating over bb is sufficient. What I do is finding the values of bb in an image M_aa. Only a small proportion of M_aa falls within WaveMin(bb) and WaveMax(bb). This is why I redimension M_aa into aa, duplicate and create a point number wave pn=p and then Sort aa aa,pn. Then I can clip aa to the values of interest. Using the output from above and the clipped pn, I can reconstruct the pixel coordinates. This should be more efficient than ImageThreshold, since only a small proportion of the image needs attention.
My last version version still had some issues, but now it's running fine. I also got rid of InsertPoints, but it only made it marginally faster.
Thanks for all suggestions and pointing me to FindValue!
November 3, 2013 at 11:35 pm - Permalink
Note that WaveMin and WaveMax of aa and bb are identical.
[/quote]
Ah that's the missing information which clarifies the differences between our approaches.
November 4, 2013 at 01:49 am - Permalink