Need Very Fast Range-within-Waves Extraction Algorithm, My Two Solutions Are Too Slow

Background: I'm working with LOLA point data (LOLA being the laser altimeter on the Lunar Reconnaissance Orbiter). The entire dataset is gigantic, over 6.4 billion points at the moment. I'm dividing my work areas into 10°x10° (with 0.2° buffers) regions to have a more manageable ~10-11 million points. I'm using the data to analyze craters. The LOLA data is in 3 1D waves: Latitude, longitude, elevation. I'm also using WAC DTMs (Lunar Reconnaissance Orbiter Camera Wide-Angle Camera (WAC) Digital Terrain Model (DTM) -- a gridded topography dataset) that is displayed as, effectively a basemap with the LOLA point data overlaid.

Problem: I have code that zooms to the next crater in my list to study. Because the actual analysis region for that crater is much smaller than 10°x10°, my code extracts the WAC DTM around the crater and the LOLA point data. The WAC extraction is very fast, taking around 0.3 seconds. The LOLA extraction is MUCH slower, taking over 10 seconds. I can't have a workflow that takes that long to get to the next crater.

Current Code: My current approach is this: The LOLA point data lists are sorted already by latitude. I use FindLevel to location the row of my max and min latitude range that I want to extract, and then I copy those rows to new waves. I then sort by longitude. It's the sorting that takes the longest since sorting is one of the slowest basic things you can do. I then use FindLevel again to get the rows for my min and max longitude, delete the other rows from the temporary waves, and I have my range:

//Extract and set scale.  PEDR *MUST* be pre-sorted by Pt_Latitude.
Variable pt_start, pt_end
if((crater_center_long_index-radius_long) >= Pt_Longitude[0])
    FindLevel/P/Q Pt_Longitude, (crater_center_long_index-radius_long)
    pt_start = V_LevelX
else
    pt_start = 0
endif
if((crater_center_long_index+radius_long) <= Pt_Longitude[numpnts(Pt_Longitude)])
    FindLevel/P/Q Pt_Longitude, (crater_center_long_index+radius_long)
    pt_end = V_LevelX
else
    pt_end = numpnts(Pt_Longitude)
endif
Duplicate/O/R=[pt_start, pt_end] Pt_Latitude, Pt_Lat_Temp
Duplicate/O/R=[pt_start, pt_end] Pt_Longitude, Pt_Lon_Temp
Duplicate/O/R=[pt_start, pt_end] topography, Pt_Ele_Temp

print (stopMSTimer(-2)-timer_start)*1e-6
Sort Pt_Lat_Temp Pt_Lon_Temp, Pt_Ele_Temp, Pt_Lat_Temp
if((crater_center_lat_index-radius_lat) >= Pt_Lat_Temp[0])
    FindLevel/P/Q Pt_Lat_Temp, (crater_center_lat_index-radius_lat)
    pt_start = V_LevelX
else
    pt_start = 0
endif
if((crater_center_lat_index+radius_lat) <= Pt_Lat_Temp[numpnts(Pt_Lat_Temp)])
    FindLevel/P/Q Pt_Lat_Temp, (crater_center_lat_index+radius_lat)
    pt_end = V_LevelX
else
    pt_end = numpnts(Pt_Lat_Temp)
endif
DeletePoints pt_end, (numpnts(Pt_Lat_Temp)-pt_end), Pt_Lat_Temp, Pt_Lon_Temp, Pt_Ele_Temp
DeletePoints 0, pt_start, Pt_Lat_Temp, Pt_Lon_Temp, Pt_Ele_Temp
print (stopMSTimer(-2)-timer_start)*1e-6


Potential Speed-Up: I thought that, perhaps instead of sorting, I could just loop through every point and check to see if it's in my longitude range. Problem is, that takes much longer. The code is the same as above, just change the last block, the one within the timing output:

print (stopMSTimer(-2)-timer_start)*1e-6
Variable counter_dummy = 0
do
    if(Pt_Lat_Temp[counter_dummy] < (crater_center_lat_index-radius_lat))
        DeletePoints (counter_dummy),1, Pt_Lat_Temp, Pt_Lon_Temp, Pt_Ele_Temp
    elseif((crater_center_lat_index+radius_lat) > Pt_Lat_Temp[counter_dummy])
        DeletePoints (counter_dummy),1, Pt_Lat_Temp, Pt_Lon_Temp, Pt_Ele_Temp
    else
        counter_dummy += 1
    endif
while(counter_dummy < numpnts(Pt_Lat_Temp))
print (stopMSTimer(-2)-timer_start)*1e-6


Wave Size: To give you a very, very rough idea of wave sizes, the initial LOLA point list is ~10-11 million. Extracting the longitude range gets me down to around 2 million, so I'm then re-sorting a list of 2M points. When the longitude range is extracted, we're talking about ~200,000 points in the final temporary arrays (I say this is very very rough because it varies depending on LOLA track density and crater size I'm working on). That's why I thought just iterating through the list might be faster, and I expect it would be in a more machine-level language than Igor. But, in Igor, it's not. And yes, this is wrapped in a Function, not a Macro, though the timing is indistinguishable really regardless of it being a Function or a Macro (I tried).

Help?
Rather than FindLevel, try BinarySearch.

Rather than FindLevel or iterating through the subset, try Extract.

In fact, it might be faster to use Extract on the original data set rather than making the subset copy.

--Jim Prouty
Software Engineer, WaveMetrics, Inc.
Some ideas:
- Both FindLevel calls which search for pt_end don't have to start searching at the 0th point but at the pt_start point, no? See the /R switch of FindLevel.
- The more fancier FindLevel operation supports a /M flag for the number of maximum hits. And I guess you are satisfied with the first hit, no?
Btw. as you are working with sorted waves a binary search should be more appropriate than FindLevel.
- All operations involving memory are dead slow as always. So if you can avoid some of the Duplicate/DeletePoints calls things get really much faster. E.g. I don't see that you use the duplicated topography wave.
- Have a try with FREE waves for temporary waves
- Replace the two DeletePoints calls with one to Duplicate

In case you post a sample experiment, everyone can playaround some more :)
JimProuty wrote:
Rather than FindLevel, try BinarySearch.

I could modify it to do that, but the FindLevel step takes zero time relative to the Sort line. Like deleting a 33kB Word document instead of the 60GB music folder, I think changing to BinarySearch isn't going to save me time at this point.

JimProuty wrote:
Rather than FindLevel or iterating through the subset, try Extract.

Hmmmm... that looks really promising, except that I have 3 waves (lat/lon/ele) and I need to keep the points together (extracting the elevation points corresponding to my lat/lon range). Looking at the documentation for Extract, it doesn't look like I can use it for this kind of problem. FindLevel was useful because it gave me the index range of my data, and I could then extract those rows from each of the three waves. Is there a way to use Extract to be able to do the same kind of thing? If so, that would seem to eliminate the Sort step entirely. Maybe with the /INDX flag?
Sorry, I think I posted while you were posting, so I missed this. I think I have a solution, so I apologize if I say "moot point now" in response.

thomas_braun wrote:
Both FindLevel calls which search for pt_end don't have to start searching at the 0th point but at the pt_start point, no? See the /R switch of FindLevel.

If I knew the range of the data a priori, I wouldn't need to search through it. Since I don't, and since the LOLA data are not uniform across the lunar surface, I can't assume a range.

thomas_braun wrote:
The more fancier FindLevel operation supports a /M flag for the number of maximum hits. And I guess you are satisfied with the first hit, no?

Yes. Since it's a sorted list, I just need the first one. I use that to know where the minimum longitude is in my list. Next FindLevel gives me the maximum longitude. Then I extract the data in that range.

thomas_braun wrote:
All operations involving memory are dead slow as always. So if you can avoid some of the Duplicate/DeletePoints calls things get really much faster. E.g. I don't see that you use the duplicated topography wave.

The duplicated waves (_Temp) are used in the final display. There's no math in this function that uses it, but it's what's displayed versus the Lat/Lon _Temp and it's what's used in later analysis. Timing test shows the Duplicate lines take ~5% of the time of the function.

thomas_braun wrote:
Have a try with FREE waves for temporary waves

Haven't experimented with those before. Useful? Faster? Purpose?

thomas_braun wrote:
Replace the two DeletePoints calls with one to Duplicate

Moot point, see my next post.

thomas_braun wrote:
In case you post a sample experiment, everyone can playaround some more :)

Experiment is 400 MB. Don't think I'll be posting it.
Okay, Extract/INDX seems to be much, much faster. Takes about a half second instead of 10 to sort (for this particular dataset). What takes 20% of the time for the code to run now is my do-while loop, see below. I'm guessing there's a much faster Igor-y way to do that step?

print (stopMSTimer(-2)-timer_start)*1e-6
//Use "Extract" to get the row numbers (index points) of the latitude points in my range,
//  storing it to "index_Temp" to use in a moment.
Extract/O/INDX Pt_Lat_Temp,index_Temp, (((crater_center_lat_index-radius_lat) < Pt_Lat_Temp) && (Pt_Lat_Temp < (crater_center_lat_index+radius_lat)))
print (stopMSTimer(-2)-timer_start)*1e-6
//Go through with index_Temp and extract the points that I want.  Then, delete the remaining
//  ones.  Yes, I'm over-writing the waves I originally got the data from, but that's okay since
//  the number of points MUST be less than the original, and so I am not over-writing points I
//  want.
Variable counter_dummy = 0
do
    Pt_Lat_Temp[counter_dummy] = Pt_Lat_Temp[index_Temp[counter_dummy]]
    Pt_Lon_Temp[counter_dummy] = Pt_Lon_Temp[index_Temp[counter_dummy]]
    Pt_Ele_Temp[counter_dummy] = Pt_Ele_Temp[index_Temp[counter_dummy]]
    counter_dummy += 1
while(counter_dummy < numpnts(index_Temp))
DeletePoints counter_dummy,(numpnts(Pt_Lat_Temp)-counter_dummy), Pt_Lat_Temp,Pt_Lon_Temp,Pt_Ele_Temp
print (stopMSTimer(-2)-timer_start)*1e-6


The output from the three print [time] blocks for a few craters of different sizes (and hence different data volumes are, in decreasing order of crater size:

42km crater: 0.6, 3.9, 7.1 sec
5km crater: 0.06, 0.6, 0.8 sec
1km crater: 0.3, 1.0, 1.1 sec

I'm guessing that if I used what both of you recommended - BinarySearch instead of FindLevel - that the 1km analysis would be faster than 5km. I'll try that while I'm waiting for your help on the do-while. :)
astrostu wrote:
42km crater: 0.6, 3.9, 7.1 sec
5km crater: 0.06, 0.6, 0.8 sec
1km crater: 0.3, 1.0, 1.1 sec

I'm guessing that if I used what both of you recommended - BinarySearch instead of FindLevel - that the 1km analysis would be faster than 5km. I'll try that while I'm waiting for your help on the do-while. :)


Yup. Added print time statements around the first block, finding the longitude range. Times with FindLevel:

42km crater: 0.05, 0.2, 0.4, 3.7, 6.5 sec
5km crater: 0.0045, 0.07, 0.08, 0.8, 1.0 sec
1km crater: 0.0030, 0.28, 0.3, 1.0, 1.1 sec

Times with BinarySearch:

42km crater: 0.057, 0.058, 0.3, 3.2, 6.4 sec
5km crater: 0.0035, 0.0045, 0.09, 0.7, 0.9 sec
1km crater: 0.0027, 0.0070, 0.08, 0.7, 0.8 sec

The difference between the first two values is the time it took to find my longitude range using FindLevel versus BinarySearch. Which is why the 1-km crater took longer to do using FindLevel.
Replace the loop with a wave assignment (not tested)
    Variable n1=  numpnts(index_Temp)-1

    Pt_Lat_Temp[0,n1] = Pt_Lat_Temp[index_Temp[p]]
    Pt_Lon_Temp[0,n1] = Pt_Lon_Temp[index_Temp[p]]
    Pt_Ele_Temp[0,n1] = Pt_Ele_Temp[index_Temp[p]]
    Variable firstToDelete= n1+1
    Variable numToDelete= numpnts(Pt_Lat_Temp)-firstToDelete
    DeletePoints firstToDelete, numToDelete, Pt_Lat_Temp,Pt_Lon_Temp,Pt_Ele_Temp


Wait; now that I look at this, how can you be certain you're not overwriting the input data?

That is, when assigning Pt_Lat_Temp[ somePoint ] = something
how do you know that a later assignment isn't READING the now-changed Pt_Lat_Temp[ somePoint ]?


--Jim Prouty
Software Engineer, WaveMetrics, Inc.
I thought of that too (and it's sloppy coding, I'll admit), but by definition, my final points will be a subset of those original _Temp waves. Therefore, a point in the new one will have a higher-index corresponding point in the old one. So long as I iterate from low to high, I won't over-write needed data. I'll test your solution and get back to you in a bit ...
Your replacement for my do-while loop is faster, as I expected. I always get confused with wave indexing for this kind of vector copying. Timing:
3.61 vs 1.99 sec
0.18 vs 0.11 sec
0.17 vs 0.09 sec
P.S. So, I guess at this point, I'm at about as fast as I'm going to get. Unless someone has other ideas. And a second or two to load the next crater is fine. It was the 10+ that wasn't.

I think I'll go through some of my other code now and replace FindLevel with BinarySearch and Sort with Extract, since I do that kind of thing elsewhere.
Being rather blind on catching most of this (except for the do ... while with implicit part), my only other thought for significant speed increases is ... threadsafe functions????

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAHuntsville
Without actually sending it to different threads, does calling a function as ThreadSafe actually speed things up?
astrostu wrote:
Without actually sending it to different threads, does calling a function as ThreadSafe actually speed things up?


No.
hrodstein wrote:
astrostu wrote:
Without actually sending it to different threads, does calling a function as ThreadSafe actually speed things up?

No.

Okay, then there's no point in making this one a ThreadSafe function. This particular one has nothing in it that could be threaded, though I was thinking a possible solution to my problem originally may've been to divide the data into (e.g.) 8 chunks, sort those, then sort the 8 chunks together -- a poor man's threaded sort until Igor implements it itself. Other parts of my code are threaded, though, like calculating volume or finding points within a polygon.