Accessing numeric wave data, waveaccess sample XOP

I've been studying the XOP manual and reached the section on accessing numeric wave data, which uses the WaveAccess sample XOP (ch. 7, p. 160).
My timings for the direct, point and storage access methods differ substantially from the ones in the manual.

The manual's example lists 0.031, 0.176 and 0.103 s (direct, point, storage) on a 2010 MacBook Pro, 2.66 GHz i7, Win7-64.
My results were 0.376, 0.377 and 0.386 s (direct, point, storage) on a 2011 Lenovo M91p, 3.33 GHz i5, Win7-64.

Igor Pro 6.3.2.3 (32-bit)
Compiling as release vs. debug had little effect (VS12 Express).

So... why the 10x difference on direct memory access?
I checked debugging and it was off.
I had virus scanning disabled.
I had the internet port disabled.

Is there something wrong in my installation that is hobbling Igor Pro? Or is this an issue with running on Lenovo machines instead of Macs?

data
---------------------------
•Make/N=(50,50,50) wave3D
•Variable timerRefNum

----waveaccess.xop compiled with debug---
•Make/N=(50,50,50) wave3D
•Variable timerRefNum
•timerRefNum = StartMSTimer
•WAFill3DWaveDirectMethod(wave3D)
•Print StopMSTimer(timerRefNum)/1e6
0.37694
•timerRefNum = StartMSTimer
•WAFill3DWavePointMethod(wave3D)
•Print StopMSTimer(timerRefNum)/1e6
0.377144
•timerRefNum = StartMSTimer
•WAFill3DWaveStorageMethod(wave3D)
•Print StopMSTimer(timerRefNum)/1e6
0.386606

----waveaccess.xop compiled with release---
•Make/N=(50,50,50) wave3D
•Variable timerRefNum
•timerRefNum = StartMSTimer
•WAFill3DWaveDirectMethod(wave3D)
•Print StopMSTimer(timerRefNum)/1e6
0.376838
•timerRefNum = StartMSTimer
•WAFill3DWavePointMethod(wave3D)
•Print StopMSTimer(timerRefNum)/1e6
0.377077
•timerRefNum = StartMSTimer
•WAFill3DWaveStorageMethod(wave3D)
•Print StopMSTimer(timerRefNum)/1e6
0.386917


Reference from the XOP manual
--------------------------
Speed Comparisons
The WaveAccess sample XOP implements routines that fill a 3D wave using each of the wave access
methods described above. To try them, compile the WaveAccess XOP and install it in the Igor Extensions
folder. Launch Igor Pro and execute the following commands:
Make/N=(50,50,50) wave3D
Variable timerRefNum
// Enter the next three lines as one command line
timerRefNum = StartMSTimer
WAFill3DWaveDirectMethod(wave3D)
Print StopMSTimer(timerRefNum)/1e6
Repeat replacing WAFill3DWaveDirectMethod with WAFill3DWavePointMethod and
WAFill3DWaveStorageMethod.

The following times were recorded for filling a 200x200x200 double-precision wave using the release build
of WaveAccess. All tests were run on a 2010-vintage MacBook Pro with an Intel i7 laptop processor running
at 2.66GHz.

Operating System                    Direct Access   Temp Storage   Point Access
Mac OS X 10.6.3                     0.018 s         0.086 s        0.214 s
Windows 7/64                        0.031 s         0.103 s        0.176 s
Windows 7/64 under VMware Fusion    0.037 s         0.124 s        0.181 s

I think the problem is that you did not follow this instruction:
// Enter the next three lines as one command line


Instead you executed the commands one at a time on the command line. This means the timer also counted how long it took you to enter and execute the WAFill3DWaveDirectMethod command, not just how long the function took to run.

Execute it like this:
timerRefNum = StartMSTimer; WAFill3DWaveDirectMethod(wave3D); Print StopMSTimer(timerRefNum)/1e6


Better yet would be to put it into a function:
Function Test()
    Make/O/N=(50,50,50) wave3D
    Variable timerRefNum = StartMSTimer
    WAFill3DWaveDirectMethod(wave3D)
    Variable elapsedTime = StopMSTimer(timerRefNum)/1e6
    Print elapsedTime
End


Since functions are compiled, it does not matter whether you put the commands on one line or on several lines.
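
As a sanity check, it may help to time an empty region inside a function, which gives a rough floor for what the timer itself costs in compiled code. This is only a sketch: TimerBaseline is a hypothetical name, not part of the WaveAccess sample. It also guards against StartMSTimer returning -1, which it does when all timers are in use (for example, if earlier timers were started and never stopped).

```igor
Function TimerBaseline()
    // Hypothetical helper: times an empty region to estimate timer overhead.
    Variable timerRefNum = StartMSTimer
    if (timerRefNum == -1)
        // StartMSTimer returns -1 when no timer is free.
        Print "All timers are in use"
        return NaN
    endif
    // StopMSTimer returns microseconds; divide by 1e6 for seconds.
    Variable elapsedTime = StopMSTimer(timerRefNum)/1e6
    Print elapsedTime
    return elapsedTime
End
```

If the value printed here is far below the times measured for the fill routines, the timer overhead can probably be ignored.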

I thought pasting all three lines at the command line was equivalent to executing them as a single line. I reran with ";" separators and got the following results:
Lenovo M91p i5 3.33 GHz, Win7-64, Igor 6.3.2.3
Direct = 34.9 ms, Point = 40.2 ms, Storage = 37.5 ms

XOP manual p. 160 lists these results for a 2010 MacBook Pro i7 2.66 GHz, Win7-64:
Direct = 31 ms, Point = 176 ms, Storage = 103 ms

*------------------------*
Per-call access time improves dramatically if each routine is run 10,000 times in a loop inside a function, to remove the "print" overhead from the timing (John Weeks' suggestion).
Direct = 0.080ms, Point = 2.47ms, Storage=0.541ms
*------------------------*

...and on a Mac?

------------
Function with 10k loop iterations
Output
•testdirect()
direct sec= 7.97278e-05 , point sec= 0.00246937 , memory sec= 0.000541131
•testdirect()
direct sec= 7.91245e-05 , point sec= 0.00247078 , memory sec= 0.000541233

Function testDirect()
    Variable timerRefNum1, timerRefNum2, timerRefNum3
    Variable i, j1, j2, j3

    // StopMSTimer returns microseconds. Dividing by 1e10
    // (1e6 microseconds per second * 1e4 calls) gives seconds per call.

    // Direct access method
    Make/O/N=(50,50,50) wave3D
    timerRefNum1 = StartMSTimer
    for (i = 0; i < 1e4; i += 1)
        WAFill3DWaveDirectMethod(wave3D)
    endfor
    j1 = StopMSTimer(timerRefNum1)/(1e10)

    // Point access method
    Make/O/N=(50,50,50) wave3D
    timerRefNum2 = StartMSTimer
    for (i = 0; i < 1e4; i += 1)
        WAFill3DWavePointMethod(wave3D)
    endfor
    j2 = StopMSTimer(timerRefNum2)/(1e10)

    // Temp storage method (printed as "memory")
    Make/O/N=(50,50,50) wave3D
    timerRefNum3 = StartMSTimer
    for (i = 0; i < 1e4; i += 1)
        WAFill3DWaveStorageMethod(wave3D)
    endfor
    j3 = StopMSTimer(timerRefNum3)/(1e10)

    Print "direct sec= ",j1,", point sec= ",j2,", memory sec= ",j3
End
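
The manual's table is for a 200x200x200 double-precision wave, while the commands above create a 50x50x50 wave, so a variant with the size as a parameter might give a closer comparison. This is a rough sketch, assuming the WaveAccess XOP is installed. TimeDirectFill, n and numIterations are hypothetical names, and the /D flag is added only because the manual describes a double-precision wave.

```igor
Function TimeDirectFill(n, numIterations)
    Variable n              // points per dimension, e.g. 50 or 200 (assumption)
    Variable numIterations  // number of timed calls to average over

    // /D makes the wave double precision, as in the manual's description.
    Make/O/D/N=(n,n,n) wave3D

    Variable i
    Variable timerRefNum = StartMSTimer
    for (i = 0; i < numIterations; i += 1)
        WAFill3DWaveDirectMethod(wave3D)
    endfor

    // StopMSTimer returns microseconds; convert to seconds per call.
    Variable secPerCall = StopMSTimer(timerRefNum)/1e6/numIterations
    Printf "n=%d: direct fill = %g s per call\r", n, secPerCall
    return secPerCall
End
```

For example, TimeDirectFill(200, 100) would time 100 fills of a 200x200x200 wave. A smaller iteration count than 1e4 is probably wise at that size, since each fill touches 64 times as many points as the 50x50x50 case.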