Differences in cumulative sum.

I have a poser for Monday morning.

As part of a rebinning/regrouping algorithm I have an array of data
that I work out the cumulative sum for:

Wave var_init
make/n=(numpnts(var_init))/d cum_var_init = 0
cum_var_init = sum(var_init, 0, p)


Please note that all values in var_init are non-negative (they are variances) and
are double precision, meaning that the values in cum_var_init should be
monotonically non-decreasing. Then I need to work out
differences between different (interpolated) point numbers in that
accumulated wave, e.g.

Wave W_var
W_var[4] = cum_var_init[7.75] - cum_var_init[5.1]


We now come to the interesting part. Some successive points in
cum_var_init should be the same, as var_init contains a lot of zeroes.
However, when the interpolated subtraction is carried out in a
region where cum_var_init is unchanging, some of the values in W_var
come out negative, on the order of -1e-14. That is roughly
two orders of magnitude larger than machine precision. This presents
a problem because W_var represents a variance and should never be
negative.
So where does the discrepancy arise: the cumulative sum, or the
interpolated subtraction? Or should I be doing such an interpolation in
a different way?
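
To make the behaviour concrete, here is a minimal sketch along the lines of my real code (the wave length and values are made up for illustration): var_init is zero beyond the first few points, so cum_var_init is flat there, yet the interpolated difference of two points inside the flat region does not always come out as a clean zero.

Make/D/O/N=20 var_init = 0
var_init[0,4] = 1e4								// non-zero values only at the start
Make/D/O/N=(numpnts(var_init)) cum_var_init = 0
cum_var_init = sum(var_init, 0, p)				// flat from point 4 onwards
Print cum_var_init[7.75] - cum_var_init[5.1]	// should be 0, but can come out as a tiny non-zero (in my data, negative) number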

One might say, "oh, it doesn't matter, it's only 1 part in 10^14,
just put a tolerance expression in there". However, what happens
when I need to deal with variances that are themselves on the order of 1e-14?
test_overflow.pxp
OK- I tracked down the code that does interpolated look-ups from waves. The problem is pretty certainly floating-point truncation errors. In the test case you posted, the numbers involved are on the order of 40,000 and the negative number is -7.27596e-12, or one part in 5e15.

The interpolation is not done in the straightforward way involving a subtraction; it's done as a weighted average. Here's the code:

A= lhpoint+1-pnum;
B= 1-A;

DBL(dest[0])= A*DBL(dest[0]) + B*DBL(rhYval[0]);

On the right side, dest has already been filled with the value from the wave point below the requested fractional point; rhYval contains the value from the wave point above. Don't worry about the DBL macro; it is involved with adapting the code to different number types, and with games played to make things fast. The point here is that even if dest and rhYval contain identical numbers, the arithmetic cannot be guaranteed to add up exactly to the original number.
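
To see the effect in isolation, here is a throwaway sketch (made-up numbers of roughly the size in your test case; this is not our actual code) that applies the same weighted-average arithmetic to two identical endpoint values and reports the worst deviation from the original value over a range of weights:

Function DemoWeightedAverage()
	Variable y = 40000.3			// both "endpoints" hold this same value
	Variable worst = 0
	Variable A
	for(A = 0.01; A < 1; A += 0.01)
		// ideally A*y + (1-A)*y is exactly y, but each operation rounds
		worst = max(worst, abs(A*y + (1 - A)*y - y))
	endfor
	return worst					// typically of order 1e-12 for numbers this size, not 0
End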

I'm sure that this is done to save some floating-point arithmetic, and the time that would take.

You basically have three choices:

1) Do the interpolation yourself, using a subtraction of the two point values as the first step (a rough sketch follows this list). If the numbers are truly identical, it will result in a clean zero and subsequent computations should come out to exactly zero.

2) Check for the result being some acceptably small fraction of the numbers you're subtracting, and if small, then assign a clean zero.

3) Simply take the absolute value. That will result in some numbers that are vanishingly small and positive rather than being vanishingly small and negative.
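
As a rough sketch of option 1 (untested, and the function names InterpAt and InterpDiff are just placeholders I made up), the idea is to subtract the two bracketing point values before applying the fractional weight, so identical neighbours give an exact zero:

Function InterpAt(w, pnum)
	Wave w
	Variable pnum
	Variable lo = min(floor(pnum), numpnts(w) - 2)	// keep lo+1 in range
	// subtract first: identical neighbouring points make the bracket exactly 0
	return w[lo] + (pnum - lo)*(w[lo + 1] - w[lo])
End

Function InterpDiff(w, pnumHigh, pnumLow)
	Wave w
	Variable pnumHigh, pnumLow
	return InterpAt(w, pnumHigh) - InterpAt(w, pnumLow)
End

Something like W_var[4] = InterpDiff(cum_var_init, 7.75, 5.1) would then replace the direct interpolated subtraction.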

I haven't asked Larry, but I suspect he would resist changing our code because the problem is rare and only consequential in rare cases, and it would add extra operations in a fundamental operation that is often in the innermost part of a loop, where execution speed is required, and changing it would cause computations that people have used in the past to change in subtle ways.

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com
johnweeks wrote:
OK- I tracked down the code that does interpolated look-ups from waves. The problem is pretty certainly floating-point truncation errors. In the test case you posted, the numbers involved are on the order of 40,000 and the negative number is -7.27596e-12, or one part in 5e15.


The numbers are of that magnitude, but I'm interested in differences between nearby points, which are 4 orders of magnitude smaller.


johnweeks wrote:
The interpolation is done as a weighted average.
A= lhpoint+1-pnum;
B= 1-A;
DBL(dest[0])= A*DBL(dest[0]) + B*DBL(rhYval[0]);

On the right side, dest has already been filled with the number from the wave point below the requested fractional point. rhYval contains the value from the wave point above.

I'm sure that this is done to save some floating-point arithmetic, and the time that would take.


As far as I can tell, with the formula you show you have 2 additions, 2 subtractions, and 2 multiplications.
If you do it like this:
dest[0] = dest[0] + pnum * (rhYval[0] - dest[0])
you have 1 multiplication, 1 addition, and 1 subtraction. That is three floating-point operations fewer (I'm assuming that pnum is the fraction of the point that is required to add on).
I'm not trying to make trouble, only to understand.

John Weeks wrote:
3) Simply take the absolute value. That will result in some numbers that are vanishingly small and positive rather than being vanishingly small and negative.

I've done this.

John Weeks wrote:
the problem is rare and only consequential in rare cases, and it would add extra operations in a fundamental operation that is often in the innermost part of a loop, where execution speed is required, and changing it would cause computations that people have used in the past to change in subtle ways.

I disagree here. If I came across the problem, other people would have come across it too. Perhaps I don't understand why, but I don't think it takes more fp operations. Whilst it may cause computations that people have used in the past to change slightly, one could argue they were probably being done in slightly the wrong way. And whilst I agree that the magnitude of the problem is vanishingly small, the sign really does matter when someone is going to apply a function that is undefined for negative arguments, like sqrt or log.
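
For example, a one-line illustration with a made-up value of the size I'm seeing:

Print sqrt(-7.3e-12)		// prints NaN, which then propagates through everything downstream
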
andyfaff wrote:
As far as I can tell, with the formula you show you have 2 additions, 2 subtractions, and 2 multiplications.
If you do it like this:
dest[0] = dest[0] + pnum * (rhYval[0] - dest[0])
you have 1 multiplication, 1 addition, and 1 subtraction. That is three floating-point operations fewer (I'm assuming that pnum is the fraction of the point that is required to add on).
I'm not trying to make trouble, only to understand.

Except that it doesn't work... You need to replace pnum with a fraction of the point difference. That will add three more operations:
frac = (p - p0)/(p1 - p0)

Another thing to think about is that any scheme that involves subtracting the two Y values, and then adding something to one of the Y values, will suffer from floating-point truncation. It's the small-differences-of-large-numbers problem: the difference between the Y values is likely to be much smaller than the Y values themselves. So the weighted-average scheme should be more accurate in general, since multiplication involves only the fractional last-bit error, whereas subtraction followed by addition suffers from an error that gets larger as the numbers get closer.
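
You can see the flavour of the add-back problem from the command line (made-up numbers of the same size as your test case):

Variable big = 40000.3, tiny = 1e-12
Print (big + tiny) - big		// prints 0: the small increment is absorbed entirely by the large value
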
andyfaff wrote:
John Weeks wrote:
the problem is rare and only consequential in rare cases, and it would add extra operations in a fundamental operation that is often in the innermost part of a loop, where execution speed is required, and changing it would cause computations that people have used in the past to change in subtle ways.

I disagree here. If I came across the problem, other people would have come across it too. Perhaps I don't understand why, but I don't think it takes more fp operations. Whilst it may cause computations that people have used in the past to change slightly, one could argue they were probably being done in slightly the wrong way. And whilst I agree that the magnitude of the problem is vanishingly small, the sign really does matter when someone is going to apply a function that is undefined for negative arguments, like sqrt or log.

I'm afraid the bottom line is that depending on floating-point operations being exactly zero, or having an accurate sign near zero, is something you just shouldn't do. The message in this case is that floating-point truncation can bite you unexpectedly.
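
If you need results to behave as if they were zero, build the tolerance in explicitly rather than relying on the sign. A one-line sketch (the threshold is arbitrary; scale it to your own data):

W_var = (abs(W_var) < 1e-12*cum_var_init[numpnts(cum_var_init) - 1]) ? 0 : W_var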

John Weeks
WaveMetrics, Inc.
support@wavemetrics.com