A 21-year bug

2014.06.19

This article expresses the author's opinion at the time of writing. There are no guarantees of correctness, originality or current relevance. Copying the whole article is forbidden. Transcription of selected parts is allowed provided that author and this source are mentioned.

One of the Murphy's Laws says: every non-trivial program has at least one bug.

Recently, I found a bug in my financial calculator that has been there since the very first version, launched in 2007. The bug affected one of the most often used functions of the calculator: the "i" key (interest).

Still, nobody seemed to note it, until very recently. One user sent me a concrete scenario that exercised the bug.

Actually, the affected piece of code is older than 2007: it was ported from a Java-based
financial calculator written in 1997. And *that* one was
already a port. The original code was written in 1993, either in xBase or Visual Basic
for Access (I simply can't remember).

So, the bug was lurking for 21 years!

Fortunately, the bug did not yield misleading results, e.g. a wrong interest rate. It just regarded some valid financial operations as "Unsolvable" (Error 5 in HP-12C) while they are indeed solvable.

Let's explore the roots of this bug. Most financial calculators dissect a financial operation in five parameters: PV (Present Value), I (interest), N (number of payments), PMT (value of each payment), and FV (future value). These five numbers need to be in balance with each other. The user types four variables, and asks the calculator to find the fifth one that balances the operation.

There are closed-form solutions to solve for every parameter, except for the interest. On top of that, the flags BEGIN and C change these formulas. So, for example, there are four closed-form solutions for PMT, because there are four possible flag combinations.

On the other hand, there is a single closed-form formula, the NPV (Net Present Value), that ties together all parameters. Leaving BEGIN and C aside for a moment to simplify our discussion, the NPV formula is:

NPV = PV + FV/(1+I)^{n} + Σ_{x=1..n} PMT/(1+I)^{x}

NPV is zero when the financial operation is balanced. For example, if N=3, I=9.701%, PV=-500, PMT=200 and FV=0:

NPV = -500 + 200/1.09701 + 200/1.20343 + 200/1.32018

NPV = -500 + 182.31 + 166.19 + 151.49

NPV = -0.01 (near enough zero)

All parameters can be solved by finding the root of NPV — no need of any other formula whatsoever. Actually, this is the only way to find the interest of a financial operation, since there are no closed-form solutions for interest.

My code employed the secant method to find the NPV root, a method similar to linear interpolation. (It would be more appropriate to call it "linear extrapolation".)

In case you don't know how the secant method works. Suppose we have N=3, I=9.701%, PV=-500, FV=0, and we want to find PMT. First, we make two initial guesses of PMT, for example 0 and 10:

(PMT=0) NPV = -500 + 0/1.09701 + 0/1.20343 + 0/1.32018 NPV = -500 (wrong) (PMT=10) NPV = -500 + 10/1.09701 + 10/1.20343 + 10/1.32018 NPV = -475.00 (wrong)

Now we estimate the "best" value for PMT that should make NPV equal to 0, based on these initial guesses, using linear extrapolation:

Line equation: y = ax + b, root = -b/a a = (npv2 - npv1) / (guess2 - guess1) a = (-475 - (-500)) / (10 - 0) a = 2.5 b = npv1 - guess1 * a b = -500 - 0 * a b = -500 PMT = -b/a PMT = -(-500) / 2.5 PMT = 200

For PMT=200, NPV is equal to 0.0002, which means a balanced equation up to the 4th digit of precision. If we need even more precision, we redo the linear extrapolation with the last two guesses:

a = (npv2 - npv1) / (guess2 - guess1) a = (0.0002 - (-475)) / (200 - 10) a = 2.500001052631579 b = npv1 - guess1 * a b = -475 - 10 * a b = -500.0000105263158 PMT = -b/a PMT = -199.9999200000337

The latest estimation of PMT balances the equation up to the 5th digit of precision. It is just a matter of repeating the algorithm until the desired precision is achieved (that is, NPV is 0.0000... for a given number of decimal digits).

The secant method is very fast, simple to implement, and seemed to work for any parameter. Indeed it does work for PV, PMT, FV and N, because NPV shows a monotonic, and mostly linear, response to variations on those parameters.

*
("Monotonic" means that the
result always increases, or always decreases, given the increase of an
independent variable. For example, if PV increases, NPV result will guaranteedly increase,
so NPV is monotonic in relation to PV. The PV-PMT relationship is also perfecty linear.
On the other hand, NPV is not linear in relation to N. Given a big enough N, further increases
in N will barely affect the NPV result.)
*

For I (interest), the story is different. **Normally,** NPV has
a monotonic response to variations on interest, but there are cases
when this is not true.

One problematic case: when N is big and I is relatively small (near zero). Depending on initial estimates of I, the secant algorithm keeps bouncing between big positive and negative values of I, never finding the correct value.

This can be a problem even if FV=0, a case in which NPV is monotonic in relation to variations in I. It is simple to remedy the secant algorithm to avoid this particular pitfall, but unfortunately there is another problem.

When PV is zero and FV is big, given a big N and a small I, the NPV is no longer monotonic in relation to interest; the NPV(I) curve has a maximum value, a "peak". There are two roots for NPV: the true interest rate, and an infinite interest rate.

An infinite interest is a "solution" for NPV when PV=0 because it discounts both PMT and FV down to zero. This can't happen when PV is not nil, since PV is not discounted by the interest rate.

Depending on the true interest rate and the initial guesses, the secant method may choose to pursue the infinite root, never finds it, and gives up. The crux of the bug is: NPV(I) is not monotonic, and the secant method cannot be used in this case.

The fix for this problem is to use a different root-finding algorithm for NPV, at least when interest is the parameter to be found. The basic technique is the same as employed in HP-15C's SOLVE function:

- Find a range of interest rates that contain positive
**and**negative NPVs, so there must be a root within this range (when NPV changes from negative to positive or vice-versa); - Use bisection to pinpoint the root (that is, the interest rate that makes NPV=0).

Finding the range is the difficult part, but there are some shortcuts available. For example, the interest rate cannot be smaller than -100%, so the range search does not need to go below that. Pathological financial scenarios that would be "solvable" by an infinite interest rate, are rejected beforehand.

Bisection is a simple divide-and-conquer: if the interest range is 0 to 1, try interest 0.5. If NPV(0.5) is positive, the root is between 0 and 0.5. Then, test NPV(0.25), and so on. Stop when NPV(i) is small enough. It is slow, the result converges a single bit per cycle, but it is guaranteed to find the root. (Brent's method is an improved bisection that uses secant method to boost convergence when it is safe to do it.)

I have 3 or 4 applications that depended on this buggy code. Funny thing is, given a problematic operation, some apps still did find the correct interest. One reason is that different apps started with different initial estimations, so the secant method in each application had a different failure profile.

Another reason is that every application had a different request of precision (how close to zero the root of NPV must be to be accepted as "solved"). More precision means more iterations within the secant method, and more chances to get confused with a complicated NPV curve.

Indeed, the financial calculator app (where the bug was reported) had been modified recently to increase precision of financial results. This change may have caused the bug to show up. This is a lesson how any software modification, even the most innocent or trivial, can trigger a malfunction.