While trying to implement a sine-function without a FPU I recognized that all of the input is already rational, so I decided to try an all-rational approach. Probably slower but: why not? The series is linearly convergent and hence has a chance for a runtime-optimization with binary-splitting. There is even highly detailed literature for how to do it and which I used.
So far, so good.
My prototyping tool for numerical algorithms is Pari/GP, so I ported the code from the paper mentioned above to Pari/GP and, as you might have guessed from the fact that I post a question here, it did not work. Well, it did work but it was not possible to minimize the error. The paper has several other recipes for different functions but all showed the same behavior. Assuming a typo in the paper I checked the author's implementation in CLN. Heavily optimized but is based on the code in the paper, quite verbatim even.
To get a MWE I used their recipe for exp(p/q)
(the simplest recipe besides the factorial) and simplified the Pari/GP code.
exp_bin_split_rat_internal(n1, n2, x) = {
\\ R, L, r = [P, Q, B, T]
\\ a = [p, q, b, a]
local(diff, mn, L, R, r = vector(4));
diff = n2 - n1;
if(diff == 0,
\\ no actual error-handling here
print("Error in bin_split_rat_internal: n2-n1 is zero.");
);
if( diff == 1,
\\ x = u/v
if(n1 == 0,
\\ r.P = 1;
r[1] = 1;
\\ r.Q = 1;
r[2] = 1;
\\ r.B = b(0) = 1;
r[3] = 1;
\\ r.T = a(0) * r.P = 1 * u;
r[4] = 1 * r[1];
return(r);
, \\ else
\\ r.P = u;
r[1] = numerator(x);
\\ r.Q = n1 * v;
r[2] = n1 * denominator(x);
\\ r.B = b(n) = 1;
r[3] = 1;
\\ r.T = a(n) * r.P = 1 * u;
r[4] = 1 * r[1];
return(r);
);
);
\\ floor((n1 + n2)/2)
nm = (n1 + n2)\2;
L = exp_bin_split_rat_internal(n1, nm, x);
R = exp_bin_split_rat_internal(nm, n2, x);
\\ 1 2 3 4
\\ R, L, r = [P, Q, B, T]
\\ r.P = L.P * R.P;
r[1] = L[1] * R[1];
\\r.Q = L.Q * R.Q;
r[2] = L[2] * R[2];
\\r.B = L.B * R.B;
r[3] = L[3] * R[3];
\\r.T = R.B * R.Q * L.T + L.B * L.P * R.T;
r[4] = (R[3] * R[2] * L[4]) + (L[3] * L[1] * R[4]);
return(r);
}
exp_bin_split_rat(x, n) = {
local(r, ret);
r = exp_bin_split_rat_internal(0, n, x);
\\ r = [P, Q, B, T]
\\ S = T/(B*Q)
ret = r[4]/(r[3] * r[2]);
return(ret);
}
k = 1/1234;
tmp = exp_bin_split_rat(k, 10) * 1.0;print(tmp);tmp= exp(k);print(tmp);
tmp = exp_bin_split_rat(k, 100) * 1.0;print(tmp);tmp= exp(k);print(tmp);
tmp = exp_bin_split_rat(k, 1000) * 1.0;print(tmp);tmp= exp(k);print(tmp);
tmp = exp_bin_split_rat(k, 10000) * 1.0;print(tmp);tmp= exp(k);print(tmp);
(The last one might take a little bit longer, you might skip it if you want to run it.)
As you can see, it is not possible to reduce the error further after some dozen steps.
So it is either a misunderstanding on my side of how the algorithm works or how Pari/GP works. Which one is it and why?
You are going to hate yourself/PARI for this (but love stack-overflow).
In the first line you have:
local(diff, mn, L, R, r = vector(4));
Instead of:
local(diff, nm, L, R, r = vector(4));
I would suggest using my
rather than local
.
my(diff, nm, L, R, r = vector(4));