ctaylor-series

Error of Taylor Series e^x for negative x


I was calculating e^x using Taylor Series and noticed that when we calculate it for negative x absolute error is large.Is it because we don't have enough precision to calculate it?

(I know that to prevent it we can use e^(-x)=1/e^x)

#include <stdio.h>
#include <math.h>

double Exp(double x);

int main(void)
{
    double x;
    printf("x="); 
    scanf("%le", &x);

    printf("%le", Exp(x)); 
    return 0;
}

double Exp(double x)
{
    double h, eps = 1.e-16, Sum = 1.0; 
    int i = 2;

    h = x; 

    do
    {
        Sum += h; 
        h *= x / i;
        i++;
    } while (fabs(h) > eps);

    return Sum ; 
}

For example: x=-40 the value is 4.24835e-18 but programm gives me 3.116952e-01.The absolute error is ~0.311

x=-50 the value is 1.92875e-22 programm gives me 2.041833e+03.The absolute error is ~2041.833


Solution

  • The problem is caused by rounding errors at the middle phase of the algorithm. The h is growing quickly as 40/2 * 40/3 * 40 / 4 * ... and oscillating in sign. The values for i, h and Sum for x=-40 for consecutive iterations can be found below (some data points omitted for brevity):

    x=-40
    i=2 h=800 Sum=-39
    i=3 h=-10666.7 Sum=761
    i=4 h=106667 Sum=-9905.67
    i=5 h=-853333 Sum=96761
    i=6 h=5.68889e+06 Sum=-756572
    ...
    i=37 h=-1.37241e+16 Sum=6.63949e+15
    i=38 h=1.44464e+16 Sum=-7.08457e+15
    i=39 h=-1.48168e+16 Sum=7.36181e+15
    i=40 h=1.48168e+16 Sum=-7.45499e+15
    i=41 h=-1.44554e+16 Sum=7.36181e+15
    i=42 h=1.37671e+16 Sum=-7.09361e+15
    i=43 h=-1.28066e+16 Sum=6.67346e+15
    i=44 h=1.16423e+16 Sum=-6.13311e+15
    i=45 h=-1.03487e+16 Sum=5.50923e+15
    i=46 h=8.99891e+15 Sum=-4.83952e+15
    ...
    i=97 h=-2610.22 Sum=1852.36
    i=98 h=1065.4 Sum=-757.861
    i=99 h=-430.463 Sum=307.534
    ...
    i=138 h=1.75514e-16 Sum=0.311695
    i=139 h=-5.05076e-17 Sum=0.311695
    3.116952e-01
    

    The peak magnitude of sum is 7e15. This is where the precision is lost. Type double can be represented with about 1e-16 accuracy. This gives expected absolute error of about 0.1 - 1. As the expected sum (value of exp(-40) is close to zero the final absolute error is close to the maximal absolute error of the partial sums.

    For x=-50 the peak value of sum is 1.5e20 what gives the absolute error due to finite representation of double at about 1e3 - 1e4 what is close to observed one.

    Not much can be fixed without significant changes to algorithm to avoid forming those partial sums. Alternatively, compute exp(-x) as 1/exp(x).