calgorithmmathtrigonometry

Fast & accurate atan/arctan approximation algorithm


Any fast & accurate atan/arctan approximation function/algorithm out there? Input: x = (0, 1] (minimum x requirement). Output: in radians

double FastArcTan(double x)
{
    return M_PI_4*x - x*(fabs(x) - 1)*(0.2447 + 0.0663*fabs(x));
}

I've found this function above online but it gave a max error of 1.6 radians which is too big.


Solution

  • OP reported a max error of 1.6 radians (92 degrees) which is inconsistent with testing OP's code below, max error input x: 0 to 1 range of about 0.0015 radians. I suspect a mis-code or testing outside the range of 0...1. Did OP means 1.6 milli-radians?


    Perhaps a more accurate a*x^5 + b*x^3 + c*x will still be fast enough for OP. It is about 4x more accurate on my machine on average and 2x better worst case. It uses an optimal 3-term polynomial as suggested by @Foon and @Matthew Pope

    #include <math.h>
    #include <stdio.h>
    
    #ifndef M_PI_4
    #define M_PI_4 (3.1415926535897932384626433832795/4.0)
    #endif
    
    double FastArcTan(double x) {
      return M_PI_4*x - x*(fabs(x) - 1)*(0.2447 + 0.0663*fabs(x));
    }
    
    #define A 0.0776509570923569
    #define B -0.287434475393028
    #define C (M_PI_4 - A - B)
    #define FMT "% 16.8f"
    
    double Fast2ArcTan(double x) {
      double xx = x * x;
      return ((A*xx + B)*xx + C)*x;
    }
    
    int main() {
      double mxe1 = 0, mxe2 = 0;
      double err1 = 0, err2 = 0;
      int n = 100;
      for (int i=-n;i<=n; i++) {
        double x = 1.0*i/n;
        double y = atan(x);
        double y_fast1 = FastArcTan(x);
        double y_fast2 = Fast2ArcTan(x);
        printf("%3d x:% .3f y:" FMT "y1:" FMT "y2:" FMT "\n", i, x, y, y_fast1, y_fast2);
        if (fabs(y_fast1 - y) > mxe1 ) mxe1  = fabs(y_fast1 - y);
        if (fabs(y_fast2 - y) > mxe2 ) mxe2  = fabs(y_fast2 - y);
        err1 += (y_fast1 - y)*(y_fast1 - y);
        err2 += (y_fast2 - y)*(y_fast2 - y);
      }
      printf("max error1: " FMT "sum sq1:" FMT "\n", mxe1, err1);
      printf("max error2: " FMT "sum sq2:" FMT "\n", mxe2, err2);
    }
    

    Output

     ...
     96 x: 0.960 y:      0.76499283y1:      0.76582280y2:      0.76438526
     97 x: 0.970 y:      0.77017091y1:      0.77082844y2:      0.76967407
     98 x: 0.980 y:      0.77529750y1:      0.77575981y2:      0.77493733
     99 x: 0.990 y:      0.78037308y1:      0.78061652y2:      0.78017777
    100 x: 1.000 y:      0.78539816y1:      0.78539816y2:      0.78539816
    max error1:       0.00150847sum sq1:      0.00023062
    max error2:       0.00084283sum sq2:      0.00004826
    

    Unclear why OP's code uses fabs() given "Input: x = (0, 1]".


    [Update 2024]

    For low precision functions such as double FastArcTan(double x), we can optimize the error at least 3 ways: error, relative error and unit in the last place error. With float x values in the range [0 ... 1.0], we can test every float and report the worst case.

    Error is simply fast_f(x) - reference(x).

    Relative error is (fast_f(x) - reference(x))/reference(x) with special consideration when reference(x) is 0 or a sub-normal number - using (fast_f(x) - reference(x))/FLT_MIN in those cases.

    Unit in the last place (ULP) error is the the number of float away from the best answer. Example. It is approximately (fast_f(x) - reference(x))/round_down_to_a_power_of_2(reference(x)).

    OP, with "gave a max error of 1.6 radians" wants to optimize the error.

    In my experience, usually it is the relative error or ULP error that is desired to minimize.

    This update reviews code to see if we can do better with some restrictions:

    Test results:

                           :     radians,           %,         ULP
    atanf (C library)      : 0.000000051, 0.000008442,       0.852
    FastArcTan -OP         : 0.001508884, 5.882352941,  490209.000
    Fast2ArcTan-This answer: 0.000843262, 0.961538462,   80838.000
    atan_abs3-optimize err : 0.000704068, 1.041666667,   87799.000
    atan_rel3-optimize rel : 0.001353447, 0.370370370,   31179.000
    atan_ulp3-optimize UDP : 0.001581718, 0.315457413,   26536.822
    atan_ulp9-optimize UDP : 0.000000095, 0.000012706,       1.592
    atan_jw    -jw         : 0.000157066, 0.019998672,    2635.133
    atan_jw_alt-optimize   : 0.000006579, 0.001461326,     111.129
    

    atan_abs3() achieves a better result than Fast2ArcTan() for OP's goals.

    I recommend atan_ulp3() for those wanting a few term low relative error.


    float atan_abs3(float x) {
      const float a[3] = { //
          0.994766756708199f, -2.8543851807526100E-01f, 0.0760699247645105f};
      float xx = x * x;
      return ((a[2] * xx + a[1]) * xx + a[0]) * x;
    }
    
    float atan_rel3(float x) {
      const float a[3] = { //
          0.998141572179073f, -2.9774125603673200E-01f, 0.0849978472551071f};
      float xx = x * x;
      return ((a[2] * xx + a[1]) * xx + a[0]) * x;
    }
    float atan_ulp3(float x) {
      const float a[3] = { // 
          0.998418889819911f, -2.9993501171084700E-01f, 0.0869142852883849f};
      float xx = x * x;
      return ((a[2] * xx + a[1]) * xx + a[0]) * x;
    }
    
    float atan_ulp9(float x) {
      static const float a[9] = { {1.0f, -0.333331728467737f, 0.199940412794435f,
          -0.142123340834229f, 0.10668127080775f, -0.0755120841589429f,
          0.0431408641542157f, -0.0162911733512761f, 0.00289394245323327f}};
      float xx = x * x;
      float sum = 0.0;
      for (unsigned i = 9; i-- > 0;) {
        sum = sum * xx + a[i];
      }
      return sum * x;
    }
    
    float atan_jw(float x) {
      return 8 * x / (3 + sqrtf(25 + 80.0f / 3.0f * x * x));
    }
    
    float atan_jw_alt(float x) {
      return 8.430893743524f * x / (3.2105332277903100f + sqrtf(27.2515970979709f + 29.3591908371266f * x * x));
    }