fortrancompiler-optimizationgfortran

gfortran performs unsafe math optimizations of cos(atan(x))


I'm researching an issue where gfortran returns in rare cases different result depending on optimization level even with compile options -fno-fast-math and -fno-unsafe-math-optimizations so that result should not depend on optimization level. I tested this with gfortran-11.2.1, gfortran-11.4.1 (both RedHat) and gfortran-14.2 (godbolt). For reproducible results I also need to use OpenLibM math library instead of gcc math library, so I will continue discussion on godbolt disassembly.

The result difference comes from the calculation y = cos(atan(x)) which is replaced by y = 1/sqrt(1+x^2) for -O1 or higher: compile

double precision function foo(x)
implicit none
double precision, intent(in) :: x
foo = cos(atan(x))
end function foo

with gfortran -c file.f90 -O1, see https://godbolt.org/z/c9b4rvhdW and check created code: no call to cos or atan, but instead sqrt is used. If -O0 is used the atan and cos functions are used.

From documentation I would expect that -fno-fast-math or -fno-unsafe-math-optimizations should disable this optimization, but it looks like these have no effect on gfortran. Only -fno-tree-forwprop disables this optimizations so that same results as with -O0 are returned.

For gcc C compiler behaves as expected: for the C variant of same code (https://godbolt.org/z/E7Mzh6Taq) this optimization is only done by compiler if -ffast-math is specified, but not for default -O1.

Is using -O0 or '-fno-tree-forwprop` the only option to disable these math optimizations? I'd like to be sure to get identical results for all optimization levels but keeping performance as good as possible.


Solution

  • What prevents you from redefining built-in functions with standard Fortran language tools? Define a module and use it everywhere, especially since you are going to use the OpenLibM implementation.

    module c_replace_builtins
        implicit none
    
        interface 
            function atan(x) bind(c)
                use iso_c_binding
                real(c_double) atan
                real(c_double), value :: x
            end function
    
            function cos(x) bind(c)
                use iso_c_binding
                real(c_double) cos
                real(c_double), value :: x
            end function
        end interface
    end module c_replace_builtins
    
    program test
        implicit none
    
        integer :: i
        double precision :: x
    
        do i = 0, 10
            x = 10.d0**(-i)
            print *, x, foo(x) - cos(atan(x))
        end do
    
    contains
        double precision function foo(x)
            use c_replace_builtins
                    ! We use it only here so that the built-in
                    ! functions in the main program do not change.
    
            implicit none
            double precision, intent(in) :: x
            foo = cos(atan(x))
        end function foo
    end program test
    

    P.S.

    To prevent optimization of calls to built-in functions, you can do the same and define an external module that overrides the built-in functions. This will probably be no less efficient than using an intermediate volatile variable, and most likely it will be more efficient.

    f_replace_builtins.f90:

    module f_builtins
        implicit none
    
    contains
        double precision function builtin_atan(x)
            double precision, intent(in) :: x
            builtin_atan = atan(x)
        end function
    
        double precision function builtin_cos(x)
            double precision, intent(in) :: x
            builtin_cos = cos(x)
        end function
    end module f_builtins
    
    module f_replace_builtins
        use f_builtins
    
        implicit none
    
    contains
        double precision function atan(x) result(res)
            double precision, intent(in) :: x
            res = builtin_atan(x)
        end function
    
        double precision function cos(x) result(res)
            double precision, intent(in) :: x
            res = builtin_cos(x)
        end function
    end module f_replace_builtins