fortran openmp

Why does a segmentation fault occur in this OpenMP code?


main program:

program main                                                                                                                                                    
  use omp_lib                                                                                                                                                   
  use my_module                                                                                                                                                 
  implicit none                                                                                                                                                 

  integer, parameter :: nmax = 202000                                                                                                                           
  real(8) :: e_in(nmax) = 0.D0                                                                                                                                  
  integer i                                                                                                                                                     

call omp_set_num_threads(2)                                                                                                                                     
!$omp parallel default(firstprivate)                                                                                                                            
!$omp do                                                                                                                                                        
  do i=1,2                                                                                                                                                      
     print *, e_in(i)                                                                                                                                           
     print *, eTDSE(i)                                                                                                                                          
  end do                                                                                                                                                        
!$omp end do                                                                                                                                                    
!$omp end parallel                                                                                                                                              
end program main

module:

module my_module                                                                                                                                                
  implicit none                                                                                                                                                 

  integer, parameter, private :: ntmax = 202000                                                                                                  
  double complex :: eTDSE(ntmax) = (0.D0,0.D0)                                                                                                                  
!$omp threadprivate(eTDSE)                                                                                                                                      

end module my_module

compiled using:

ifort -openmp main.f90 my_module.f90

It gives a segmentation fault on execution. If I remove one of the print statements in the main program, it runs fine. It also runs fine if I remove the OpenMP function call and compile without the -openmp option.


Solution

  • The most probable cause of this behaviour is that your stack size limit is too small (for whatever reason). Because of default(firstprivate), e_in is private to each OpenMP thread, so one copy per thread is allocated on that thread's stack (even if you have specified -heap-arrays!). 202000 elements of REAL(KIND=8) take 1616 kB (1579 KiB, rounded up).
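The size estimate above can be reproduced with shell arithmetic and compared against the current limit. This is only a sketch: it accounts for a single copy of e_in and ignores everything else that lives on the stack.

```shell
# One private copy of e_in: 202000 elements of 8 bytes each.
bytes=$((202000 * 8))
# Round up to whole KiB, since ulimit -s works in KiB.
kib=$(( (bytes + 1023) / 1024 ))
echo "one copy of e_in: $bytes bytes, $kib KiB rounded up"

# Soft stack limit of the current shell, in KiB (or the word "unlimited").
limit=$(ulimit -s)
if [ "$limit" = "unlimited" ]; then
  echo "stack limit: unlimited"
elif [ "$limit" -lt "$kib" ]; then
  echo "stack limit $limit KiB is too small for e_in"
else
  echo "stack limit $limit KiB leaves $((limit - kib)) KiB beyond e_in"
fi
```

The first line printed is `one copy of e_in: 1616000 bytes, 1579 KiB rounded up`; the second depends on the limit of the shell it runs in.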

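    The stack size limit can be controlled by several mechanisms: ulimit -s in the shell sets the limit for the main thread (in KiB), while the OMP_STACKSIZE environment variable (standard OpenMP) or KMP_STACKSIZE (specific to the Intel run-time) sets the stack size of the additional threads.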

    Note that thread stacks are actually allocated with the size set by *_STACKSIZE (or with the default value), unlike the stack of the main thread, which starts small and then grows on demand up to the set limit. So don't set *_STACKSIZE to an arbitrarily large value, otherwise you may hit the process virtual memory size limit.
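As a rough sketch of choosing a value that is large enough without being arbitrary, the snippet below sizes the stack of the additional threads from the array size plus some headroom. OMP_STACKSIZE is the standard OpenMP environment variable; the 1024 KiB margin is an assumption made for illustration, not a measured requirement.

```shell
# Per-thread requirement: one private copy of e_in, rounded up to whole KiB.
need_kib=$(( (202000 * 8 + 1023) / 1024 ))
# Assumed headroom for everything else on the thread stack (a guess).
margin_kib=1024
stack_kib=$((need_kib + margin_kib))

# The "K" suffix tells the OpenMP run-time that the value is in KiB.
export OMP_STACKSIZE="${stack_kib}K"
echo "OMP_STACKSIZE=$OMP_STACKSIZE"
# The program would then be started from this shell, e.g. ./a.out
```

This sets OMP_STACKSIZE to 2603K. The main thread is not affected by this variable; its limit still comes from ulimit -s.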

    Here are some examples:

    $ ifort -openmp my_module.f90 main.f90
    

    Set the main stack size limit to 1 MiB (the additional OpenMP thread would get 4 MiB as per default):

    $ ulimit -s 1024
    $ ./a.out
    zsh: segmentation fault (core dumped)  ./a.out
    

    Set the main stack size limit to 1700 KiB:

    $ ulimit -s 1700
    $ ./a.out
      0.000000000000000E+000
     (0.000000000000000E+000,0.000000000000000E+000)
      0.000000000000000E+000
     (0.000000000000000E+000,0.000000000000000E+000)
    

    Set the main stack size limit to 2 MiB and the stack size of the additional thread to 1 MiB:

    $ ulimit -s 2048
    $ KMP_STACKSIZE=1m ./a.out
    zsh: segmentation fault (core dumped)  KMP_STACKSIZE=1m ./a.out
    

    On most Unix systems the stack size limit of the main thread is set by PAM or another login mechanism (see /etc/security/limits.conf). The default on Scientific Linux 6.3 is 10 MiB.

    Another scenario that can lead to an error is a virtual address space limit that is set too low. For example, if the virtual address space limit is 1 GiB and the thread stack size is set to 512 MiB, then the OpenMP run-time would try to allocate 512 MiB for each additional thread. With two additional threads the stacks alone would take 1 GiB, and once the space for code, shared libraries, heap, etc. is added, the virtual memory size would grow beyond 1 GiB and an error would occur.

    Set the virtual address space limit to 1 GiB and run with two additional threads with 512 MiB stacks (I have commented out the call to omp_set_num_threads()):

    $ ulimit -v 1048576
    $ KMP_STACKSIZE=512m OMP_NUM_THREADS=3 ./a.out
    OMP: Error #34: System unable to allocate necessary resources for OMP thread:
    OMP: System error #11: Resource temporarily unavailable
    OMP: Hint: Try decreasing the value of OMP_NUM_THREADS.
    forrtl: error (76): Abort trap signal
    ... trace omitted ...
    zsh: abort (core dumped)  OMP_NUM_THREADS=3 KMP_STACKSIZE=512m ./a.out
    

    In this case the OpenMP run-time library fails to create a new thread and notifies you before it aborts the program.
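The arithmetic behind this failure can be checked the same way. The sketch below uses the values from the example above and only counts the stacks of the additional threads; the variable names are made up for illustration.

```shell
vlimit_kib=1048576          # value passed to ulimit -v in the example (1 GiB)
stack_kib=$((512 * 1024))   # KMP_STACKSIZE=512m expressed in KiB
extra_threads=2             # OMP_NUM_THREADS=3 means two threads besides the main one

# Address space taken by the additional thread stacks alone.
stacks_kib=$((extra_threads * stack_kib))
# What remains for code, shared libraries, heap and the main stack.
left_kib=$((vlimit_kib - stacks_kib))
echo "thread stacks: $stacks_kib KiB, left for everything else: $left_kib KiB"
```

The stacks alone use up the whole 1 GiB, leaving nothing for the rest of the process, which is why thread creation fails.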