parallel-processing compiler-errors compilation fortran mpi

MPI_Scatterv occasionally crashing for large N


I have a simple MPI Fortran code, shown below. It crashes with the error forrtl: severe (174): SIGSEGV, segmentation fault occurred, and I am not sure where the mistake is. The weird thing is that it doesn't always crash: sometimes it works for small n and sometimes not, and it works for some numbers of processors but not for others. For the example given here it fails for any number of processors. Debugging MPI is not easy for me. Can anybody spot what's wrong?

    program crash
      use mpi
      implicit none
      integer, parameter :: dp = kind(1.d0)
      integer, parameter :: M = 1500, N = M, O = M   ! matrix dimensions
      integer :: myrank, numprocs, ierr, root
      integer :: i, j, k, l, p, local_n, sendcounts
      real(dp) :: R1(M), RHS(M), RHS1(M)
      real(dp), dimension(:), allocatable :: local_A
      real(dp), dimension(:,:), allocatable :: local_c, local_c1
      real(dp) :: summ, B(N,O), B1(N,O), C(M*O), C1(M)
      real(dp) :: final_product(M,O), rhs_product(O)
      integer, dimension(:), allocatable :: displs !, displs1, displs2
      integer, dimension(:), allocatable :: sendcounts_list
      real(dp), dimension(:,:), allocatable :: local_A_Matrix
      integer :: status(MPI_STATUS_SIZE)
      integer :: request

      ! Initialize MPI
      call MPI_Init(ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, numprocs, ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, myrank, ierr)

      ! Fill the test matrix B and the vector R1
      B = 0.d0
      do i = 1, N
        do j = 1, O
          B(i,j) = (i+1)*myrank + j*myrank + j*i
        enddo
      enddo

      R1 = 0.d0
      do i = 1, N
        R1(i) = i*myrank + 1
      enddo

      ! Split the M rows as evenly as possible across the ranks
      if (myrank < numprocs - mod(M, numprocs)) then
        local_n = M/numprocs
      else
        local_n = M/numprocs + 1
      endif

      ! This rank's element count, plus every rank's row count
      sendcounts = local_n*N
      allocate(sendcounts_list(numprocs))
      call MPI_Allgather(local_n, 1, MPI_INT, sendcounts_list, 1, MPI_INT, &
                         MPI_COMM_WORLD, ierr)

      ! Displacements for Scatterv (only needed on the root)
      if (myrank == 0) then
        allocate(displs(numprocs))
        displs = 0
        do i = 2, numprocs
          displs(i) = displs(i-1) + N*sendcounts_list(i-1)
        enddo
      endif

      ! Scatter blocks of rows of B (transposed so each rank's rows are contiguous)
      allocate(local_A(sendcounts))
      local_A = 0.d0
      call MPI_Scatterv(Transpose(B), N*sendcounts_list, displs, MPI_Double, &
                        local_A, N*local_n, MPI_Double, 0, MPI_COMM_WORLD, ierr)
      deallocate(sendcounts_list)
      if (myrank == 0) then
        deallocate(displs)
      endif

      ! Rebuild the received buffer as a local_n x N matrix
      allocate(local_A_Matrix(local_n, N))
      local_A_Matrix = reshape(local_A, (/local_n, N/), order=(/2,1/))

      deallocate(local_A)

      call MPI_Finalize(ierr)

    end program crash

Solution

  • The code is now working properly, which surprises me. The -heap-arrays idea @lastchance gave me helped a bit: it worked some of the time but not always, which kept bothering me. In the end the only thing I changed is that I allocate all the arrays, no matter how small they are, and that solved my problem. I still don't know why, and right now I don't really care, as I am a math/physics researcher, not a computer scientist. If anyone comes across this kind of issue, try making all the arrays allocatable and see (a minimal sketch of that change is below). I just came back to thank all of you. You guys are awesome.
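
For anyone who wants something concrete, the sketch below shows the kind of change described above, reusing the array names from the program in the question. The explanation in the comments is my best understanding rather than a verified diagnosis: large local arrays and the temporaries the compiler builds (for example for Transpose(B), which is roughly 18 MB when M = 1500) can end up on the stack and exceed the default stack limit, which Intel Fortran reports as forrtl: severe (174); ALLOCATE puts the data on the heap instead, which is also what -heap-arrays does for compiler-generated temporaries.

    ! Minimal sketch: declare the big arrays allocatable and allocate them
    ! after MPI_Init; the rest of the original program stays unchanged.
    program crash_allocatable_sketch
      use mpi
      implicit none
      integer, parameter :: dp = kind(1.d0)
      integer, parameter :: M = 1500, N = M, O = M
      integer :: myrank, numprocs, ierr
      ! previously fixed-size locals, now heap-allocated
      real(dp), allocatable :: B(:,:), B1(:,:), final_product(:,:)
      real(dp), allocatable :: R1(:), RHS(:), RHS1(:), C(:), C1(:), rhs_product(:)

      call MPI_Init(ierr)
      call MPI_Comm_size(MPI_COMM_WORLD, numprocs, ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, myrank, ierr)

      ! ~18 MB per N x O array at M = 1500: too large for a typical default
      ! stack, unproblematic on the heap
      allocate(B(N,O), B1(N,O), final_product(M,O))
      allocate(R1(M), RHS(M), RHS1(M), C(M*O), C1(M), rhs_product(O))

      ! ... the rest of the original program, unchanged ...

      deallocate(B, B1, final_product, R1, RHS, RHS1, C, C1, rhs_product)
      call MPI_Finalize(ierr)
    end program crash_allocatable_sketch

Compiling the original code with ifort's -heap-arrays option (or raising the shell's stack limit, e.g. ulimit -s unlimited) should attack the same problem from the compiler/shell side instead of the source side.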