I have a simple MPI Fortran code, shown below. It crashes with the error forrtl: severe (174): SIGSEGV, segmentation fault occurred, and I am not sure where the mistake is. The weird thing is that it doesn't always crash: for small n it sometimes works and sometimes doesn't, and it works for some numbers of processors but not for others. For the example given below it fails for any number of processors. Debugging MPI is not easy for me. Can anybody find what's wrong here?
```fortran
program crash
  use mpi
  implicit none
  integer,parameter::dp=kind(1.d0)
  integer, parameter :: M=1500, N=M, O=M   ! Matrix dimension
  integer myrank, numprocs, ierr, root
  integer i, j, k, l, p, local_n, sendcounts
  real(dp) R1(M), RHS(M), RHS1(M)
  real(dp),dimension(:),allocatable::local_A
  real(dp),dimension(:,:),allocatable::local_c,local_c1
  real(dp) summ, B(N,O), B1(N,O), C(M*O), C1(M)
  real(dp) final_product(M,O), rhs_product(O)
  integer,dimension(:),allocatable::displs !,displs1,displs2
  integer,dimension(:),allocatable::sendcounts_list
  real(dp),dimension(:,:),allocatable::local_A_Matrix
  integer status(MPI_STATUS_SIZE)
  integer request

  ! Initialize MPI
  call MPI_Init(ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, numprocs, ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, myrank, ierr)

  B=0.d0
  do i=1,N
     do j=1,O
        B(i,j)=(i+1)*myrank+j*myrank+j*i
     enddo
  enddo

  R1=0.d0
  do i=1,N
     R1(i)=i*myrank+1
  enddo

  if (myrank<numprocs-mod(M,numprocs)) then
     local_n=M/numprocs
  else
     local_n=M/numprocs+1
  endif

  sendcounts = local_n * N

  allocate(sendcounts_list(numprocs))
  call MPI_AllGATHER(local_n, 1, MPI_INT, sendcounts_list, 1, MPI_INT, MPI_COMM_WORLD, IERR)

  if(myrank==0) then
     allocate(displs(numprocs))
     displs=0
     do i=2,numprocs
        displs(i) = displs(i-1)+N*sendcounts_list(i-1)
     enddo
  endif

  allocate(local_A(sendcounts))
  local_A=0.d0
  call MPI_Scatterv(Transpose(B), N*sendcounts_list, displs, MPI_Double, local_A, N*local_n, &
       MPI_Double, 0, MPI_COMM_WORLD, ierr)

  deallocate(sendcounts_list)
  if(myrank==0) then
     deallocate(displs)
  endif

  allocate(local_A_Matrix(local_n,N))
  local_A_Matrix=reshape(local_A,(/local_n,N/),order=(/2,1/))
  deallocate(local_A)

  call MPI_Finalize(ierr)
end program crash
```
The code is now working properly, which surprises me. The idea @lastchance gave me to use -heap-arrays helped somewhat: it worked sometimes and not other times, which was bothering me. The only thing I changed now is that I made all the arrays allocatable, no matter how small they are. This solved my problem. I still don't know why, and right now I don't care, as I am a math/physics researcher, not a computer scientist. Just in case anyone comes across this kind of issue: try allocating all the arrays. I just came back to thank all of you. You guys are awesome.
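For anyone landing here later, a likely explanation (an inference from the symptoms, not something confirmed in the thread): fixed-size local arrays such as B(N,O), B1(N,O), C(M*O), and final_product(M,O) are each roughly 17 MB of double precision data, and the expression Transpose(B) passed to MPI_Scatterv typically forces the compiler to build yet another array temporary of the same size. Such arrays and temporaries usually live on the stack, which is often limited to a few MB, so the program segfaults whenever the total happens to exceed the limit. That would explain why both -heap-arrays and switching to allocatable arrays (which are placed on the heap) made the crash go away. A minimal sketch of that pattern, with the transpose stored in a named allocatable array instead of a hidden temporary:

```fortran
program heap_not_stack
  implicit none
  integer, parameter :: dp = kind(1.d0)
  integer, parameter :: M = 1500, N = M, O = M
  ! Allocatable arrays are placed on the heap, so their size is
  ! not limited by the stack limit (ulimit -s).
  real(dp), allocatable :: B(:,:), Bt(:,:)
  integer :: i, j

  allocate(B(N,O), Bt(O,N))

  do j = 1, O
     do i = 1, N
        B(i,j) = real(i + j, dp)
     enddo
  enddo

  ! Storing the transpose explicitly avoids the compiler-generated
  ! stack temporary that an actual argument like Transpose(B) creates.
  Bt = transpose(B)

  ! ... pass Bt to MPI_Scatterv here instead of Transpose(B) ...

  deallocate(B, Bt)
end program heap_not_stack
```

Two smaller things worth checking in the original code, though they would not cause this intermittent crash: displs is only allocated on rank 0 (it is significant only at the root, but some compilers object to passing an unallocated allocatable), and some Fortran MPI implementations only define MPI_INTEGER and MPI_DOUBLE_PRECISION, not the C-style names MPI_INT and MPI_Double.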