I'm working with ScaLAPACK and trying to get used to the BLACS routines, which are essential for using ScaLAPACK.
I've had an elementary course on MPI, so I have a rough idea of MPI_COMM_WORLD and so on, but no deep understanding of how it works internally.
Anyway, I'm trying the following code to say hello using BLACS routines.
program hello_from_BLACS
  use MPI
  implicit none

  integer :: info, nproc, nprow, npcol, &
             myid, myrow, mycol,        &
             ctxt, ctxt_sys, ctxt_all

  call BLACS_PINFO(myid, nproc)

  ! get the internal default context
  call BLACS_GET(0, 0, ctxt_sys)

  ! set up a process grid for the process set
  ctxt_all = ctxt_sys
  call BLACS_GRIDINIT(ctxt_all, 'c', nproc, 1)
  call BLACS_BARRIER(ctxt_all, 'A')

  ! set up a process grid of size 3*2
  ctxt = ctxt_sys
  call BLACS_GRIDINIT(ctxt, 'c', 3, 2)

  if (myid .eq. 0) then
    write(6,*) ' myid myrow mycol nprow npcol'
  endif

  call BLACS_BARRIER(ctxt_sys, 'A')   ! (**)

  ! all processes not belonging to 'ctxt' jump to the end of the program
  if (ctxt .lt. 0) goto 1000

  ! get the process coordinates in the grid
  call BLACS_GRIDINFO(ctxt, nprow, npcol, myrow, mycol)
  write(6,*) 'hello from process', myid, myrow, mycol, nprow, npcol

1000 continue

  ! return all BLACS contexts
  call BLACS_EXIT(0)
  stop
end program
and the output with 'mpirun -np 10 ./exe' looks like this:
hello from process 0 0 0 3 2
hello from process 4 1 1 3 2
hello from process 1 1 0 3 2
myid myrow mycol nprow npcol
hello from process 5 2 1 3 2
hello from process 2 2 0 3 2
hello from process 3 0 1 3 2
Everything seems to work fine except for the 'BLACS_BARRIER' line, which I marked with (**) in the code.
I put that line in so that the title line would always be printed at the top of the output, like this:
myid myrow mycol nprow npcol
hello from process 0 0 0 3 2
hello from process 4 1 1 3 2
hello from process 1 1 0 3 2
hello from process 5 2 1 3 2
hello from process 2 2 0 3 2
hello from process 3 0 1 3 2
So here are my questions:
I've tried BLACS_BARRIER with 'ctxt_sys', 'ctxt_all', and 'ctxt', but none of them produce output in which the title line is printed first. I've also tried MPI_Barrier(MPI_COMM_WORLD, info), but it didn't work either. Am I using the barriers in the wrong way?
In addition, I got SIGSEGV when I used BLACS_BARRIER with 'ctxt' and ran mpirun with more than 6 processes. Why does SIGSEGV occur in this case?
Thank you for reading this question.
To answer your 2 questions (in future it is best to give them separate posts):
1) MPI_Barrier, BLACS_BARRIER, and any barrier in any parallel programming methodology I have come across only synchronise the set of processes that actually call it. However, I/O is not handled just by the calling process, but by at least one, and quite possibly more, processes within the OS which actually service the I/O request. These are NOT synchronised by your barrier, so ordering of I/O is not ensured by a simple barrier. The only standard-conforming ways I can think of to ensure ordering of the output are to have a single process do all the I/O (gathering the data to it first if necessary), or to use MPI I/O.
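As a minimal sketch of the first option (this is a standalone toy program of my own, not a drop-in patch for your code): every rank sends its data to rank 0 of MPI_COMM_WORLD with MPI_Gather, and only rank 0 writes, so the title line is guaranteed to come first and the rows come out in rank order. In your program you would gather the five integers (myid, myrow, mycol, nprow, npcol) instead of just the rank.
program ordered_hello
  use MPI
  implicit none
  integer :: ierr, merank, mesize, p
  integer :: sendbuf(1)
  integer, allocatable :: recvbuf(:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, merank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, mesize, ierr)

  ! each process contributes one integer; in the real program this would
  ! be the (myid, myrow, mycol, nprow, npcol) data instead
  sendbuf(1) = merank
  allocate(recvbuf(mesize))
  call MPI_Gather(sendbuf, 1, MPI_INTEGER, recvbuf, 1, MPI_INTEGER, &
                  0, MPI_COMM_WORLD, ierr)

  ! only rank 0 touches the output unit, so the ordering is under its control
  if (merank .eq. 0) then
     write(6,*) '  rank'
     do p = 1, mesize
        write(6,*) 'hello from process', recvbuf(p)
     end do
  end if

  deallocate(recvbuf)
  call MPI_Finalize(ierr)
end program ordered_hello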
2) Your second call to BLACS_GRIDINIT,
call BLACS_GRIDINIT(ctxt, 'c', 3, 2)
creates a context for a 3 by 2 process grid, so holding 6 processes. If you call it with more than 6 processes, only 6 will be returned a valid context; for the others ctxt should be treated as an uninitialised value. So, for instance, if you call it with 8 processes, 6 will return with a valid ctxt and 2 will return with ctxt having no valid value. If these 2 now try to use ctxt, anything is possible, and in your case you are getting a seg fault. You do seem to see that this is an issue, as later you have
! all processes not belonging to 'ctxt' jump to the end of the program
if (ctxt .lt. 0) goto 1000
but I see nothing in the description of BLACS_GRIDINIT that ensures ctxt will be less than zero for non-participating processes - at https://www.netlib.org/blacs/BLACS/QRef.html#BLACS_GRIDINIT it says
This routine creates a simple NPROW x NPCOL process grid. This process grid will use the first NPROW x NPCOL processes, and assign them to the grid in a row- or column-major natural ordering. If these process-to-grid mappings are unacceptable, BLACS_GRIDINIT's more complex sister routine BLACS_GRIDMAP must be called instead.
There is no mention of what ctxt will be if the process is not part of the resulting grid - this is the kind of problem I find regularly with the BLACS documentation. Also, please don't use goto, for your own sake; you WILL regret it later. Use if ... end if instead. I can't remember when I last used goto in Fortran - it may well be over 10 years ago.
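Putting both points together, here is a rough sketch of how I would restructure the grid set-up. It is only a suggestion, not verbatim from any BLACS example: it decides participation from the process id, relying on the documented statement that GRIDINIT uses the first NPROW x NPCOL processes, rather than on the undocumented value of ctxt, and it uses if ... end if rather than goto.
program hello_grid_safe
  implicit none
  integer, parameter :: nprow_in = 3, npcol_in = 2
  integer :: myid, nproc, ctxt, nprow, npcol, myrow, mycol

  call BLACS_PINFO(myid, nproc)

  ! get the system context and build the 3*2 grid from it, as in the
  ! original program every process makes this call
  call BLACS_GET(0, 0, ctxt)
  call BLACS_GRIDINIT(ctxt, 'c', nprow_in, npcol_in)

  ! the grid holds the first nprow_in*npcol_in processes, so decide
  ! participation from the process id and never touch ctxt otherwise
  if (myid .lt. nprow_in*npcol_in) then
     call BLACS_GRIDINFO(ctxt, nprow, npcol, myrow, mycol)
     write(6,*) 'hello from process', myid, myrow, mycol, nprow, npcol
     call BLACS_GRIDEXIT(ctxt)
  end if

  call BLACS_EXIT(0)
end program hello_grid_safe
Note that this on its own still does not make the title line come out first; for that you still need the gather-and-print approach from point 1.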
Finally good luck in using BLACS! In my experience the documentation is often incomplete, and I would suggest only using those calls that are absolutely necessary to use ScaLAPACK and using MPI, which is much, much better defined, for the rest. It would be so much nicer if ScaLAPACK just worked with MPI nowadays.