Tags: parallel-processing, fortran, fortran-coarrays

Coarray sync all failing


I'm learning to use Fortran's coarrays, but I'm seeing some very strange behaviour that I don't understand. I run the following very simple loop:

do i = 1, 100
    print*, this_image(), i
    sync all
    if (this_image() .eq. 1) print*
    sync all
end do

I expect each image to print its index and the value of the loop counter, then wait for the first image to print a blank line before continuing, e.g.

1 1
2 1
4 1 
3 1

2 2
3 2 
4 2 
1 2

...

and so on. Instead I get the following output:

       1           1
       2           1
       4           1

       3           1
       3           2
       1           2

       1           3

       1           4
       2           2
       2           3
       2           4
       3           3
       3           4
       3           5
       4           2
       4           3
       4           4
       4           5
       4           6

       1           5

       ...

So it appears to me that the threads are not obeying the sync all statements. I do, however, observe quite different behaviour if I remove the sync all statements.

Am I misunderstanding what sync all does? I thought it was a global barrier that prevented any thread from continuing until all threads had completed their current tasks. Thus each thread should finish the first print, then the first thread must finish printing the new line, before any thread can move on to the second loop. Or am I wrong?


Solution

  • sync all is not sufficient to ensure the output ordering that you require. Quoting Metcalf, Reid, Cohen and Bader in "Modern Fortran Explained Incorporating Fortran 2023", section 17.20:

    The default unit for output ( * in a write statement or output_unit in the intrinsic module iso_fortran_env ) and ... are preconnected on each image. The files to which these are connected are regarded as separate, but it is expected that a processor will merge their records into a single stream or a stream for output_unit and ... . Synchronization and flush statements might be sufficient to control the ordering of the records, but this is not guaranteed.

    This is essentially the same as in MPI, where the only ways to guarantee the ordering of records are to use MPI-IO or to have a single process perform all the I/O.
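
As a rough illustration of the "single image performs all the I/O" approach, here is a minimal sketch. It is not from the book: the coarray `val`, the program name and the shortened loop count `nsteps` are hypothetical names chosen for the example. Each image stores its value in a coarray, and image 1 reads every image's value and prints it, so only one image ever writes to the default output unit.

```fortran
program ordered_output
    implicit none
    integer, parameter :: nsteps = 3   ! hypothetical; the question uses 100
    integer :: i, img
    integer :: val[*]                  ! hypothetical coarray: one integer per image

    do i = 1, nsteps
        val = i                        ! each image stores its own value locally
        sync all                       ! every image has stored val before it is read
        if (this_image() == 1) then
            ! Only image 1 prints, so the records cannot interleave.
            do img = 1, num_images()
                print *, img, val[img] ! remote read of image img's copy of val
            end do
            print *
        end if
        sync all                       ! no image overwrites val until image 1 is done
    end do
end program ordered_output
```

With this arrangement the lines within each block appear in image order (1, 2, 3, ...) rather than in arrival order, because image 1 controls the sequence. The second sync all matters: without it, a fast image could move on to the next iteration and overwrite val before image 1 has read it.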