Tags: fortran, binaryfiles, intel-fortran, fortran2003

Writing large Fortran binary files with access=stream


I am having some problems understanding the formatting of binary files that I am writing using Fortran. I use the following subroutine to write binary files to disk:

SUBROUTINE write_field(d,m,outfile)

    IMPLICIT NONE    
    INTEGER, INTENT(IN) :: m              ! declared first: m sizes the array below
    REAL, INTENT(IN) :: d(m,m,m)
    CHARACTER(len=256), INTENT(IN) :: outfile

    OPEN(7,file=outfile,form='unformatted',access='stream')
    WRITE(7) d
    CLOSE(7)

END SUBROUTINE write_field

My understanding of the access='stream' option was that it would suppress the standard header and footer that come with a Fortran binary (see Fortran unformatted file format).

If I write a file with m=512, I expect it to be 4 x 512^3 bytes = 536870912 bytes (512 MiB). However, the file is in fact 8 bytes longer than this, coming in at 536870920 bytes. My guess is that these extra bytes are a 4-byte header and a 4-byte footer, which I had wanted to suppress by using access='stream'.

The situation becomes more confusing if I write a file with m=1024. I expect it to be 4 x 1024^3 bytes = 4294967296 bytes (4 GiB). However, the file is in fact 24(!) bytes longer than this, coming in at 4294967320 bytes. I do not understand why there are 24 extra bytes here, which would seem to correspond to 6(!) headers or footers.

My questions are:

(a) Is it possible to get Fortran to write a binary with no headers or footers?

(b) If the answer to (a) is 'no' then can I ensure that the larger binary has the same header and footer structure as the smaller binary?

(c) If the answers to (a) and (b) are both 'no' then how do I work out where these extra headers and footers are in the file?

I am using ifort (version 14.0.2) and I am writing the binary files on a small Linux cluster.

UPDATE: When running the same code on OS X, compiled with gfortran 7.3.0, the binary files come out with the expected sizes: they are always 4 x m^3 bytes, even when m=1024. So this problem seems to be related to the older compiler.

UPDATE: In fact, the problem is only present when using ifort 14.0.2. I have updated the text to reflect this.


Solution

  • This problem is solved by adding status='replace' to the Fortran OPEN statement. It has nothing to do with the compiler.

    With access='stream' and without status='replace', an existing binary file is not replaced by the new one; it is simply overwritten in place, starting from the first byte (https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/676047). The bytes of the old file are replaced up to the size of the new data, while any bytes beyond that point, and therefore the file size, are left unchanged. This is a problem whenever the new data is smaller than the old file. The problem is difficult to diagnose because the time-stamp on the file is updated, so the file looks new when queried using ls -l.

    A minimal working example that recreates this problem is as follows:

    PROGRAM write_binary_test_minimal
    
        IMPLICIT NONE
        REAL :: a
    
        a=1.
    
        OPEN(7,file='test',form='unformatted')
        WRITE(7) a
        CLOSE(7)
    
        OPEN(7,file='test',form='unformatted',access='stream')
        WRITE(7) a
        CLOSE(7)
    
    END PROGRAM write_binary_test_minimal
    

    The first write generates a file 'test' of size 8 + 4 = 12 bytes, where the 8 is the standard Fortran-binary header and footer (4 bytes each) and the 4 is the size in bytes of a. In the second write, even though access='stream' has been set, only the first 4 bytes of the previously generated 'test' are overwritten, leaving the file at 12 bytes! The solution is to change the second OPEN statement to

    OPEN(7,file='test',form='unformatted',access='stream',status='replace')
    

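    with an explicit status='replace' to ensure the old file is replaced.

    This probably also accounts for the extra bytes in the question: if an earlier run wrote the same file name with sequential access, the 8 (m=512) or 24 (m=1024) trailing bytes would be left over from that older, longer file. The 24 is consistent with a 4 GiB record being split into three subrecords, each with its own 4-byte header and footer, although I have not confirmed this.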
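
To check the behaviour on a given compiler, the file size can be queried from within Fortran after each write. The following is only a sketch, not a definitive test: the file name test_stream.bin and the variable nbytes are made up for illustration, and it assumes a 4-byte default REAL, 4-byte record markers, and a compiler that supports the Fortran 2003 size= specifier of INQUIRE.

```fortran
PROGRAM check_stream_size

    IMPLICIT NONE
    REAL :: a
    INTEGER :: nbytes
    ! Hypothetical file name, used only for this illustration
    CHARACTER(len=*), PARAMETER :: fname = 'test_stream.bin'

    a = 1.

    ! Step 1: sequential unformatted write (data plus record markers)
    OPEN(7,file=fname,form='unformatted',status='replace')
    WRITE(7) a
    CLOSE(7)
    INQUIRE(file=fname,size=nbytes)
    PRINT *, 'sequential write:            ', nbytes, ' bytes'

    ! Step 2: stream write over the existing file, no status given.
    ! Whether the stale trailing bytes survive may depend on the compiler.
    OPEN(7,file=fname,form='unformatted',access='stream')
    WRITE(7) a
    CLOSE(7)
    INQUIRE(file=fname,size=nbytes)
    PRINT *, 'stream, no status:           ', nbytes, ' bytes'

    ! Step 3: stream write with status='replace' (old file is discarded)
    OPEN(7,file=fname,form='unformatted',access='stream',status='replace')
    WRITE(7) a
    CLOSE(7)
    INQUIRE(file=fname,size=nbytes)
    PRINT *, 'stream, status=''replace'':   ', nbytes, ' bytes'

    ! Clean up the scratch file
    OPEN(7,file=fname,form='unformatted',access='stream')
    CLOSE(7,status='delete')

END PROGRAM check_stream_size
```

Under the assumptions above, step 3 should report only the size of a; if step 2 reports a larger number than step 3, the stale bytes from step 1 were left in place. For files larger than 2 GiB, such as the m=1024 case, nbytes would need a wider integer kind (for example one obtained from SELECTED_INT_KIND(18)), because a default INTEGER cannot hold the size.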