I am having some problems understanding the formatting of binary files that I am writing using Fortran. I use the following subroutine to write binary files to disk:
SUBROUTINE write_field(d,m,outfile)
  IMPLICIT NONE
  ! m is declared before d because it appears in the bounds of d
  INTEGER, INTENT(IN) :: m
  REAL, INTENT(IN) :: d(m,m,m)
  CHARACTER(len=256), INTENT(IN) :: outfile
  ! Stream access: raw bytes, no record markers
  OPEN(7,file=outfile,form='unformatted',access='stream')
  WRITE(7) d
  CLOSE(7)
END SUBROUTINE write_field
My understanding of the access='stream' option was that it would suppress the standard header and footer that come with a Fortran binary (see Fortran unformatted file format).
If I write a file with m=512, then my expectation is that the file should be 4 x 512^3 bytes = 536870912 bytes (512 MiB); however, the files are in fact 8 bytes longer than this, coming in at 536870920 bytes. My guess is that these extra bytes are the 4-byte header and footer, which I had wanted to suppress by using access='stream'.
The situation becomes confusing to me if I write a file with m=1024. My expectation is that the file should be 4 x 1024^3 bytes = 4294967296 bytes (4 GiB); however, the files are in fact 24(!) bytes longer than this, coming in at 4294967320 bytes. I do not understand why there are 24 extra bytes here, which would seem to correspond to 6(!) headers or footers.
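The size comparison above can be done from within Fortran rather than by hand. The following is a minimal sketch, assuming a compiler that supports the Fortran 2003 SIZE= specifier of INQUIRE and the Fortran 2008 STORAGE_SIZE intrinsic; the file name 'field.bin' and the program name are hypothetical placeholders, not taken from the original code.

```fortran
PROGRAM check_field_size
  USE, INTRINSIC :: iso_fortran_env, ONLY: int64
  IMPLICIT NONE
  INTEGER, PARAMETER :: m = 512
  ! Hypothetical file name, chosen for illustration only
  CHARACTER(len=*), PARAMETER :: outfile = 'field.bin'
  INTEGER(int64) :: expected, actual
  REAL :: x
  LOGICAL :: found

  ! Bytes per default REAL (usually 4) times m**3 elements.
  ! 64-bit arithmetic is needed: 4*1024**3 overflows a 32-bit integer.
  expected = INT(STORAGE_SIZE(x) / 8, int64) * INT(m, int64)**3

  ! SIZE= returns the file size in bytes (file storage units)
  INQUIRE(file=outfile, exist=found, size=actual)
  IF (.NOT. found) THEN
    PRINT *, 'file not found: ', outfile
  ELSE
    PRINT *, 'expected bytes:', expected
    PRINT *, 'actual bytes:  ', actual
    PRINT *, 'excess bytes:  ', actual - expected
  END IF
END PROGRAM check_field_size
```

A non-zero excess indicates bytes in the file that the stream write did not produce.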
My questions are:
(a) Is it possible to get Fortran to write a binary with no headers or footers?
(b) If the answer to (a) is 'no' then can I ensure that the larger binary has the same header and footer structure as the smaller binary?
(c) If the answers to (a) and (b) are both 'no' then how can I work out where these extra headers and footers are in the file?
I am using ifort (version 14.0.2) and I am writing the binary files on a small Linux cluster.
UPDATE: When the same code is compiled with gfortran 7.3.0 and run on OS X, the binary files come out with the expected sizes, that is, they are always 4 x m^3 bytes, even when m=1024. So this problem seems to be related to the older compiler.
UPDATE: In fact, the problem is only present when using ifort 14.0.2. I have updated the text to reflect this.
This problem is solved by adding status='replace' to the Fortran OPEN statement. It is not to do with the compiler.
With access='stream' and without status='replace', the old binary file is not automatically replaced by the new binary file and is simply overwritten up to a certain point (https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/676047). This results in the old binary simply having bytes replaced up to the size of the new binary, while leaving any additional bytes, and the file size, unchanged. This is a problem if the new file size is smaller than the old file size. The problem is difficult to diagnose because the time-stamp on the file is updated, so the file looks like it is new when queried using ls -l.
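Applied to the write_field subroutine from the question, the change might look like the sketch below. The only functional difference is the added status='replace'; the declaration of m is also moved ahead of d, since m appears in the bounds of d. This is one possible form, not necessarily the exact code that was finally used.

```fortran
SUBROUTINE write_field(d, m, outfile)
  IMPLICIT NONE
  INTEGER, INTENT(IN) :: m
  REAL, INTENT(IN) :: d(m,m,m)
  CHARACTER(len=256), INTENT(IN) :: outfile
  ! status='replace' deletes any existing file of the same name and
  ! creates a new one, so no bytes from an older, larger file survive
  OPEN(7, file=outfile, form='unformatted', access='stream', status='replace')
  WRITE(7) d
  CLOSE(7)
END SUBROUTINE write_field
```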
A minimal working example that recreates this problem is as follows:
PROGRAM write_binary_test_minimal
IMPLICIT NONE
REAL :: a
a=1.
OPEN(7,file='test',form='unformatted')
WRITE(7) a
CLOSE(7)
OPEN(7,file='test',form='unformatted',access='stream')
WRITE(7) a
CLOSE(7)
END PROGRAM write_binary_test_minimal
The first write generates a file 'test' of size 8 + 4 = 12 bytes, where the 8 is the standard Fortran-binary header and footer and the 4 is the size in bytes of a. In the second write statement, even though access='stream' has been set, only the first 4 bytes of the previously generated 'test' are overwritten, leaving the file at 12 bytes! The solution to this is to change the second OPEN statement to
OPEN(7,file='test',form='unformatted',access='stream',status='replace')
with an explicit status='replace' to ensure the old file is replaced.
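For completeness, here is a sketch of the full test program with the fix applied and a size check added at the end. The program name and the variable nbytes are introduced here for illustration, and the size query assumes the compiler supports the Fortran 2003 SIZE= specifier of INQUIRE.

```fortran
PROGRAM write_binary_test_fixed
  USE, INTRINSIC :: iso_fortran_env, ONLY: int64
  IMPLICIT NONE
  REAL :: a
  INTEGER(int64) :: nbytes   ! illustrative name, not in the original example

  a = 1.

  ! Sequential unformatted write: payload plus record markers
  OPEN(7, file='test', form='unformatted')
  WRITE(7) a
  CLOSE(7)

  ! Stream write; status='replace' discards the old 'test' first
  OPEN(7, file='test', form='unformatted', access='stream', status='replace')
  WRITE(7) a
  CLOSE(7)

  ! Report the final size of the file in bytes
  INQUIRE(file='test', size=nbytes)
  PRINT *, 'size of test in bytes:', nbytes
END PROGRAM write_binary_test_fixed
```

With the replace in place, the reported size should equal the size of a single default REAL (4 bytes under default compiler settings) rather than the 12 bytes left behind by the first write.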