Tags: python, fortran, runtime-error, binaryfiles, allocatable-array

How to solve 'Fortran runtime error: I/O past end of record on unformatted file'?


I have a 1024 * 1024 * 1024 array whose dtype is float32. First I save this array to a file in the '.bigfile' format. Then I convert this bigfile to a Fortran unformatted file by running the code below.

import bigfile
import numpy as np

with bigfile.File('filename.bigfile') as bf:
    shape = bf['Field'].attrs['ndarray.shape']
    data = bf['Field'][:].reshape(shape)
    # tofile writes the raw values only, with no record markers
    np.asfortranarray(data).tofile('filename.dat')

Next, to test this binary file ('filename.dat'), I read it with Python and with Fortran 95. The Python code runs fine; the snippet is shown below.

import numpy as np

field = np.fromfile('filename.dat',
                    dtype='float32', count=1024*1024*1024)
density_field = field.reshape(1024, 1024, 1024)

However, a Fortran runtime error occurs when I run the Fortran reading code:

 Program readout00
  Implicit None
  Integer, Parameter :: Ng = 1024
  Real, Allocatable, Dimension(:,:,:) :: dens
  Integer :: istat, ix, iy, iz
  ! -------------------------------------------------------------------------
  ! Allocate the arrays for the original simulation data
  ! -------------------------------------------------------------------------
  Allocate(dens(0:Ng-1, 0:Ng-1, 0:Ng-1), STAT=istat)
  If( istat/=0 ) Stop "Wrong Allocation-1"
  ! -------------------------------------------------------------------------
  Open(10, file="filename.dat", status="old", form="unformatted")
  Read(10) dens
  Close(10)
  Write(*,*) "read-in finished"
  ! -------------------------------------------------------------------------
  Do ix = 0, 1
    Do iy = 0, 1
      Do iz = 0, 1
        Write(*,*) "ix, iy, iz, rho=", ix, iy, iz, dens(ix, iy, iz)
      EndDo
    EndDo
  EndDo
  !--------------------------------------------------------------------------
End Program readout00

The error message:

At line 13 of file readout00.f90 (unit = 10, file = 'filename.dat')
Fortran runtime error: I/O past end of record on unformatted file



Error termination. Backtrace:
#0  0x7f7d8aff8e3a
#1  0x7f7d8aff9985
#2  0x7f7d8affa13c
#3  0x7f7d8b0c96e0
#4  0x7f7d8b0c59a6
#5  0x400d24
#6  0x400fe1
#7  0x7f7d8a4db730
#8  0x400a58
#9  0xffffffffffffffff

I don't understand why those errors appear.

Note: the whole procedure is run on a remote Linux server.



After repeatedly modifying the read statement, I found that the Fortran code runs fine if ix<=632, iy<=632, iz<=632. If they are greater than 632, the runtime error appears. How can I correct this so that dens reads all 1024^3 elements?

Read(10) (((dens(ix, iy, iz), ix=0,632), iy=0,632), iz=0,632)


Supplementary:

Today I added the specifier access='stream' to the open statement, and a Read(10) header before the Read(10) dens, i.e.

Integer :: header
......
Open(10, file="filename.dat", status="old",    &
         form="unformatted", access='stream')
Read(10) header
Read(10) dens

After this modification, the Fortran code 'readout00.f95' reads in the whole 1024 * 1024 * 1024 array dens successfully.

Why does the original 'readout00.f95' fail to read in dens?


Solution

  • @IanH has correctly answered your question in the comments, or more precisely pointed to the correct answer in a different question.

    The 'unformatted' form just means that the file is not stored as human-readable text. With the default sequential access, the data in the file still needs to be laid out in a specific way. The exact layout is compiler- and system-dependent, but usually each record has its own header and footer that store the length of the record's data.

    numpy.asfortranarray does not affect the file layout at all. It only ensures that the layout of the array in memory is the same as in Fortran (column-major, or first index changing most quickly), as opposed to the usual layout (row-major, or last index changing most quickly). ndarray.tofile then writes the elements in C order regardless of the memory layout, and adds no record markers.

    See this example:

    I created the same data (type int16, values 0 through 11) in Python and in Fortran and stored it in two files: the Python version with np.asfortranarray(...).tofile and the Fortran version with an unformatted write. These are the results:

    With Python:

    0000000 00 00 01 00 02 00 03 00 04 00 05 00 06 00 07 00
    0000010 08 00 09 00 0a 00 0b 00
    

    With Fortran:

    0000000 18 00 00 00 00 00 01 00 02 00 03 00 04 00 05 00
    0000010 06 00 07 00 08 00 09 00 0a 00 0b 00 18 00 00 00
    

    In the Python file, the data starts immediately (00 00 for 0, then 01 00 for 1, and so forth, until 0b 00 for 11). In the Fortran file, there is a 4-byte header first: 18 00 00 00 (hexadecimal 18, i.e. 24), which is the number of bytes of data, and this value is repeated at the end as a footer.
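The two dumps can be reproduced byte for byte without either tool. The following is a minimal sketch that assumes 4-byte little-endian record markers, which matches the dump above but is compiler-dependent; the variable names are only illustrative.

```python
import struct

values = list(range(12))  # same int16 test data as in the dumps above

# Raw layout, as in the Python dump: only the values, little-endian int16.
raw = struct.pack("<12h", *values)

# Record layout, as in the Fortran dump (assumption: 4-byte little-endian
# markers): byte count, then the payload, then the byte count again.
marker = struct.pack("<i", len(raw))
record = marker + raw + marker

print(raw.hex(" "))     # starts directly with 00 00 01 00 ...
print(record.hex(" "))  # starts and ends with 18 00 00 00
```

The only difference between the two byte strings is the 8 bytes of framing, which is exactly what the sequential reader looks for.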

    When you try to read a file with Fortran using form='unformatted' and the default sequential access, that is the kind of data that the program expects to find, but that's not the data you have.
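To see why a raw file trips up a record-based reader, here is a rough Python imitation of what such a reader does. It is a sketch under the same assumption of 4-byte little-endian markers, not a description of how any particular Fortran runtime is implemented, and `read_record` is a hypothetical helper.

```python
import io
import struct

def read_record(stream):
    """Read one record framed by 4-byte little-endian length markers."""
    head = stream.read(4)
    if len(head) < 4:
        raise EOFError("no record marker found")
    (length,) = struct.unpack("<i", head)
    if length < 0:
        raise ValueError(f"bad record: leading marker claims {length} bytes")
    payload = stream.read(length)
    tail = stream.read(4)
    # The payload must be complete and the trailing marker must match.
    if len(payload) != length or tail != head:
        raise ValueError(f"bad record: leading marker claims {length} bytes")
    return payload

raw = struct.pack("<12h", *range(12))
framed = struct.pack("<i", len(raw)) + raw + struct.pack("<i", len(raw))

# A properly framed file reads cleanly.
print(len(read_record(io.BytesIO(framed))))

# A raw file does not: its first four data bytes are taken as a record length.
try:
    read_record(io.BytesIO(raw))
except ValueError as err:
    print("raw file rejected:", err)
```

With the raw data, the first two int16 values (0 and 1) are misread as a record length, so the record the reader expects does not match what is in the file.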

    The solution is exactly what you have done: use stream access. With access='stream', the program expects the data to come in continuously, without any record headers or other metadata.
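A quick way to tell which layout a file has is to compare its size with the size of the payload. The sketch below assumes float32 data and 4-byte record markers; `describe_layout` is a hypothetical helper, and the check is only meant for small test files, since very large records may be framed differently by the compiler.

```python
import os
import struct
import tempfile

def describe_layout(path, n_values, itemsize=4, marker_size=4):
    """Guess the layout of a binary file from its size alone (hypothetical helper)."""
    size = os.path.getsize(path)
    payload = n_values * itemsize
    if size == payload:
        return "raw"         # no framing: read it with access='stream'
    if size == payload + 2 * marker_size:
        return "one record"  # framed: read it with sequential access
    return "unknown"

# Small demonstration with 12 float32 values instead of 1024**3.
payload = struct.pack("<12f", *range(12))
marker = struct.pack("<i", len(payload))

with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "raw.dat")
    rec_path = os.path.join(tmp, "record.dat")
    with open(raw_path, "wb") as f:
        f.write(payload)
    with open(rec_path, "wb") as f:
        f.write(marker + payload + marker)
    raw_kind = describe_layout(raw_path, 12)
    rec_kind = describe_layout(rec_path, 12)

print(raw_kind, rec_kind)
```

For the file in the question, a raw layout would mean a size of exactly 4 * 1024**3 = 4294967296 bytes. If the file is raw, a stream read can start at the first byte; a separate header read would consume the first data value instead.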