python-xarray, python-s3fs

xarray I/O operation on closed file


I am opening and using a netCDF file that is located on S3. I have the following code; however, it raises an exception.

import s3fs
import xarray as xr

filepath = "s3://mybucket/myfile.nc"
fs = s3fs.S3FileSystem()

with fs.open(filepath) as infile:
    print("opening")
    ds = xr.open_dataset(infile, engine="h5netcdf")
    print(ds)
    print("done")

with fs.open(filepath) as infile:
    print("opening")
    ds = xr.open_dataset(infile, engine="h5netcdf")
    print(ds)
    print("done")

On the second ds = xr.open_dataset(infile, engine="h5netcdf") I get an exception: "I/O operation on closed file."

Why?

I found that putting a ds.close() between the two sections fixes it. So that implies that even though infile was closed when the with block ended, ds still had it locked for exclusive use?
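
For reference, this is the variant that works, with the explicit ds.close() between the two sections:

with fs.open(filepath) as infile:
    ds = xr.open_dataset(infile, engine="h5netcdf")
    print(ds)

ds.close()  # closing the dataset seems to release its hold on infile

with fs.open(filepath) as infile:
    ds = xr.open_dataset(infile, engine="h5netcdf")
    print(ds)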

However, in between the two blocks I also tried print(ds["variable_name"].values) and got the same "I/O operation on closed file" exception. That part isn't surprising, since the file is closed and the data is lazily loaded, but it again raises the question of why the second open_dataset call fails.
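
To illustrate, with variable_name standing in for one of the dataset's variables, the same access works while the file is still open and only fails afterwards:

with fs.open(filepath) as infile:
    ds = xr.open_dataset(infile, engine="h5netcdf")
    print(ds["variable_name"].values)  # works: infile is still open here

print(ds["variable_name"].values)  # raises "I/O operation on closed file"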


Solution

  • The netCDF library places a lock on file objects it opens for reading, and this can include file objects returned by s3fs. Opening the netCDF datasets in a context manager or, as you point out, explicitly closing the dataset, will resolve the issue:

    import s3fs
    import xarray as xr

    filepath = "s3://mybucket/myfile.nc"
    fs = s3fs.S3FileSystem()
    
    with fs.open(filepath) as infile:
        print("opening")
        with xr.open_dataset(infile, engine="h5netcdf") as ds:
            print(ds)
            print("done")
    
    with fs.open(filepath) as infile:
        print("opening")
        with xr.open_dataset(infile, engine="h5netcdf") as ds:
            print(ds)
            print("done")
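
  • If the data needs to remain usable after the underlying file is closed, one option (a minimal sketch, assuming the dataset fits in memory; variable_name is the placeholder from the question) is to load it eagerly inside the context:

    with fs.open(filepath) as infile:
        with xr.open_dataset(infile, engine="h5netcdf") as ds:
            ds.load()  # pull all variables into memory while the file is still open

    # the dataset's variables are now plain in-memory arrays, so this is safe
    print(ds["variable_name"].values)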