How to write null values in a netcdf file?


Do values set to _FillValue or missing_value still occupy storage space?

Given a 2-dimensional array that contains some null values, how can I write it to a netCDF file in a way that saves storage space?


Solution

  • In netCDF3, every value takes up the same amount of disk space, so elements set to _FillValue or missing_value are stored like any other value. In netCDF4, you can reduce the file size with gzip (zlib) compression. The compression ratio depends on the data: if many values are identical (for example, missing data), the file compresses well. Here is an example in Python:

    import netCDF4
    import numpy as np
    import os
    
    # Define sample data with all elements masked out
    N = 1000
    data = np.ma.masked_all((N, N))
    
    # Write data to netCDF file using different data formats
    for fmt in ('NETCDF3_CLASSIC', 'NETCDF4'):
        fname = 'test.nc'
        ds = netCDF4.Dataset(fname, format=fmt, mode='w')
        xdim = ds.createDimension(dimname='x', size=N)
        ydim = ds.createDimension(dimname='y', size=N)
        var = ds.createVariable(
            varname='data',
            dimensions=(ydim.name, xdim.name),
            fill_value=-999,
            datatype='f4',
            complevel=9,  # set gzip compression level (1 = fastest, 9 = smallest)
            zlib=True  # enable compression (silently ignored for netCDF3 formats)
        )
        var[:] = data
        ds.close()
    
        # Determine file size
        print(fmt, os.stat(fname).st_size)
    

    See the netCDF4-python documentation, section 9, "Efficient compression of netCDF variables", for details.
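
As a rough illustration of why the compression ratio depends on the data, the sketch below applies plain zlib (the deflate algorithm behind netCDF4's gzip option) to raw float32 buffers with a varying share of fill values. It uses only the Python standard library. The names `compressed_ratio`, `N`, and `FILL` and the random test data are hypothetical and not part of the original answer, and real netCDF4 files are compressed chunk by chunk through HDF5, so actual file sizes will differ.

```python
import random
import zlib
from array import array

FILL = -999.0  # same fill value as in the netCDF example (assumption)
N = 250_000    # number of float32 values in the test buffer (hypothetical)


def compressed_ratio(missing_fraction, seed=0):
    """Return compressed size / raw size for float32 data in which
    roughly `missing_fraction` of the values equal FILL."""
    rng = random.Random(seed)
    values = array('f', (
        FILL if rng.random() < missing_fraction else rng.random()
        for _ in range(N)
    ))
    raw = values.tobytes()
    # Level 9 corresponds to complevel=9 in createVariable
    return len(zlib.compress(raw, 9)) / len(raw)


for frac in (0.0, 0.5, 0.9, 1.0):
    ratio = compressed_ratio(frac)
    print(f"{frac:.0%} missing: compressed to {ratio:.1%} of raw size")
```

The more fill values the buffer contains, the smaller the compressed result. Missing values that sit in contiguous blocks usually compress better than the randomly scattered ones used here.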