netcdf, netcdf4, cdo-climate, nco, era5

Troubleshooting nccopy: getting stuck at the end of rechunking


I am working with ERA5 atmospheric data to calculate wind speeds anywhere on the globe, at maximum spatial and temporal resolution. This results in an uncompressed 70 GB file for one year's worth of data, and since I want to analyze 5 years in total, it quickly scales up.

In my application, I want to retrieve the entire time history for a single spatial coordinate, which for one year is a single [1x8760] array (out of a total array of [721x1440x8760]). However, reading this data with xarray takes approximately 12 s. That is too long, given that I want to build an API in which the user inputs the coordinates directly, and I still have to include 5 more years.
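For scale, the raw numbers are consistent with that file size. A quick back-of-the-envelope check, assuming two float32 variables (u and v, as in the grib-derived header below):

```shell
# Total uncompressed size: 721 lat x 1440 lon x 8760 hours x 4 bytes x 2 variables.
total_bytes=$((721 * 1440 * 8760 * 4 * 2))
echo "total: $((total_bytes / 1000000000)) GB"   # prints "total: 72 GB", matching the ~70 GB file
# With (time, lat, lon) ordering and no time-oriented chunking, the 8760 values
# for one grid point are scattered across the whole file, hence the slow reads.
```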

This is why I have tried rechunking with nccopy, but it is not working well. I first did a test with a month's worth of data obtained directly in .nc format from the ERA5 site: the rechunking was about as fast as copying the file on disk, and the resulting file read back roughly 10x faster. However, when I tried the same with an .nc file that had previously been converted from a .grib file (using cdo's copy command), the rechunking was equally fast... until it hung once the output reached the same size as the input. From there it slowly grows a little further and then simply seems to freeze. The terminal provides no information whatsoever: the process just never finishes.

The command that I ran is the following:

nccopy -k 'enhanced' -c 'time/8760,lat/7,lon/14' 2023Data.nc 2023ch.nc
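For reference, each chunk of that requested layout is small enough to read in one go. A quick check, assuming 4-byte floats:

```shell
# Size of one time/8760 x lat/7 x lon/14 chunk for a single float32 variable.
chunk_bytes=$((8760 * 7 * 14 * 4))
echo "chunk: $chunk_bytes bytes"   # prints "chunk: 3433920 bytes" (~3.4 MB)
# A single-point time series then touches exactly one chunk instead of the whole file.
```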

I have tried it with both monthly and yearly data, and in both cases it gets stuck. The only time it did not get stuck was when I tested two days' worth of data; in that case, the program took a little while after the file reached its final size to close the subprocess. I wondered whether the same was true for the other two, so I left one running overnight, but the memory usage stayed very low and when I came back to check on it, nothing had changed.

Similarly, I have also tried ncrcat, but it builds the file at around 200 kB/s, which does not scale. Could it have to do with my computer's RAM capacity? It has only 8 GB. Still, long run times are not really a problem, as long as I know the result will be a faster-to-read file.
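If RAM is a factor, the chunk geometry gives a rough sense of the working set. A sketch, assuming the same 4-byte floats as above:

```shell
# Chunks per variable under the requested time/8760, lat/7, lon/14 layout.
chunks_lat=$(( (721 + 6) / 7 ))        # ceil(721/7)   = 103
chunks_lon=$(( (1440 + 13) / 14 ))     # ceil(1440/14) = 103
echo "chunks per variable: $((chunks_lat * chunks_lon))"           # prints 10609
# How many such chunks fit in the 3 GB --cnk_csh chunk cache.
echo "chunks held in cache: $((3000000000 / (8760 * 7 * 14 * 4)))" # prints 873
# Writing time-complete chunks means every output chunk needs data from every
# input time step, so a too-small cache forces constant re-reading of the input.
```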

In case you are curious, my ncrcat command is the following:

ncrcat -4 -L 1 --cnk_csh=3000000000 --cnk_plc=g3d --cnk_dmn=time,8760 --cnk_dmn=lat,7 --cnk_dmn=lon,14 2023Data.nc 2023Data_chunk.nc

Thanks in advance!

EDIT:

I have performed an ncdump on the month of data that succeeded and the one that did not. The things that jump out at me most are the UNLIMITED attribute on time in the unsuccessful version, and the scale/offset packing attributes in the successful one. Could either of these have anything to do with it? Here are their headers:

SUCCESSFUL VERSION:

netcdf March23 {
dimensions:
    longitude = 1440 ;
    latitude = 721 ;
    time = 744 ;
variables:
    float longitude(longitude) ;
        longitude:units = "degrees_east" ;
        longitude:long_name = "longitude" ;
    float latitude(latitude) ;
        latitude:units = "degrees_north" ;
        latitude:long_name = "latitude" ;
    int time(time) ;
        time:units = "hours since 1900-01-01 00:00:00.0" ;
        time:long_name = "time" ;
        time:calendar = "gregorian" ;
    short u(time, latitude, longitude) ;
        u:scale_factor = 0.000973234033583867 ;
        u:add_offset = 0.723214860033989 ;
        u:_FillValue = -32767s ;
        u:missing_value = -32767s ;
        u:units = "m s**-1" ;
        u:long_name = "U component of wind" ;
        u:standard_name = "eastward_wind" ;
    short v(time, latitude, longitude) ;
        v:scale_factor = 0.000975952222946717 ;
        v:add_offset = -2.73403137210757 ;
        v:_FillValue = -32767s ;
        v:missing_value = -32767s ;
        v:units = "m s**-1" ;
        v:long_name = "V component of wind" ;
        v:standard_name = "northward_wind" ;

// global attributes:
        :Conventions = "CF-1.6" ;
        :history = "2024-04-16 08:19:28 GMT by grib_to_netcdf-2.25.1: /opt/ecmwf/mars-client/bin/grib_to_netcdf.bin -S param -o /cache/data4/adaptor.mars.internal-1713255545.5304747-14711-19-427e9bce-89c5-41b9-84eb-7977830332cc.nc /cache/tmp/427e9bce-89c5-41b9-84eb-7977830332cc-adaptor.mars.internal-1713255478.0202947-14711-24-tmp.grib" ;
        :_Format = "64-bit offset" ;

UNSUCCESSFUL VERSION:

netcdf Marchgrib {
dimensions:
    time = UNLIMITED ; // (744 currently)
    lon = 1440 ;
    lat = 721 ;
    plev = 1 ;
variables:
    double time(time) ;
        time:standard_name = "time" ;
        time:units = "hours since 2023-3-1 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
        time:axis = "T" ;
    float lon(lon) ;
        lon:standard_name = "longitude" ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
        lon:axis = "X" ;
    float lat(lat) ;
        lat:standard_name = "latitude" ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
        lat:axis = "Y" ;
    double plev(plev) ;
        plev:standard_name = "air_pressure" ;
        plev:positive = "down" ;
        plev:axis = "Z" ;
    float u(time, plev, lat, lon) ;
        u:standard_name = "eastward_wind" ;
        u:long_name = "U component of wind" ;
        u:units = "m s**-1" ;
        u:code = 131 ;
        u:table = 128 ;
    float v(time, plev, lat, lon) ;
        v:standard_name = "northward_wind" ;
        v:long_name = "V component of wind" ;
        v:units = "m s**-1" ;
        v:code = 132 ;
        v:table = 128 ;

// global attributes:
        :CDI = "Climate Data Interface version ?? (http://mpimet.mpg.de/cdi)" ;
        :Conventions = "CF-1.6" ;
        :history = "Thu Apr 18 16:50:17 2024: cdo -f nc copy March23.grib Marchgrib.nc" ;
        :institution = "European Centre for Medium-Range Weather Forecasts" ;
        :CDO = "Climate Data Operators version 1.9.3 (http://mpimet.mpg.de/cdo)" ;
        :_Format = "64-bit offset" ;
}
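If the UNLIMITED time dimension is the culprit, one thing worth trying (a sketch, not tested on this data) is converting the record dimension to a fixed size first with nccopy's -u flag, and only then rechunking:

```shell
# -u converts UNLIMITED dimensions to fixed-size ones; filenames are from the question.
nccopy -u Marchgrib.nc Marchgrib_fixed.nc
# Rechunk the fixed-dimension copy (744 hourly steps in the March test file).
nccopy -k 'enhanced' -c 'time/744,lat/7,lon/14' Marchgrib_fixed.nc Marchgrib_chunked.nc
```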


Solution

  • I was able to fix the issue by changing the method of converting GRIB to netCDF-4 format. Previously, I was using cdo and its copy command. This not only doubled the file size, but also made rechunking impossible, for reasons I do not know.

    Instead, I used the grib_to_netcdf tool provided by ecCodes. This fixed the issue altogether: nccopy can rechunk the resulting file for my purposes, which yielded a 40x read-speed improvement.
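    For anyone following the same route, the working pipeline looks roughly like this (filenames assumed; grib_to_netcdf ships with ecCodes):

    ```shell
    # Convert GRIB directly with ecCodes instead of cdo's copy command.
    grib_to_netcdf -o 2023Data.nc 2023Data.grib
    # Then rechunk for fast single-point time-series reads.
    nccopy -k 'enhanced' -c 'time/8760,lat/7,lon/14' 2023Data.nc 2023ch.nc
    ```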