netcdf thredds ncml

NcML aggregation of remote THREDDS catalog


I want to aggregate all files within a specific directory of a remote THREDDS catalog. These are GRIB2 files from the NAM forecast. This is the main list of directories for each month. Here is my NcML file for aggregating the files in this catalog:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" >
    <aggregation dimName="time" type="joinExisting">
    <scan location="http://www.ncei.noaa.gov/thredds/dodsC/nam218/201807/20180723/" regExp="^.*\.grb2$" subdirs="false"/>
    <dimension name="time" orgName="t" />
    </aggregation>
</netcdf>

Also, I am mostly interested in having these two variables in the files: u-component_of_wind_height_above_ground and v-component_of_wind_height_above_ground.

I am not sure this aggregation is written correctly for a remote catalog. The NcML file above produces this error:

There are no datasets in the aggregation DatasetCollectionManager{ collectionName='http://www.ncei.noaa.gov/thredds/dodsC/nam218/201807/20180723/^.*\.grb2$' recheck=null dir=http://www.ncei.noaa.gov/thredds/dodsC/nam218/201807/20180723/ filter=^.*\.grb2$

How should this NcML file be written?

Thanks.


Solution

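  • You cannot glob remote URLs: the scan element only searches directories on the local file system, which is why the aggregation comes back empty. Instead, list each OPeNDAP endpoint explicitly in the aggregation, like: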

    <dataset name="Nam218" urlPath="nam218">
      <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinExisting">
          <netcdf location="http://www.ncei.noaa.gov/thredds/dodsC/nam218/201807/20180723/<file01>.grb2"/>
          <netcdf location="http://www.ncei.noaa.gov/thredds/dodsC/nam218/201807/20180723/<file02>.grb2"/>
          <netcdf location="http://www.ncei.noaa.gov/thredds/dodsC/nam218/201807/20180723/<file03>.grb2"/>
        </aggregation>
      </netcdf>
    </dataset>
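
Writing one netcdf element per file by hand gets tedious for a full day of forecasts, so the list can be generated. The following is a minimal sketch in Python, assuming you already have the file names (for example, copied from the catalog page). The function name `build_ncml` and the file names in the usage example are hypothetical placeholders, not real files on the server. It emits only the inner netcdf/aggregation element, without the surrounding dataset wrapper.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def build_ncml(base_url, file_names, dim_name="time"):
    """Return NcML text for a joinExisting aggregation that lists each file explicitly.

    base_url   -- OPeNDAP directory URL (assumed to be supplied by the caller)
    file_names -- file names in the order they should be joined along dim_name
    """
    # Serialize with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", NCML_NS)
    root = ET.Element("{%s}netcdf" % NCML_NS)
    agg = ET.SubElement(
        root,
        "{%s}aggregation" % NCML_NS,
        {"dimName": dim_name, "type": "joinExisting"},
    )
    for name in file_names:
        location = base_url.rstrip("/") + "/" + name
        ET.SubElement(agg, "{%s}netcdf" % NCML_NS, {"location": location})
    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    header = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return header + ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file names; replace them with the real names from the catalog.
    names = ["placeholder_000.grb2", "placeholder_003.grb2"]
    base = "http://www.ncei.noaa.gov/thredds/dodsC/nam218/201807/20180723/"
    print(build_ncml(base, names))
```

The files are joined in the order given, so pass the names sorted by forecast time. Whether the server exposes the time dimension as `time` is carried over from the question and should be checked against one of the remote datasets.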