I am applying slicing and aggregation operations to NetCDF files in Python. One way to work with this kind of file is the xarray library.
I am still new to the library, so I would like to know whether xarray objects have a way to check if a sliced Dataset/DataArray is empty, as pandas does (in pandas, one can check whether a DataFrame/Series is empty through the 'empty' attribute).
The only solution I found was to convert the xarray Dataset/DataArray into a pandas DataFrame/Series and then check whether that is empty.
Here is a code snippet as an example:
import xarray as xr
path = 'my_path_to_my_netcdf_file.nc'
Xarray_DataArray = xr.open_dataset(path)
print(Xarray_DataArray)
# this returns something like:
# Dimensions: (lat: 600, lon: 672, time: 37)
# Coordinates:
# * lat (lat) float32 -3.9791672 -3.9375012 ... 20.9375 20.979166
# * lon (lon) float32 -60.979168 -60.9375 ... -33.0625 -33.020832
# * time (time) datetime64[ns] 2010-05-19 2010-05-20 ... 2010-06-24
# Data variables:
# variable_name (time, lat, lon) float32 dask.array<shape=(37, 600, 672),
# chunksize=(37, 600, 672)>
# I normally use the 'sel' method to slice the xarray object, like below:
Sliced_Xarray_DataArray = Xarray_DataArray.sel({'lat': slice(-10, -9),
                                                'lon': slice(-170, -169)})
# but since xarray does not possess a proper way to check the slice, I usually have to do the following:
if Sliced_Xarray_DataArray.to_dataframe().empty:  # 'empty' is a property, so it is not called
    print('is empty. Nothing to aggregate')
else:
    # Aggregation_function stands for whatever aggregation is applied (not defined here)
    Aggregated_value = Aggregation_function(Sliced_Xarray_DataArray)
    print('continuing with the analysis')
    # ... continue
I would appreciate any suggestions.
Thank you for your time, and I hope to hear from you soon.
Sincerely yours,
Philipe R. Leal
The accepted answer does not work if the Dataset does not have any dimensions, i.e., if the dataset is truly empty.
A better solution is this:
import numpy as np

def dataset_is_empty(input_dataset):
    """Test if an input xarray.Dataset is empty."""
    n_dims = len(input_dataset.dims)
    if n_dims == 0:
        empty = True
    else:
        dim_lengths = np.zeros(n_dims)
        for cnt, dim in enumerate(input_dataset.dims):
            dim_lengths[cnt] = len(input_dataset[dim])
        if (dim_lengths == 0).all():
            empty = True
        else:
            empty = False
    return empty
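
For comparison, here is a minimal sketch of a more compact check that avoids the pandas conversion. It is not the function above: the helper name `has_no_data` is hypothetical, the small in-memory dataset only stands in for the NetCDF file from the question, and the criterion is deliberately stricter, treating a Dataset as empty when it has no dimensions or when any dimension has length zero (the function above requires all of them to be zero). Which criterion is right probably depends on the use case.

```python
import numpy as np
import xarray as xr


def has_no_data(ds):
    """Return True if `ds` has no dimensions or any dimension has length zero.

    Hypothetical helper for illustration; relies on `Dataset.sizes`,
    a mapping from dimension name to length.
    """
    sizes = ds.sizes
    return len(sizes) == 0 or any(n == 0 for n in sizes.values())


# Small synthetic dataset standing in for the file opened in the question.
ds = xr.Dataset(
    {"variable_name": (("lat", "lon"), np.arange(12.0).reshape(3, 4))},
    coords={"lat": [0.0, 1.0, 2.0], "lon": [10.0, 11.0, 12.0, 13.0]},
)

# One slice that overlaps the coordinates and one that falls outside them.
inside = ds.sel(lat=slice(0, 1), lon=slice(10, 11))
outside = ds.sel(lat=slice(-10, -9), lon=slice(-170, -169))

print(has_no_data(xr.Dataset()))       # dataset with no dimensions at all
print(has_no_data(inside))             # slice that still contains values
print(has_no_data(outside))            # slice outside the coordinate range
print(outside["variable_name"].size)   # element count of a single DataArray
```

For a single DataArray, comparing its `size` attribute to zero should be enough, since `size` is the total number of elements.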