When selecting data from an xarray.Dataset, the examples provided all hardcode the name of the dimension, like so:
ds = ds.sel(state_name='California')
TL;DR: How can you select from a dataset without hardcoding the dimension name? How would I achieve something like the following, which doesn't work?
dimName = 'state_name'
ds = ds.sel(dimName='California')
I have a situation where I won't know the name of the dimension to select on until runtime, but I can't figure out how to select the data with xarray's methods unless I know the dimension name ahead of time. For instance, let's say I have a dataset like this, where dim2, dim3, and dim4 all correspond to ID numbers of different spatial bounds that a user could select on a map:
import xarray as xr
import numpy as np
dim2 = ['12', '34', '56', '78']
dim3 = ['121', '341', '561', '781']
dim4 = ['1211', '3411', '5611', '7811']
time_mn = np.arange(1, 61)
ds1 = xr.Dataset(
    data_vars={
        'prcp_dim2': (['dim2', 'time_mn'], np.random.rand(len(dim2), len(time_mn))),
        'prcp_dim3': (['dim3', 'time_mn'], np.random.rand(len(dim3), len(time_mn))),
        'prcp_dim4': (['dim4', 'time_mn'], np.random.rand(len(dim4), len(time_mn))),
    },
    coords={
        'dim2': (['dim2'], dim2),
        'dim3': (['dim3'], dim3),
        'dim4': (['dim4'], dim4),
        'time_mn': (['time_mn'], time_mn)
    }
)
print(ds1)
<xarray.Dataset> Size: 6kB
Dimensions: (dim2: 4, time_mn: 60, dim3: 4, dim4: 4)
Coordinates:
* dim2 (dim2) <U2 32B '12' '34' '56' '78'
* dim3 (dim3) <U3 48B '121' '341' '561' '781'
* dim4 (dim4) <U4 64B '1211' '3411' '5611' '7811'
* time_mn (time_mn) int64 480B 1 2 3 4 5 6 7 8 ... 53 54 55 56 57 58 59 60
Data variables:
prcp_dim2 (dim2, time_mn) float64 2kB 0.8804 0.2733 ... 0.3227 0.4637
prcp_dim3 (dim3, time_mn) float64 2kB 0.1391 0.4541 ... 0.1688 0.3271
prcp_dim4 (dim4, time_mn) float64 2kB 0.4784 0.6666 ... 0.3619 0.4864
Now let's say a map is presented to a user and the user chooses ID 78 to calculate something from the dataset. From this ID, I can glean that the dimension 78 belongs to is dim2. How would I then make a selection on the xarray dataset where dim2=78 without hardcoding dim2?
selectedID = request.get('id')  # This is the user's choice, let's say they chose '78'.
# Get the dimension name the selectedID belongs to
if len(selectedID) == 2:
    selectedDimension = 'dim2'
elif len(selectedID) == 3:
    selectedDimension = 'dim3'
elif len(selectedID) == 4:
    selectedDimension = 'dim4'
# This is what I want to be able to do, but it does not work
ds = ds.sel(selectedDimension=selectedID)
Is there a way to select the data without hardcoding the dimension name?
Edit: I do realize there is a solution like the one below, but it falls apart if, say, I wanted to put the if/else above in a callable function, because I could be reusing it elsewhere and don't necessarily want to select the data when I call the function.
if len(selectedID) == 2:
    ds = ds.sel(dim2=selectedID)
elif len(selectedID) == 3:
    ds = ds.sel(dim3=selectedID)
elif len(selectedID) == 4:
    ds = ds.sel(dim4=selectedID)
This is a nice place to use Python dictionary unpacking.
To get this:
res = ds.sel(state_name='California')
You can:
dim_sel = {'state_name': 'California'}
res = ds.sel(**dim_sel)
And of course directly:
res = ds.sel(**{'state_name': 'California'})
Unpacking the dictionary with ** spreads the keys as argument names and the values as the argument values. This works anywhere in Python where you need to pass named arguments; it's not specific to xarray.
Since you can construct dictionaries on the fly with strings as keys, you are no longer stuck with using identifiers as parameter names.
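To make that concrete, here is a minimal sketch against a cut-down stand-in for the dataset in the question (the zero-filled values and the names dim_name and selected_id are just for illustration). It also shows that sel() accepts the same mapping as its first positional argument, which avoids the unpacking altogether:

```python
import numpy as np
import xarray as xr

# Small stand-in for the question's dataset; the values are arbitrary.
ds = xr.Dataset(
    data_vars={"prcp_dim2": (["dim2", "time_mn"], np.zeros((4, 3)))},
    coords={"dim2": ["12", "34", "56", "78"], "time_mn": [1, 2, 3]},
)

dim_name = "dim2"     # only known at runtime
selected_id = "78"

# Keyword form: the dictionary key becomes the keyword argument name.
by_kwargs = ds.sel(**{dim_name: selected_id})

# Positional form: sel() also takes the mapping as its first argument.
by_mapping = ds.sel({dim_name: selected_id})

# Both forms should give the same selection.
print(by_kwargs.identical(by_mapping))
```

Either form should do; the positional mapping is handy when a dimension name isn't a valid Python identifier or clashes with another parameter of sel(), such as method.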
Your example where you select a dimension based on the length of some value would work out to:
dim_lookup = {
    2: 'dim2',
    3: 'dim3',
    4: 'dim4'
}
res = ds.sel(**{dim_lookup[len(selectedID)]: selectedID})
Note that this assumes there will be a key for every possible length of selectedID, but I'm sure you can see how to make this more robust. Also note that I assign to res instead of ds because I'm not sure you actually want to overwrite the original xarray reference with your selection.
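One way to make it more robust, and to address the edit in the question about reusing the lookup without selecting anything, is to have a function return the indexer dictionary instead of the selection. This is only a sketch: indexer_for and DIM_LOOKUP are hypothetical names, and raising ValueError for an unknown length is my assumption about what you'd want to happen.

```python
# Hypothetical mapping from ID length to dimension name, as in the lookup above.
DIM_LOOKUP = {2: "dim2", 3: "dim3", 4: "dim4"}


def indexer_for(selected_id):
    """Return a {dimension_name: id} mapping without selecting anything."""
    try:
        dim_name = DIM_LOOKUP[len(selected_id)]
    except KeyError:
        # No dimension is registered for IDs of this length.
        raise ValueError(f"No dimension known for ID {selected_id!r}") from None
    return {dim_name: selected_id}


indexer = indexer_for("78")
print(indexer)  # {'dim2': '78'}

# Later, wherever a selection is actually needed:
# res = ds.sel(**indexer)
```

The function never touches the dataset, so you can call it anywhere and decide separately whether and when to pass the result to sel().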