
Multidimensional coordinate transform with xarray


How can I convert multidimensional (2D) coordinates into standard one-dimensional coordinates, so that data can be unified on a common grid, when using xarray with netCDF data? For example:

import xarray as xr

da = xr.DataArray(
    [[0, 1], [2, 3]],
    coords={
        "lon": (["ny", "nx"], [[30, 40], [40, 50]]),
        "lat": (["ny", "nx"], [[10, 10], [20, 20]]),
    },
    dims=["ny", "nx"],
)

Expected conversion result:

import numpy as np

xr.DataArray(
    [[0, 1, np.nan],
     [np.nan, 2, 3]],
    coords={
        "lat": [10, 20],
        "lon": [30, 40, 50],
    },
    dims=["lat", "lon"],  # required: dims cannot be inferred from a coords dict
)

Solution

  • You can flatten the data into a list of points with xarray.DataArray.stack, extract the unique latitude and longitude values, and then assign each point's value to its position on a regular grid built from those unique values. Grid cells with no matching point stay NaN.

    import xarray as xr
    import numpy as np
    
    da = xr.DataArray(
        [[0, 1], [2, 3]],
        coords={
            "lon": (["ny", "nx"], [[30, 40], [40, 50]]),
            "lat": (["ny", "nx"], [[10, 10], [20, 20]]),
        },
        dims=["ny", "nx"],
    )
    
    # Flatten
    flat = da.stack(z=("ny", "nx"))
    
    # Extract unique
    lat_vals = np.unique(da.lat.values)
    lon_vals = np.unique(da.lon.values)
    
    new_da = xr.DataArray(
        np.full((len(lat_vals), len(lon_vals)), np.nan),
        coords={"lat": lat_vals, "lon": lon_vals},
        dims=["lat", "lon"]
    )
    
    # Reassign values onto regular grid
    for i in range(flat.size):
        lat_i = float(flat.lat.values[i])
        lon_i = float(flat.lon.values[i])
        val = flat.values[i]
        new_da.loc[dict(lat=lat_i, lon=lon_i)] = val
    
    print(new_da)
    

    Output:

    <xarray.DataArray (lat: 2, lon: 3)> Size: 48B
    array([[ 0.,  1., nan],
           [nan,  2.,  3.]])
    Coordinates:
      * lat      (lat) int64 16B 10 20
      * lon      (lon) int64 24B 30 40 50
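
For larger arrays the per-point Python loop above can become slow. A possible vectorized variant, offered only as a sketch and not as part of the original answer, uses the return_inverse option of numpy.unique to get each point's index into the unique coordinate values and fills the grid in a single assignment. It assumes every (lat, lon) pair occurs at most once; if a pair repeats, only one of the values survives. The name grid is introduced here for illustration.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    [[0, 1], [2, 3]],
    coords={
        "lon": (["ny", "nx"], [[30, 40], [40, 50]]),
        "lat": (["ny", "nx"], [[10, 10], [20, 20]]),
    },
    dims=["ny", "nx"],
)

# Unique coordinate values plus, for every point, its index into them.
lat_vals, lat_idx = np.unique(da.lat.values, return_inverse=True)
lon_vals, lon_idx = np.unique(da.lon.values, return_inverse=True)

# Start from an all-NaN grid and fill every known point in one assignment.
# ravel() keeps this working whether the inverse comes back flat or 2D.
grid = np.full((lat_vals.size, lon_vals.size), np.nan)
grid[lat_idx.ravel(), lon_idx.ravel()] = da.values.ravel()

new_da = xr.DataArray(
    grid,
    coords={"lat": lat_vals, "lon": lon_vals},
    dims=["lat", "lon"],
)
print(new_da)
```

For the example data this should produce the same 2 x 3 grid as the loop version, with NaN in the two cells that have no source point.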