As interpolation-data (points and values) I have a 2-D 1x1 degree lat-,lon- grid, with a scalar-variable on each gridpoint. Large areas of the grid consists of NaNs (~40%). As coordinates (xi), I have an array of "unstructured" coordinates, all of which are inside of the grids bounds, and very few coordinates (<0.1%) are located near NaNs.
Usually, when linearly interpolating, I use scipy.interpolate.RegularGridInterpolator(points , values, method='linear')
when I interpolate gridded data to unstructured coordinates. Followed by a subsequent nearest neighbor interpolation for the remaining 0.1% where linear interpolation fails near NaNs.
fc = RegularGridInterpolator(points, values, method='cubic', bounds_error=True)
interped_cubic = fc(xi)
which raises "ValueError: Array must not contain infs or nans."
I have also tried interp2d
:
fc = interp2d(lon0, lat0, data, kind='cubic', bounds_error=True)
interped_cub = fc(lon, lat)
which returns interpolated values on a grid of shape (lon.size, lat.size), instead of interpolating onto the coordinates (lon, lat). (see Reproducible example for definition of lon/lat and lon0/lat0)
I have also tried removing the NaNs before interpolating:
from scipy.interpolate import interpn
import numpy.ma as ma
mask = np.isnan(data_nan)
lonG,latG = np.meshgrid(lon0,lat0)
lonM = ma.MaskedArray(lonG, mask).compressed()
latM = ma.MaskedArray(latG, mask).compressed()
interped_cub = interpn( (latM, lonM), data_nan[~mask], (lat, lon), method='cubic' )
which reduces the grid to a 1-D array with no NaNs. The interpolation returns ValueError: There are 2 point arrays, but values has 1 dimensions
.
I expected f=RegularGridInterpolator(points, values, method='cubic');f(xi)
to return interpolated values on coordinates in non-NaN-areas, and NaN-values on coordinates near or on NaN-areas - The same as what the function returns when using method='linear'
, only with a different interpolation-method.
Since I am obviously not using the function correctly, can someone explain where my mistake is, and hopefully suggest what I can try, to complete the cubic interpolation onto the coordinates?
import numpy as np, matplotlib.pyplot as plt
from scipy.interpolate import RegularGridInterpolator, interp2d
### THE FOLLOWING CODEBLOCK RUNS WITHOUT ERROR:
lat0, lon0 = np.linspace(-60, -40, 20), np.linspace(-50, 0.0, 50)
lonG, latG = np.meshgrid(lon0, lat0)
data = np.sin(latG*lonG) # Data has no NaN- areas
lat = np.random.uniform(low=-60, high=-40, size=2000)
lon = np.random.uniform(low=-50, high=0.0, size=2000)
plt.title('2D data, 1x1 grid');plt.pcolormesh(lon0, lat0, data);plt.show()
fl = RegularGridInterpolator((lat0, lon0), data, method='linear', bounds_error=True)
interped_lin = fl((lat, lon))
plt.title('linear interpolation');plt.scatter(lon, lat,c=interped_lin);plt.show()
fc = RegularGridInterpolator((lat0, lon0), data, method='cubic', bounds_error=True)
interped_cub = fc((lat, lon))
plt.title('cubic interpolation');plt.scatter(lon, lat,c=interped_cub);plt.show()
### THE FOLLOWING CODE RAISES ERROR
# Substitute some of the data with NaNs:
data_nan = data.copy()
data_nan[1:8, 5:12] = np.nan
data_nan[2:6, 12:20] = np.nan
plt.title('2D data w/ NaNs, 1x1 grid');plt.pcolormesh(lon0, lat0, data_nan);plt.show()
fl = RegularGridInterpolator((lat0, lon0), data_nan, method='linear', bounds_error=True)
interped_lin = fl((lat, lon))
plt.title('linear interpolation');plt.scatter(lon, lat,c=interped_lin);plt.show()
fc = RegularGridInterpolator((lat0, lon0), data_nan, method='cubic', bounds_error=True)
interped_cub = fc((lat, lon))
plt.title('cubic interpolation');plt.scatter(lon, lat,c=interped_cub);plt.show()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[3], line 27
25 plt.title('linear interpolation');plt.scatter(lon, lat,c=interped_lin);plt.show()
26 fc = RegularGridInterpolator((lat0, lon0), data_nan, method='cubic', bounds_error=True)
---> 27 interped_cub = fc((lat, lon))
28 plt.title('cubic interpolation');plt.scatter(lon, lat,c=interped_cub);plt.show()
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_rgi.py:335, in RegularGridInterpolator.__call__(self, xi, method)
333 if is_method_changed:
334 self._validate_grid_dimensions(self.grid, method)
--> 335 result = self._evaluate_spline(self.values.T, xi,
336 self._SPLINE_DEGREE_MAP[method])
338 if not self.bounds_error and self.fill_value is not None:
339 result[out_of_bounds] = self.fill_value
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_rgi.py:388, in RegularGridInterpolator._evaluate_spline(self, values, xi, spline_degree)
381 # Non-stationary procedure: difficult to vectorize this part entirely
382 # into numpy-level operations. Unfortunately this requires explicit
383 # looping over each point in xi.
384
385 # can at least vectorize the first pass across all points in the
386 # last variable of xi.
387 last_dim = n - 1
--> 388 first_values = self._do_spline_fit(self.grid[last_dim],
389 values,
390 xi[:, last_dim],
391 spline_degree)
393 # the rest of the dimensions have to be on a per point-in-xi basis
394 result = np.empty(m, dtype=self.values.dtype)
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_rgi.py:414, in RegularGridInterpolator._do_spline_fit(x, y, pt, k)
412 @staticmethod
413 def _do_spline_fit(x, y, pt, k):
--> 414 local_interp = make_interp_spline(x, y, k=k, axis=0)
415 values = local_interp(pt)
416 return values
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_bsplines.py:1240, in make_interp_spline(x, y, k, t, bc_type, axis, check_finite)
1237 axis = normalize_axis_index(axis, y.ndim)
1239 x = _as_float_array(x, check_finite)
-> 1240 y = _as_float_array(y, check_finite)
1242 y = np.moveaxis(y, axis, 0) # now internally interp axis is zero
1244 # sanity check the input
File /srv/conda/envs/notebook/lib/python3.10/site-packages/scipy/interpolate/_bsplines.py:36, in _as_float_array(x, check_finite)
34 x = x.astype(dtyp, copy=False)
35 if check_finite and not np.isfinite(x).all():
---> 36 raise ValueError("Array must not contain infs or nans.")
37 return x
ValueError: Array must not contain infs or nans.
Thanks! And let me know if I can improve my example.
Scipy's RegularGridInterpolator
expects the user to provide data for every point in a regular grid, so leaving out the NaN
points will result in an error. To interpolate over the NaN
s you can use an unstructured interpolator, such as scipy.interpolate.griddata
, which allows the user to pass only some of the points (i.e. you can remove the NaN
s). See some example code below.
import numpy as np
from scipy.interpolate import griddata
nx = 100
ny = 100
x = np.linspace(0, 10, nx)
y = np.linspace(0, 10, ny)
X, Y = np.meshgrid(x, y)
rng = np.random.default_rng(42)
n_nan = 1000
non_nan_indices = rng.choice(np.arange(nx*ny), nx*ny - n_nan)
X_nan = X.flatten()[non_nan_indices]
Y_nan = Y.flatten()[non_nan_indices]
Z_nan = np.cos(X_nan)*np.sin(Y_nan)
Z_interp = griddata((X_nan, Y_nan), Z_nan, (X.flatten(), Y.flatten()), method="cubic")
Z_interp
can be reshaped to (nx, ny)
.