Short version:
Is it possible to create a new scipy.spatial.Delaunay object with a subset of the triangles (2D data) from an existing object?
The goal would be to use the find_simplex method on the new object with filtered out simplices.
Similar but not quite the same
matplotlib contour/contourf of **concave** non-gridded data
Long version:
I am looking at lat-lon data that I regrid with scipy.interpolate.griddata like in the pseudo-code below:
import numpy as np
from scipy.interpolate import griddata
from scipy.spatial import Delaunay
from scipy.interpolate.interpnd import _ndim_coords_from_arrays
#lat shape (a,b): 2D array of latitude values
#lon shape (a,b): 2D array of longitude values
#x shape (a,b): 2D array of variable of interest at lat and lon
# lat-lon data
nonan = ~np.isnan(lat)
flat_lat = lat[nonan]
flat_lon = lon[nonan]
flat_x = x[nonan]
# regular lat-lon grid for regridding
lon_ar = np.arange(loni,lonf,resolution)
lat_ar = np.arange(lati,latf,resolution)
lon_grid, lat_grid = np.meshgrid(lon_ar,lat_ar)
# regrid
x_grid = griddata((flat_lon,flat_lat),flat_x,(lon_grid,lat_grid), method='nearest')
# filter out extrapolated values
cloud_points = _ndim_coords_from_arrays((flat_lon,flat_lat))
regrid_points = _ndim_coords_from_arrays((lon_grid.ravel(),lat_grid.ravel()))
tri = Delaunay(cloud_points)
outside_hull = tri.find_simplex(regrid_points) < 0
x_grid[outside_hull.reshape(x_grid.shape)] = np.nan
# filter out large triangles ??
# it would be easy if I could "subset" tri into a new scipy.spatial.Delaunay object
# new_tri = ??
# outside_hull = new_tri.find_simplex(regrid_points) < 0
The problem is that the convex hull has low quality (very large, shown in blue in example below) triangles that I would like to filter out as they don't represent the data well. I know how to filter them out in input points, but not in the regridded output. Here is the filter function:
def filter_large_triangles(
points: np.ndarray, tri: Optional[Delaunay] = None, coeff: float = 2.0
):
"""
Filter out triangles that have an edge > coeff * median(edge)
Inputs:
tri: scipy.spatial.Delaunay object
coeff: triangles with an edge > coeff * median(edge) will be filtered out
Outputs:
valid_slice: boolean array that selects "normal" triangles
"""
if tri is None:
tri = Delaunay(points)
edge_lengths = np.zeros(tri.vertices.shape)
seen = {}
# loop over triangles
for i, vertex in enumerate(tri.vertices):
# loop over edges
for j in range(3):
id0 = vertex[j]
id1 = vertex[(j + 1) % 3]
# avoid calculating twice for non-border edges
if (id0,id1) in seen:
edge_lengths[i, j] = seen[(id0,id1)]
else:
edge_lengths[i, j] = np.linalg.norm(points[id1] - points[id0])
seen[(id0,id1)] = edge_lengths[i, j]
median_edge = np.median(edge_lengths.flatten())
valid_slice = np.all(edge_lengths < coeff * median_edge, axis=1)
return valid_slice
The bad triangles are shown in blue below:
import matplotlib.pyplot as plt
no_large_triangles = filter_large_triangles(cloud_points,tri)
fig,ax = plt.subplot()
ax.triplot(points[:,0],points[:,1],tri.simplices,c='blue')
ax.triplot(points[:,0],points[:,1],tri.simplices[no_large_triangles],c='green')
plt.show()
Is it possible to create a new scipy.spatial.Delaunay object with only the no_large_triangles
simplices? The goal would be to use the find_simplex method on that new object to easily filter out points.
As an alternative how could I find the indices of points in regrid_points
that fall inside the blue triangles? (tri.simplices[~no_large_triangles]
)
So it is possible to modify the Delaunay object for the purpose of using find_simplex on a subset of simplices, but it seems only with the bruteforce algorithm.
# filter out extrapolated values
cloud_points = _ndim_coords_from_arrays((flat_lon,flat_lat))
regrid_points = _ndim_coords_from_arrays((lon_grid.ravel(),lat_grid.ravel()))
tri = Delaunay(cloud_points)
outside_hull = tri.find_simplex(regrid_points) < 0
# filter out large triangles
large_triangles = ~filter_large_triangles(cloud_points,tri)
large_triangle_ids = np.where(large_triangles)[0]
subset_tri = tri # this doesn't preserve tri, effectively just a renaming
# the _find_simplex_bruteforce method only needs the simplices and neighbors
subset_tri.nsimplex = large_triangle_ids.size
subset_tri.simplices = tri.simplices[large_triangles]
subset_tri.neighbors = tri.neighbors[large_triangles]
# update neighbors
for i,triangle in enumerate(subset_tri.neighbors):
for j,neighbor_id in enumerate(triangle):
if neighbor_id in large_triangle_ids:
# reindex the neighbors to match the size of the subset
subset_tri.neighbors[i,j] = np.where(large_triangle_ids==neighbor_id)[0]
elif neighbor_id>=0 and (neighbor_id not in large_triangle_ids):
# that neighbor was a "normal" triangle that should not exist in the subset
subset_tri.neighbors[i,j] = -1
inside_large_triangles = subset_tri.find_simplex(regrid_points,bruteforce=True) >= 0
invalid_slice = np.logical_or(outside_hull,inside_large_triangles)
x_grid[invalid_slice.reshape(x_grid.shape)] = np.nan
Showing that the new Delaunay object has only the subset of large triangles
import matplotlib.pyplot as plt
fig,ax = plt.subplot()
ax.triplot(cloud_points[:,0],cloud_points[:,1],subset_tri.simplices,color='red')
plt.show()
Plotting x_grid with pcolormesh before the filtering for large triangles (zoomed in the blue circle above):
After the filtering: