I have a 2d array representing height. It contains peaks, and then a linear gradient in between the peaks.
I am generating the array myself, based on the positions of the peaks. The gradient is simply the distance to the nearest peak. So the resolution of the actual array is higher than that of the posted images.
(Originally I only posted white/blue images, I've left the rest of them as is)
Original image (visualisation of the array):
I would like to find the valleys and their intersections, as a set of line segments, visualised as below.
I'm aware of various approaches to finding the minima, but no way to explicitly figuring out the exact intersections and the line segments in between them.
I've been working on this for a large number of hours but haven't got anywhere. I've tried things like below (arr is the array):
minima = (arr != scipy.ndimage.minimum_filter(arr, size=10, mode='constant', cval=0.0))
modified_arr = np.where(minima, arr, 1500)
and I've also tried iterating explicitly over rows/columns separately, with similar code to above. Both of these approaches are 'spotty', they identify a lot of the valleys when plotted, but they're not fully connected line segments.
Crucially, what I haven't worked out how to do is actually define where the intersections are and the line segments in between them formed by the valleys. Can anyone help point me in the right direction?
I'd like to also know the actual depth value of the intersection points as well if possible.
Mathematically, I know that the intersections are defined by points which are an equal distance from three peaks, and the valleys are defined by points which are an equal distance from two peaks.
TL;DR - I want to find the intersections/valleys as a set of line segments, can anyone help?
Update
I've got part of the way. It's a bit rough, but the below does find all the intersections and valleys, except there are extra points found. This is because the only way I found which would consistently discover all the intersections/valleys was to allow a margin. Also, there's an extra intersection and valley in the top left, I forgot to indicate those in the original picture, there is actually a second peak directly beside another one.`
I had to play around with the 'magic numbers' at the top to find a happy point where all the intersections/valleys are found but not too many points are included.
My current idea for moving forwards is to apply K-means to the intersections to find the centroids of those points. For the valleys I'm not so sure, possibly look at defining the angles that nearby valley points are at in relation to intersections, and then finding which intersection lies on the same angle. I also thought of limiting the discovered valley points to those within a certain distance of an intersection, which would possibly make K-means appropriate for those as well.
If anyone has anything to add I'd appreciate it. Otherwise I'll post further results when I get there.
max_int_diff = 0.85 # max intersection difference
max_valley_diff = 0.45 # max valley difference
max_edge_dist = 2 # max distance from edge of graph
intersections = []
valleys = []
for i in range(arr.shape[1]):
for j in range(arr.shape[0]):
dists = [(hypot((i - node.x), (j - node.y)), node) for node in peaks.values()]
closest = sorted(dists, key=lambda _x: _x[0])
# intersections
diff_1_2 = abs(closest[0][0] - closest[1][0])
diff_1_3 = abs(closest[0][0] - closest[2][0])
diff_2_3 = abs(closest[1][0] - closest[2][0])
if diff_1_3 <= max_int_diff and diff_2_3 <= max_int_diff and diff_1_2 <= max_int_diff:
intersections.append((i, j))
# edge of graph valleys bound the graph, so they are counted as intersections
elif diff_1_2 <= max_valley_diff \
and (i <= max_edge_dist or i >= array_size - max_edge_dist
or j <= max_edge_dist or j >= array_size - max_edge_dist):
intersections.append((i, j))
# valleys
elif diff_1_2 <= max_valley_diff:
valleys.append((i, j))
Update 2
I've now managed to use ndimage.minimum_filter
to accurately find the majority of the valley points :) I combined individual axis processing to get the below result. Still missing some points though. Any thoughts on how I can improve this would be appreciated. I've tried changing the size up/down but this is the best result I can get.
My current idea to find the missing points is to add two more filters that look at the two 45 degree axes, as it seems to be the angle of the valleys which is affecting it.
# using scipy.ndimage.minimum_filter
minima_0 = (arr != minimum_filter(arr, axes=0, size=10, mode='constant', cval=20.0))
minima_1 = (arr != minimum_filter(arr, axes=1, size=10, mode='constant', cval=20.0))
# return a masked arr, where minima have been replaced with a value
arr_0 = np.where(minima_0, arr, 50)
arr_1 = np.where(minima_1, arr, 50)
mg = arr_0 + arr_1
Update 2
I've posted an answer. The real issue was my own lack of knowledge in mathematics, due to having never studied it formally beyond high school. Initially I was overcomplicating it by presenting it as an image processing problem, even though my actual data was just an array of points.
Hopefully this all might help someone else!
The answer in the end was indeed Voronoi processing, as suggested by both @MarkSetchell and @ChristophRackwitz. Thanks to both of you for your comments, I wouldn't have found the answer without your help!
Essentially, what I've been looking for is called the Voronoi Diagram.
Refer to this link for helpful examples using Scipy
:
https://docs.scipy.org/doc/scipy/tutorial/spatial.html#voronoi-diagrams
I based some of my own code from examples on that page, particularly the infinite edges which were tricky.
As mentioned, I already have the coordinates of the peak points. From there it's very simple to generate the Voronoi Diagram:
from scipy.spatial import Voronoi
points = np.array([[612, 578], [487, 958], [324, 972], [370, 886], [884, 435], [64, 557]])
vor = Voronoi(points)
The Voronoi object can then be used to get the intersections and edges, noting that bounded edges and infinite edges are treated differently.
def voronoi_intersections(voronoi):
"""Returns the Voronoi intersections. This does not include the imaginary 'end' points of the unbounded edges."""
return list(voronoi.vertices)
def voronoi_bounded_edges(voronoi):
"""Returns all bounded Voronoi edges from the Voronoi Diagram as indices into the points array."""
arr = np.array(voronoi.ridge_vertices)
mask = np.all(arr >= 0, axis=1)
masked = arr[mask]
return masked
def voronoi_all_edges(voronoi, points):
"""Gets all Voronoi edges from the Voronoi Diagram as indices into the points array, including infinite edges.
See: https://docs.scipy.org/doc/scipy/tutorial/spatial.html#voronoi-diagrams:~:text=The%20ridges%20extending%20to%20infinity%20require%20a%20bit%20more%20care%3A
"""
all_voronoi_vertices = voronoi_intersections(voronoi) # will add the interpolated 'far' points to this
all_voronoi_edges = list(voronoi_bounded_edges(voronoi)) # will add the infinite edges to this
center = points.mean(axis=0)
for pointidx, simplex in zip(voronoi.ridge_points, voronoi.ridge_vertices):
simplex = np.asarray(simplex)
if np.any(simplex < 0):
i = simplex[simplex >= 0][0] # finite end Voronoi vertex
t = points[pointidx[1]] - points[pointidx[0]] # tangent
t = t / np.linalg.norm(t)
n = np.array([-t[1], t[0]]) # normal
midpoint = points[pointidx].mean(axis=0)
# In the below I've multiplied the vector 10x to lengthen the returned edge well outside the bounds
# of the instance. I believe 10x is enough. This is a bit hacky, but I couldn't work out why it wasn't
# matching the source code plotting output.
far_point = voronoi.vertices[i] + 10 * np.sign(np.dot(midpoint - center, n)) * n * 100
all_voronoi_vertices.append(far_point)
all_voronoi_edges.append(np.array([i, len(all_voronoi_vertices) - 1]))
And here's the plot of the peaks and the edges, not including the intersections: