My question in short: Is it possible to constrain the domain of the Voronoi vertices generated by Qhull? If so, how can one do so?
My question with context: I am working on a data visualization in which I have points in a 2D field. The points overlap a bit, so I'm "jittering" them slightly so as to make them not overlap.
My current approach to this task is to use Lloyd's algorithm to move the points. Lloyd's algorithm essentially takes the initial point positions, computes a Voronoi map, and moves each point to the center of its Voronoi region during each iteration of the algorithm.
Here's an example in Python:
from scipy.spatial import Voronoi
from scipy.spatial import voronoi_plot_2d
import matplotlib.pyplot as plt
import numpy as np
import sys
class Field():
'''
Create a Voronoi map that can be used to run Lloyd relaxation on an array of 2D points.
For background, see: https://en.wikipedia.org/wiki/Lloyd%27s_algorithm
'''
def __init__(self, arr):
'''
Store the points and bounding box of the points to which Lloyd relaxation will be applied
@param [arr] arr: a numpy array with shape n, 2, where n is number of points
'''
if not len(arr):
raise Exception('please provide a numpy array with shape n,2')
x = arr[:, 0]
y = arr[:, 0]
self.bounding_box = [min(x), max(x), min(y), max(y)]
self.points = arr
self.build_voronoi()
def build_voronoi(self):
'''
Build a Voronoi map from self.points. For background on self.voronoi attrs, see:
https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.spatial.Voronoi.html
'''
eps = sys.float_info.epsilon
self.voronoi = Voronoi(self.points)
self.filtered_regions = [] # list of regions with vertices inside Voronoi map
for region in self.voronoi.regions:
inside_map = True # is this region inside the Voronoi map?
for index in region: # index = the idx of a vertex in the current region
# check if index is inside Voronoi map (indices == -1 are outside map)
if index == -1:
inside_map = False
break
# check if the current coordinate is in the Voronoi map's bounding box
else:
coords = self.voronoi.vertices[index]
if not (self.bounding_box[0] - eps <= coords[0] and
self.bounding_box[1] + eps >= coords[0] and
self.bounding_box[2] - eps <= coords[1] and
self.bounding_box[3] + eps >= coords[1]):
inside_map = False
break
# store hte region if it has vertices and is inside Voronoi map
if region != [] and inside_map:
self.filtered_regions.append(region)
def find_centroid(self, vertices):
'''
Find the centroid of a Voroni region described by `vertices`, and return a
np array with the x and y coords of that centroid.
The equation for the method used here to find the centroid of a 2D polygon
is given here: https://en.wikipedia.org/wiki/Centroid#Of_a_polygon
@params: np.array `vertices` a numpy array with shape n,2
@returns np.array a numpy array that defines the x, y coords
of the centroid described by `vertices`
'''
area = 0
centroid_x = 0
centroid_y = 0
for i in range(len(vertices)-1):
step = (vertices[i, 0] * vertices[i+1, 1]) - (vertices[i+1, 0] * vertices[i, 1])
area += step
centroid_x += (vertices[i, 0] + vertices[i+1, 0]) * step
centroid_y += (vertices[i, 1] + vertices[i+1, 1]) * step
area /= 2
centroid_x = (1.0/(6.0*area)) * centroid_x
centroid_y = (1.0/(6.0*area)) * centroid_y
return np.array([centroid_x, centroid_y])
def relax(self):
'''
Moves each point to the centroid of its cell in the Voronoi map to "relax"
the points (i.e. jitter them so as to spread them out within the space).
'''
centroids = []
for region in self.filtered_regions:
vertices = self.voronoi.vertices[region + [region[0]], :]
centroid = self.find_centroid(vertices) # get the centroid of these verts
centroids.append(list(centroid))
self.points = centroids # store the centroids as the new point positions
self.build_voronoi() # rebuild the voronoi map given new point positions
##
# Visualize
##
# built a Voronoi diagram that we can use to run lloyd relaxation
field = Field(points)
# plot the points with no relaxation relaxation
plt = voronoi_plot_2d(field.voronoi, show_vertices=False, line_colors='orange', line_alpha=0.6, point_size=2)
plt.show()
# relax the points several times, and show the result of each relaxation
for i in range(6):
field.relax() # the .relax() method performs lloyd relaxation
plt = voronoi_plot_2d(field.voronoi, show_vertices=False, line_colors='orange', line_alpha=0.6, point_size=2)
plt.show()
As we can see, during each iteration (frames 2 and 3) the points in the original dataset (frame 1, top) overlap less and less, which is great!
The trouble with this approach is I'm currently removing from the plot those points whose voronoi region boundaries lie beyond the domain of the initial dataset. (If I don't do that, the outermost points quickly shoot out into hyperspace and move very far from the rest of the points.) This ultimately means I end up discarding points, which is no good.
I thought I could solve this problem by constraining the Qhull Voronoi domain so as to only create Voronoi vertices within the original data domain.
Is it possible to constrain Qhull in this way? Any help others can offer would be greatly appreciated!
After receiving @tfinniga's excellent response I put together a blog post detailing Lloyd iteration in its bounded and unbounded forms. I also put together a little package lloyd to make it easier to run bounded Lloyd iterations on a dataset. I wanted to share these resources in case they help others pursuing related analysis.
The core problem that you're running into is that there's nothing bounding Lloyd's algorithm, so the points explode. Two ways to fix this:
Take the voronoi diagram you're getting, and manually clip it to your bounding rectangle before computing the centroid. This will give you the correct solution - you can test it against the example you linked in the wikipedia article and make sure it matches.
Add an artificial boundary of points. For example, you could add points at the 4 corners of your bounding box, or some along each side, which you don't move. These will stop the interior points from exploding. This won't give you the 'correct' result from Lloyd's algorithm, but might give you an output useful for your visualization, and is easier to implement.