I need to make a map of distances to the nearest object. I have a solution where i am looping over every point of a map, and every object, calculating the distance to all of them, and then leaving only minimum distance. The problem here is that if I am woking with real data, the map can easily contain 10s of millions of points, and there can be more than 100 objects. Is there any better code implementation for solving this problem?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
coord_dict = {"X": [],
"Y": []}
for x_value in range(0, 10000, 50):
for y_value in range(0, 5000, 50):
coord_dict["X"].append(x_value)
coord_dict["Y"].append(y_value)
map_df = pd.DataFrame(coord_dict)
well_points_dict = {"X": [500, 1500, 4000, 5500, 6250, 7500, 8000, 9000],
"Y": [500, 4000, 2000, 1500, 500, 5000, 100, 2500]}
wells_df = pd.DataFrame(well_points_dict)
calculations_count = 0
distance_map = np.zeros(map_df.shape)
for i in range(map_df.shape[0]):
d = []
for j in range(wells_df.shape[0]):
d.append(((map_df["X"].iloc[i]-wells_df["X"][j])**2 + (map_df["Y"].iloc[i]- wells_df["Y"][j])**2)**0.5)
calculations_count += 1
dd = min(d)
distance_map[i,1] = dd
# print(calculations_count)
plt.figure(figsize=(10,10))
plt.scatter(x=map_df["X"],y=map_df["Y"],c=distance_map[:,1],s=1,cmap='terrain')
for i in range(len(wells_df)):
plt.plot(wells_df["X"][i],wells_df["Y"][i], color='black', marker='o',markersize=3)
plt.title('Calculated map')
plt.xlabel('X')
plt.ylabel('Y')
plt.axis('scaled')
plt.tight_layout()
plt.colorbar(shrink=0.25)
Result map example:
KDTree is what you are looking for, there is an implementation of it in scipy.spatial
.
import numpy as np
import matplotlib.pyplot as plt
from scipy import spatial
Given your trial points:
x_value = np.arange(0, 10000, 50)
y_value = np.arange(0, 5000, 50)
X, Y = np.meshgrid(x_value, y_value)
points = np.stack([X.ravel(), Y.ravel()]).T
And well points:
x_well = np.array([500, 1500, 4000, 5500, 6250, 7500, 8000, 9000])
y_well = np.array([500, 4000, 2000, 1500, 500, 5000, 100, 2500])
wells = np.stack([x_well, y_well]).T
We can create a KDTree:
interpolator = spatial.KDTree(wells)
And query efficiently the tree to get distances and also indices of which point it is closer:
distances, indices = interpolator.query(points)
# 7.12 ms ± 711 µs per loop (mean ± std. dev. of 30 runs, 100 loops each)
Plotting the result leads to:
fig, axe = plt.subplots()
axe.scatter(*points.T, marker=".", c=distances)
axe.scatter(*wells.T, color="black")
axe.grid()
We see the Voronoi diagram appearing on the color map which is a good confirmation distances are correctly interpreted wrt reference points (wells):
voronoi = spatial.Voronoi(wells)
# ...
spatial.voronoi_plot_2d(voronoi, ax=axe)
Where the object Voronoi
is the Voronoi diagram based on your reference points (wells) and voronoi_plot_2d
an helper to draw it on axes.