I am trying to run Density-Based Spatial Clustering (DBSCAN) on a point cloud dataset, which is a series of points with x, y, z coordinates. One of the parameters is the minimum distance. How do I find the minimal distance between a point and another in space in Python? Many thanks!
First, you can write a function that computes the Euclidean distance between two points represented as NumPy arrays:
import numpy as np
distance = lambda p1, p2: np.sqrt(np.sum((p1 - p2) ** 2, axis=0))
I can't think of anything better than the naive O(n²) approach, which checks every pair of points, to find the minimum distance:
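import itertools

def min_distance(cloud):
    # Visit every unordered pair of points exactly once: O(n²) distance evaluations.
    pairs = itertools.combinations(cloud, 2)
    # Use the built-in min: in Python 3, map() is lazy and np.min cannot reduce it.
    return min(distance(p1, p2) for p1, p2 in pairs)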
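As a cross-check of the brute-force idea, here is a minimal sketch that uses only the standard library. It assumes Python 3.8 or later for `math.dist`, and the name `min_distance_stdlib` and the three sample points are chosen purely for illustration.

```python
import itertools
import math

def min_distance_stdlib(points):
    """Brute-force minimum pairwise distance over a list of (x, y, z) tuples."""
    # math.dist (Python 3.8+) returns the Euclidean distance between two points.
    return min(math.dist(p, q) for p, q in itertools.combinations(points, 2))

# Three sample points, used only to exercise the function.
points = [(1.2, 3.4, 2.55), (2.77, 7.34, 23.4), (5.66, 64.3, 4.33)]
print(min_distance_stdlib(points))
```

With n points this still evaluates n(n-1)/2 distances, so it is only practical for small clouds.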
Finally, you just need to get the points from your file. I will assume that it looks like this:
cloud.csv
x, y, z
1.2, 3.4, 2.55
2.77, 7.34, 23.4
5.66, 64.3, 4.33
def get_points(filename):
    with open(filename, 'r') as file:
        # skip_header=1 drops the "x, y, z" header row.
        rows = np.genfromtxt(file, delimiter=',', skip_header=1)
    return rows
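To see what the loader is expected to return, the following sketch parses the same sample rows from an in-memory string instead of a file. The use of `io.StringIO` is an assumption made only to keep the example self-contained.

```python
import io
import numpy as np

# In-memory stand-in for cloud.csv (same header and rows as the sample file).
sample = io.StringIO(
    "x, y, z\n"
    "1.2, 3.4, 2.55\n"
    "2.77, 7.34, 23.4\n"
    "5.66, 64.3, 4.33\n"
)

# One row per point, one column per coordinate; the header row is skipped.
cloud = np.genfromtxt(sample, delimiter=',', skip_header=1)
print(cloud.shape)
print(cloud[0])
```

The result should be a 2-D array with one row per point, which is the layout that both `itertools.combinations` and `pdist` work on.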
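Putting it all together:
import itertools
import numpy as np

# Euclidean distance between two points given as NumPy arrays.
distance = lambda p1, p2: np.sqrt(np.sum((p1 - p2) ** 2, axis=0))

def min_distance(cloud):
    # Visit every unordered pair of points exactly once: O(n²) distance evaluations.
    pairs = itertools.combinations(cloud, 2)
    # Use the built-in min: in Python 3, map() is lazy and np.min cannot reduce it.
    return min(distance(p1, p2) for p1, p2 in pairs)

def get_points(filename):
    with open(filename, 'r') as file:
        # skip_header=1 drops the "x, y, z" header row.
        rows = np.genfromtxt(file, delimiter=',', skip_header=1)
    return rows

filename = 'cloud.csv'
cloud = get_points(filename)
min_dist = min_distance(cloud)
print(min_dist)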
Output
21.277006368378046
As Amiga500 points out, it is possible to use scipy.spatial.distance. We can then rewrite min_distance as follows:
import numpy as np
from scipy.spatial.distance import pdist
min_distance = lambda cloud: np.min(pdist(cloud))
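Note that `pdist` still computes all n(n-1)/2 pairwise distances, so memory can become the limit on large clouds. One possible alternative, not part of the answer above, is a KD-tree nearest-neighbour query. The sketch below assumes `scipy.spatial.cKDTree` is available, and the name `min_distance_kdtree` is hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def min_distance_kdtree(cloud):
    """Smallest nearest-neighbour distance in an (n, 3) array of points."""
    tree = cKDTree(cloud)
    # k=2 because each point's closest match is itself (distance 0);
    # column 1 holds the distance to its nearest *other* point.
    dists, _ = tree.query(cloud, k=2)
    return dists[:, 1].min()

# Same three sample points as in cloud.csv.
cloud = np.array([[1.2, 3.4, 2.55],
                  [2.77, 7.34, 23.4],
                  [5.66, 64.3, 4.33]])
print(min_distance_kdtree(cloud))
```

As with `pdist`, duplicate points in the cloud give a minimum distance of 0, so you may want to remove duplicates first if that is not meaningful for your data.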