python numpy geometry euclidean-distance

Calculate the euclidian distance between an array of points to a line segment in Python without for loop

I'm looking for a function to compute the euclidian distance between a numpy array of points with two coordinates (x, y) and a line segment. My goal is to have a result in under 0.01 sec for a line segment and 10k points.

I already found a function for a single point. But running a for loop is very inefficient.

I also found this function that calculates the distance to the infinite line:

def line_dists(points, start, end):
    if np.all(start == end):
        return np.linalg.norm(points - start, axis=1)

    vec = end - start
    cross = np.cross(vec, start - points)
    return np.divide(abs(cross), np.linalg.norm(vec))

It is very efficient and I would like to have a similar approach for a bounded line.

Thank you for your help.

Solution

Setup – test point P, endpoints A and B:

Take the dot-product of P - A with normalize(A - B) to obtain the signed parallel distance component s from A. Likewise with B and t.
Take the maximum of {s, t, 0} to get the clamped parallel distance component. This will only be non-zero if the point is outside the "perpendicular boundary" (Voronoi region? not sure) of the segment.
Calculate the perpendicular distance component as before, using the cross-product.
Use Pythagoras to compute the required closest distance (gray line from P to A).

The above is branchless and thus easy to vectorize with numpy:

def lineseg_dists(p, a, b):
    # Handle case where p is a single point, i.e. 1d array.
    p = np.atleast_2d(p)

    # TODO for you: consider implementing @Eskapp's suggestions
    if np.all(a == b):
        return np.linalg.norm(p - a, axis=1)

    # normalized tangent vector
    d = np.divide(b - a, np.linalg.norm(b - a))

    # signed parallel distance components
    s = np.dot(a - p, d)
    t = np.dot(p - b, d)

    # clamped parallel distance
    x = np.maximum.reduce([s, t, np.zeros(len(p))])

    # perpendicular distance component
    # switch to dot-prod method which is dimension-agnostic
    y = np.linalg.norm(p - b - t * d)

    # use hypot() to improve accuracy
    return np.hypot(x, y)