pythonscipydelaunayscipy-spatial

Speed up "find_simplex" from scipy.spatial.Delauna


I built an application using the Delaunay triangulation from Scypi. In order to validate it, I want to do a Hold-One-Out test, which means that the code snippet mentioned below gets called a lot of times (~1e9). Thus, I want to make it as fast as possible.

Here is the minimum working example I want to speed up:

from scipy.spatial import Delaunay as Delaunay
import numpy as np
import time

n_pts = 100      # around 1e9 in the real application
pts = np.random.random((n_pts, 2))

t = time.time()
for i in range(n_pts):
    delaunay = Delaunay(pts[np.arange(n_pts)!=i])
    simplex = delaunay.find_simplex(pts[i])
print(time.time()-t)

Most of the time is used up by the find_simplex method, around 200ms of 300ms on my machine. Is there any way to speed it up? I already took a look at the qhull_options in the Delaunay constructor, however I had no success there.

Please note that I cannot change the overall structure as the "real" program works just fine and this computation is only done for validation. Thanks a lot!


Solution

  • It's hard to say what exactly goes under the hood of the find_simplex method, but my guess is that in the first call it constructs some search structure and since you use it just once the construction initialization time isn't amortized on many queries.

    A simple solution for this is to run a linear search, without calling the find_simplex method. Since your code constructs a Delaunay triangulation each iteration, the runtime will be dominated by the triangulation construction and the linear search time is negligible.

    Here is a vectorized numpy function that does that.

     def is_in_triangle(pt, p0, p1, p2):
        """ Check in which of the triangles the point pt lies.
            p0, p1, p2 are arrays of shape (n, 2) representing vertices of triangles,
            pt is of shape (1, 2).
            Assumes p0, p1, p2 are oriented counter clockwise (as in scipy's Delaunay)
        """ 
        vp0 = pt - p0
        v10 = p1 - p0
        cross0 = vp0[:, 0] * v10[:, 1] - vp0[:, 1] * v10[:, 0]  # is pt to left/right of p0->p1
    
        vp1 = pt - p1
        v21 = p2 - p1
        cross1 = vp1[:, 0] * v21[:, 1] - vp1[:, 1] * v21[:, 0]  # is pt to left/right of p1->p2
        
        vp2 = pt - p2
        v02 = p0 - p2
        cross2 = vp2[:, 0] * v02[:, 1] - vp2[:, 1] * v02[:, 0]  # is pt to left/right of p2->p0
    
        return (cross0 < 0) & (cross1 < 0) & (cross2 < 0)  # pt should be to the left of all triangle edges
    
    

    I modified your code to run with this function:

    t = time.time()
    for i in range(n_pts):
        delaunay = Delaunay(pts[np.arange(n_pts)!=i])
    
        p0 = delaunay.points[delaunay.simplices[:, 0], :]
        p1 = delaunay.points[delaunay.simplices[:, 1], :]
        p2 = delaunay.points[delaunay.simplices[:, 2], :]
        pt = pts[i].reshape((1, 2))
        pt_in_simps_mask = is_in_triangle(pt, p0, p1, p2)
        simp_ind_lst = np.where(pt_in_simps_mask)[0]
        if len(simp_ind_lst) == 0:
            simp_ind = -1
        else:
            simp_ind = simp_ind_lst[0]
    
    print("new time: {}".format(time.time()-t))
    

    On my machine, when running with 100 points this code run in approximately 0.036s compared to 0.13s of your original code (and 0.030s for the code without the query at all, just the Delaunay constructions).