Tags: python, autocorrelation

Is there a faster way to perform this neighbour-finding operation?


I'm trying to calculate Moran's I in Python (this is the underlying equation). My inputs are an N×3 array coords containing the coordinates of each point, and an array z of length N containing each point's value minus the overall mean. The operation requires each value of z to be multiplied with the value of every point within a set distance (here set to 1.99). My problem is that in my case N ≈ 2 million, so the find_neighbours operation is very slow. Is there a way I could speed this up?

import numpy as np

def find_neighbours(coords, idx, k):
    # Euclidean distance from point idx to every other point
    distances = np.sqrt(np.power(coords - coords[idx], 2).sum(axis=1))
    distances[idx] = np.inf  # exclude the point itself
    return np.argwhere(distances <= k)

z   = x - np.mean(x)   # deviations from the overall mean
n   = len(coords)
A   = 0                # numerator: sum of z_i * z_j over neighbour pairs
B   = np.sum(z ** 2)   # denominator: sum of squared deviations
S_0 = 0                # total number of neighbour pairs

for idx in range(len(coords)):
    neighbours = find_neighbours(coords, idx, 1.99)
    S_0 += len(neighbours)
    A += np.sum(z[neighbours] * z[idx])

I = (n / S_0) * (A / B)

Solution

  • This is a classical problem with plenty of literature about it. It's called radius neighbour search in three-dimensional point clouds. You need to store your points in a better data structure to make the search faster. I would suggest an octree.

    Check python code here and adapt to your case.

    For explanations, check this paper.
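
    As a concrete illustration of the same idea, here is a minimal sketch using a k-d tree (SciPy's cKDTree) rather than an octree; both replace the brute-force O(N²) scan with a spatial index. query_pairs returns every pair of points within the cutoff in a single call, so the per-point Python loop disappears entirely. The random coords and x arrays below are stand-ins for your real data:

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    coords = rng.random((1000, 3)) * 10   # stand-in for the real Nx3 coords
    x = rng.random(1000)                  # stand-in for the real values
    z = x - np.mean(x)

    tree = cKDTree(coords)
    # All unique pairs (i, j) with i < j whose distance is <= 1.99
    pairs = tree.query_pairs(r=1.99, output_type='ndarray')

    n = len(coords)
    S_0 = 2 * len(pairs)   # each unordered pair counts twice (i->j and j->i)
    A = 2 * np.sum(z[pairs[:, 0]] * z[pairs[:, 1]])
    B = np.sum(z ** 2)
    I = (n / S_0) * (A / B)

    This gives the same I as the loop version (each unordered pair is doubled to match the ordered i→j, j→i counting), and tree construction plus the pair query scales far better than recomputing all N distances per point.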