I have an array of points:
centers = array([[0,0,0,0],
                 [0,1,0,0],
                 [0,0,0,1],
                 [1,0,0,0]])
Using scipy.ndimage.distance_transform_edt() with return_indices=True, the index of the point closest to each background element is returned in a separate array:
indexes = array([[[1,1],[1,1],[1,1],[3,2]],
                 [[1,1],[1,1],[1,1],[3,2]],
                 [[0,3],[1,1],[3,2],[3,2]],
                 [[0,3],[0,3],[0,3],[3,2]]])
I then label my original array to have unique values:
labeled_centers = array([[0,0,0,0],
                         [0,2,0,0],
                         [0,0,0,3],
                         [4,0,0,0]])
My question is, what would be the most efficient way to use my array of indexes to create a new array, where every point is labeled with the label of the closest center point?
labeled_image = array([[2,2,2,3],
                       [2,2,2,3],
                       [4,2,3,3],
                       [4,4,4,3]])
So far I have achieved my desired result by looping through each value in the array, like so:
classed = np.zeros_like(centers)
for x in range(classed.shape[0]):
    for y in range(classed.shape[1]):
        classed[x,y] = labeled_centers[indexes[0,x,y],indexes[1,x,y]]
However, while my example is a small array, my real use case would involve arrays of millions if not billions of datapoints, so I am trying to avoid writing a per-value "for" loop. Is there a more efficient way to achieve the same result?
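First, changing your code into something that actually runs (each entry of indexes is a [column, row] pair, so the lookup reads the row from position 1 and the column from position 0):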
import numpy as np

centers = np.array([[0,0,0,0],
                    [0,1,0,0],
                    [0,0,0,1],
                    [1,0,0,0]])
indexes = np.array([[[1,1],[1,1],[1,1],[3,2]],
                    [[1,1],[1,1],[1,1],[3,2]],
                    [[0,3],[1,1],[3,2],[3,2]],
                    [[0,3],[0,3],[0,3],[3,2]]])
labeled_centers = np.array([[0,0,0,0],
                            [0,2,0,0],
                            [0,0,0,3],
                            [4,0,0,0]])
classed = np.zeros_like(centers)
for x in range(classed.shape[0]):
    for y in range(classed.shape[1]):
        classed[x,y] = labeled_centers[indexes[x,y,1],indexes[x,y,0]]
Now do the same as a simple and efficient vectorized one-liner using advanced (integer array) indexing, which looks up every element at once instead of one per loop iteration:
classed = labeled_centers[indexes[:,:,1],indexes[:,:,0]]
print(classed)
Result:
[[2 2 2 3]
 [2 2 2 3]
 [4 2 3 3]
 [4 4 4 3]]
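One more note, in case your real index array is laid out differently from the one you posted. As far as I know, distance_transform_edt with return_indices=True returns the indices axis-first, i.e. with shape (2, rows, cols) and the row indices in the first plane, which is also what your original loop (indexes[0,x,y], indexes[1,x,y]) seems to expect. Here is a minimal sketch for that layout. The name axis_first is my own, and I build it from your posted array purely for illustration instead of calling SciPy:

```python
import numpy as np

labeled_centers = np.array([[0, 0, 0, 0],
                            [0, 2, 0, 0],
                            [0, 0, 0, 3],
                            [4, 0, 0, 0]])

# Index array as posted in the question: shape (rows, cols, 2),
# where the last axis holds a (column, row) pair.
indexes = np.array([[[1, 1], [1, 1], [1, 1], [3, 2]],
                    [[1, 1], [1, 1], [1, 1], [3, 2]],
                    [[0, 3], [1, 1], [3, 2], [3, 2]],
                    [[0, 3], [0, 3], [0, 3], [3, 2]]])

# Hypothetical axis-first copy (assumption: this mirrors the layout
# SciPy returns): shape (2, rows, cols), row indices in plane 0.
axis_first = np.stack([indexes[..., 1], indexes[..., 0]])

# A tuple of index arrays triggers advanced indexing, so every
# element gets its label in one call, with no Python-level loop.
classed_axis_first = labeled_centers[tuple(axis_first)]
print(classed_axis_first)
```

With that layout, tuple(axis_first) is equivalent to writing labeled_centers[axis_first[0], axis_first[1]], and the same line works unchanged for 3-D or higher-dimensional arrays.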