pythonnumpyself-organizing-maps

Numpy - Finding matches across multiple co-ordinates


I'm using somoclu to produce an emergent Self-Organising Map of some data. Once I have the BMUs (Best Matching Units) I'm performing a Delaunay Triangulation on the co-ordinates of the BMUs in order to find each BMU's neighbours in the SOM.

In the following snippet of Python, is there a more Pythonic version of the a == c and b == d conditional? In other words, how can I compare bmu and point directly without splitting out the separate co-ordinates?

points = np.unique(np.array(som.bmus), axis = 0)
for idx, bmu in enumerate(som.bmus):
    a, b = bmu
    for point_idx, point in enumerate(points):
        c, d = point
        if a == c and b == d: # More Pythonic version of this line?
            print (idx, point_idx)
            break

Solution

  • Approach #1

    We are working with NumPy arrays, so we can leverage broadcasting for a vectorized solution -

    ar = np.array(som.bmus)
    points = np.unique(ar, axis = 0)
    
    mask = (ar[:,0,None]==points[:,0]) & (ar[:,1,None]==points[:,1])
    indices = np.argwhere(mask)
    

    Approach #1-G

    One more compact way to get mask, which covers for a generic no. of columns in ar, would be -

    mask = (ar[:,None] == points).all(axis=2)
    

    Approach #2

    Another memory-efficient approach for generic no. of cols would be with views and np.searchsorted -

    # https://stackoverflow.com/a/45313353/ @Divakar
    def view1D(a, b): # a, b are arrays
        a = np.ascontiguousarray(a)
        b = np.ascontiguousarray(b)
        void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
        return a.view(void_dt).ravel(),  b.view(void_dt).ravel()
    
    n = len(ar)
    indices = np.empty((n,2),dtype=int)
    indices[:,0] = np.arange(n)
    a,b = view1D(ar, points) # ar, points from app#1
    indices[:,1] = np.searchsorted(b, a)