Tags: python, arrays, numpy, euclidean-distance

Can this similarity measure between NumPy arrays of different sizes be expressed entirely with the NumPy API?


This script:

import numpy as np
from numpy.linalg import norm

a = np.array([(1, 2, 3), (1, 4, 9), (2, 4, 4)])
b = np.array([(1, 3, 3), (1, 5, 9)])
r = sum([min(norm(a-e, ord=1, axis=1)) for e in b])

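computes a similarity measure r between two NumPy arrays a and b that have different numbers of rows: for each row of b, it takes the smallest L1 distance to any row of a, then sums those minima. Is there a way to express it entirely with the NumPy API, without the Python-level loop, for greater efficiency?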


Solution

  • You can do this:

    r = np.linalg.norm(a[:,None,:]-b[None], ord=1, axis=2).min(0).sum()
    

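    Here, a[:,None,:] has shape (3, 1, 3) and b[None] has shape (1, 2, 3), so the subtraction broadcasts to a (3, 2, 3) array holding the difference between every row of a and every row of b. The norm call is the same as in the loop, except that it is taken over the last axis (axis=2), which yields a (3, 2) matrix of L1 distances. .min(0) then takes the minimum over the rows of a for each row of b (the minimum down each column of that matrix), and .sum() adds those minima to get the final result.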
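
To make the broadcasting steps concrete, here is a minimal sketch that splits the one-liner into its intermediate arrays and checks it against the loop from the question. The names diff, dist, nearest, r_loop and r_vec are introduced here for illustration only and are not part of the original answer.

```python
import numpy as np

a = np.array([(1, 2, 3), (1, 4, 9), (2, 4, 4)])
b = np.array([(1, 3, 3), (1, 5, 9)])

# Loop version from the question, kept as the reference result.
r_loop = sum(min(np.linalg.norm(a - e, ord=1, axis=1)) for e in b)

# Broadcast version, one step at a time.
diff = a[:, None, :] - b[None]               # (3, 1, 3) - (1, 2, 3) -> (3, 2, 3)
dist = np.linalg.norm(diff, ord=1, axis=2)   # L1 distance for each (a row, b row) pair -> (3, 2)
nearest = dist.min(0)                        # for each row of b, distance to its closest row of a -> (2,)
r_vec = nearest.sum()

print(diff.shape, dist.shape, nearest.shape)
print(r_loop, r_vec)
```

Note that the intermediate diff array holds len(a) * len(b) * 3 elements, so for large inputs the broadcast version trades memory for speed.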