matlabmatrixvectorizationeuclidean-distance

How to compute the Euclidean distance between two complex matrix by vectorization?


X=[x_1,x_2,...,x_N] is a [S,N] complex matrix. For example, S=3 and x_1=[1+2j,2+3j,3+4j]'.

D is the distance matrix of X, which means D(i,j) is the Euclidean distance between x_i and x_j.

my code:

D = zeros(N, N);
for i = 1:N
    for j = i:N 
        D(i, j) = norm(X(:, i) - X(:, j));
    end
end
D = D + D.';

How to simplify it by vectorization?

I try using two loops but it costs time when N is large. And the complex matrix can not be computed by pdist.


Solution

  • To vectorize this code, you need to build an intermediate 3D matrix:

    D = vecnorm(reshape(X, S, 1, N) - X);
    D = reshape(D, N, N);  % or just squeeze(D)
    

    This takes up lots of space, but since S=3, the additional space is not prohibiting and doesn't seem to slow down the computation. For much larger S, vectorized code would use the cache less efficiently because of the larger memory footprint.

    Note that loops used to be slow a long time ago, but this is no longer the case. In very old MATLAB versions, and in Octave, it is always beneficial to avoid loops, but in modern MATLAB versions it is not. One needs to time the code to know for sure what is fastest.


    To speed up OP's code without removing loops, we can do the following:

    D = zeros(N, N);
    for i = 1:N
        for j = i+1:N  % Skip the diagonal!
            D(j, i) = vecnorm(X(:, i) - X(:, j));
        end
    end
    D = D + D.';
    

    I made three changes:

    1. When looping over a 2D matrix, the inner loop should always be the first index. This makes best use of the cache. Over X you loop correctly (indexing columns), but over D the order was reversed.
    2. Using vecnorm instead of norm. This is a relatively new function, and is faster because it's simpler (it does just one thing).
    3. Skipping the diagonal, which we know will be all zeros.