Tags: matlab, matrix, pdist

MATLAB: using my own distance function with pdist


I have a simple function to calculate the distance between two vectors, such that distance = dot product / sum of elements in the two vectors.

  function d = simpleDistance(a,b)
    d = dot(a,b)/ (sum(a) + sum(b));
  end

For example: simpleDistance([1 2], [3 4]) = (1*3 + 2*4) / ((1 + 2) + (3 + 4)) = 11/10 = 1.1

Given this small matrix r, I want to calculate the similarity between every pair of rows in r (distance = simpleDistance):

r =
 1     2
 5     0
 3     4

Instead of two nested loops, I want to use the pdist function because it is WAY FASTER!

n = size(r,1);
dist = squareform(pdist(r,@simpleDistance)); % distance matrix
dist(1:n+1:end) = inf; % self-distance doesn't count

However, I get this error:

Error using pdist (line 373)
Error evaluating distance function 'simpleDistance'.

Caused by:
    Error using dot (line 34)
    A and B must be same size.

For the matrix r above, I expect the dist matrix to be:

dist =    
          Inf        0.625          1.1
        0.625          Inf         1.25
          1.1         1.25          Inf

Note: after looping or filling the matrix, I fill the diagonal values with inf as I don't care about the distance from a row to itself.
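
For reference, the nested-loop baseline that pdist is meant to replace might look like the sketch below. It assumes the two-vector simpleDistance from above (written here as an anonymous function so the snippet runs on its own) and visits each pair of rows once.

```matlab
% Two-vector distance as defined in the question
simpleDistance = @(a, b) dot(a, b) / (sum(a) + sum(b));

r = [1 2; 5 0; 3 4];
n = size(r, 1);

% Start from an all-Inf matrix so the diagonal is already Inf
dist = inf(n);
for i = 1:n-1
    for j = i+1:n
        dist(i, j) = simpleDistance(r(i, :), r(j, :));
        dist(j, i) = dist(i, j);   % the measure is symmetric
    end
end
disp(dist)
```

This reproduces the expected matrix above and can serve as a check for any vectorized version.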


Solution

  • According to the pdist documentation, the function you pass to pdist must take

    as arguments a 1-by-n vector XI, corresponding to a single row of X, and an m2-by-n matrix XJ, corresponding to multiple rows of X. distfun must accept a matrix XJ with an arbitrary number of rows. distfun must return an m2-by-1 vector of distances d2, whose kth element is the distance between XI and XJ(k,:)

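    So dot(a,b) fails because b is generally a matrix with several rows, not a single row the same size as a. Vectorize the computation over the rows of b: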

    d = sum(bsxfun(@times,a,b),2) ./ (sum(a,2) + sum(b,2));
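
Putting it together, a minimal end-to-end sketch could look like this. It assumes the Statistics and Machine Learning Toolbox is available (for pdist and squareform) and defines the distance as an anonymous function handle so the snippet is self-contained; a function file with the same body works equally well.

```matlab
% a is a 1-by-n row of r, b is an m2-by-n block of rows of r.
% The result is an m2-by-1 column: one distance per row of b.
simpleDistance = @(a, b) sum(bsxfun(@times, a, b), 2) ./ (sum(a, 2) + sum(b, 2));

r = [1 2; 5 0; 3 4];
n = size(r, 1);

dist = squareform(pdist(r, simpleDistance)); % full symmetric distance matrix
dist(1:n+1:end) = inf;                       % self-distance doesn't count
disp(dist)
```

For the matrix r in the question this gives 0.625, 1.1 and 1.25 for the three row pairs, matching the expected result. On R2016b and later, implicit expansion lets you write sum(a .* b, 2) in place of the bsxfun call.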