I have several pairs of vectors (arranged as two matrices) and I want to compute the vector of their pairwise correlation coefficients (or, better yet, the angles between them, but since the correlation coefficient is the cosine of that angle, I am using numpy.corrcoef):
np.array([np.corrcoef(m1[:, i], m2[:, i])[0, 1]
          for i in range(m1.shape[1])])
I wonder if there is a way to "vectorize" this, i.e., avoid calling corrcoef several times.
Instead of using np.corrcoef, you can write your own function that does the same thing. The calculation for the correlation coefficient of two vectors is quite simple:
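For two length-n vectors x and y with means x̄ and ȳ, this is the standard Pearson formula, which is also what the function below computes. The symbols are notation assumed here for illustration:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```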
Applying that here:
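import numpy as np

def vec_corrcoef(X, Y, axis=1):
    # Mean of each vector along `axis`; keepdims=True lets it broadcast against X and Y
    Xm = np.mean(X, axis=axis, keepdims=True)
    Ym = np.mean(Y, axis=axis, keepdims=True)
    # Numerator: sum of the products of the deviations from the mean
    N = np.sum((X - Xm) * (Y - Ym), axis=axis)
    # Denominator: square root of the product of the summed squared deviations
    D = np.sqrt(np.sum((X - Xm)**2, axis=axis) * np.sum((Y - Ym)**2, axis=axis))
    return N / D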
To test:
m1 = np.random.random((100, 10))
m2 = np.random.random(m1.shape)
a = vec_corrcoef(m1, m2)
b = [np.corrcoef(v1, v2)[0, 1] for v1, v2 in zip(m1, m2)]
print(np.allclose(a, b)) # True
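Note that the test above pairs up rows (the default axis=1), while the loop in the question pairs up columns. A minimal sketch of the column-wise case, assuming the same random test data, would pass axis=0 and then convert to angles with np.arccos. The function is repeated only so the snippet runs on its own; the seed and the np.clip guard are my additions. Strictly speaking, the result is the angle between the mean-centered vectors.

```python
import numpy as np

def vec_corrcoef(X, Y, axis=1):
    # Same computation as above, repeated so this snippet is self-contained.
    Xc = X - np.mean(X, axis=axis, keepdims=True)
    Yc = Y - np.mean(Y, axis=axis, keepdims=True)
    num = np.sum(Xc * Yc, axis=axis)
    den = np.sqrt(np.sum(Xc**2, axis=axis) * np.sum(Yc**2, axis=axis))
    return num / den

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
m1 = rng.random((100, 10))
m2 = rng.random(m1.shape)

# The pairs are columns, as in the question, so reduce over axis 0.
r = vec_corrcoef(m1, m2, axis=0)

# Reference result from the original per-column loop.
loop = np.array([np.corrcoef(m1[:, i], m2[:, i])[0, 1]
                 for i in range(m1.shape[1])])
print(np.allclose(r, loop))

# Angles in radians; clipping guards against rounding pushing |r| just past 1.
angles = np.arccos(np.clip(r, -1.0, 1.0))
print(angles.shape)
```

If any vector is constant, the denominator is zero and the corresponding entry comes out as nan, which matches what np.corrcoef returns in that case.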