Suppose I have two arrays:
import numpy as np
a=np.array([[1,2],
[3,4]])
b=np.array([[1,2],
[3,4]])
and I want to multiply the arrays element-wise and then sum the elements, i.e. 1*1 + 2*2 + 3*3 + 4*4 = 30. I can use:
np.tensordot(a, b, axes=((-2,-1),(-2,-1)))
>>> array(30)
Now, suppose arrays a and b are 2-by-2-by-2 arrays:
a=np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]])
b=np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]])
and I want to do the same operation for each group, i.e. multiply [[1,2],[3,4]] element-wise by [[1,2],[3,4]] and sum the elements, and do the same with [[5,6],[7,8]]. The result should be array([ 30, 174]), where 30 = 1*1 + 2*2 + 3*3 + 4*4 and 174 = 5*5 + 6*6 + 7*7 + 8*8. Is there a way to do that using tensordot?
P.S.
I understand in this case you can simply use sum or einsum:
np.sum(a*b,axis=(-2,-1))
>>> array([ 30, 174])
np.einsum('ijk,ijk->i',a,b)
>>> array([ 30, 174])
but this is merely a simplified example; I need to use tensordot because it's faster.
Thanks for any help!!
You can use np.diag(np.tensordot(a, b, axes=((1, 2), (1, 2)))) to get the result you want. However, using np.tensordot or a matrix multiplication is not a good idea in your case, because they do much more work than needed: tensordot contracts every group of a with every group of b, and only the diagonal of that result is useful here. Their efficient implementation does not make up for the extra computation. np.einsum('ijk,ijk->i', a, b) does not compute more than needed in your case. You can try optimize=True or even optimize='optimal', since the optimize parameter is set to False by default. If this is not fast enough, you can try NumExpr to compute np.sum(a*b, axis=(1, 2)) more efficiently (probably in parallel). Alternatively, you can use Numba or Cython. Both support fast parallel loops.
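To make the "more work than needed" point concrete, here is a small sketch using the arrays from the question. The variable names (full, via_tensordot, via_einsum, via_sum) are only illustrative. It shows that tensordot builds a full N-by-N intermediate, of which only the diagonal is kept, while einsum and sum produce the N results directly.

```python
import numpy as np

# Same 2-by-2-by-2 arrays as in the question.
a = np.arange(1, 9).reshape(2, 2, 2)
b = a.copy()

# tensordot contracts the last two axes for every pair of groups (i, j),
# so the intermediate has shape (N, N); here N = 2.
full = np.tensordot(a, b, axes=((1, 2), (1, 2)))
print(full.shape)          # (2, 2)

# Only the diagonal (group i paired with group i) is wanted.
via_tensordot = np.diag(full)

# These compute just the N per-group sums.
via_einsum = np.einsum('ijk,ijk->i', a, b)
via_sum = np.sum(a * b, axis=(1, 2))

print(via_tensordot)       # [ 30 174]
print(via_einsum)          # [ 30 174]
print(via_sum)             # [ 30 174]
```

For N groups the tensordot route computes N*N contractions and discards all but N of them, so the wasted work grows with the number of groups. Whether it is still faster for a given shape is something to measure on real data.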