import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)
ans = np.tensordot(arr1, arr2, axes=([1], [0]))
ans2 = np.tensordot(arr1, arr2, axes=([0], [1]))
ans3 = np.tensordot(arr1, arr2, axes=([1, 0], [0, 1]))
I am trying to understand how this tensordot function works. I know that it returns the tensordot product, but the axes part is difficult for me to comprehend. What I have observed: for ans, it is like the number of columns in arr1 and the number of rows in arr2 make the final matrix; for ans2 it is the other way around, the number of columns in arr2 and the number of rows in arr1. I don't understand axes=([1,0],[0,1]). Let me know if my understanding for ans and ans2 is correct.
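A quick shape check (my own sketch, not part of the original code) makes the pattern visible: the axes named in each list are summed away, and the remaining axes form the output shape.

```python
import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)

# axes=([1],[0]): arr1's size-2 axis pairs with arr2's size-2 axis;
# the leftover axes (4 and 4) form the result
print(np.tensordot(arr1, arr2, axes=([1], [0])).shape)  # (4, 4)

# axes=([0],[1]): the size-4 axes pair up, leaving (2, 2)
print(np.tensordot(arr1, arr2, axes=([0], [1])).shape)  # (2, 2)
```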
You forgot to show the arrays:
In [87]: arr1
Out[87]:
array([[0, 1],
[2, 3],
[4, 5],
[6, 7]])
In [88]: arr2
Out[88]:
array([[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [89]: ans
Out[89]:
array([[ 8, 9, 10, 11],
[ 32, 37, 42, 47],
[ 56, 65, 74, 83],
[ 80, 93, 106, 119]])
In [90]: ans2
Out[90]:
array([[ 76, 124],
[ 98, 162]])
In [91]: ans3
Out[91]: array(238)
ans is just the regular dot, matrix product:
In [92]: np.dot(arr1,arr2)
Out[92]:
array([[ 8, 9, 10, 11],
[ 32, 37, 42, 47],
[ 56, 65, 74, 83],
[ 80, 93, 106, 119]])
The dot sum-of-products is performed on axes ([1],[0]): axis 1 of arr1 and axis 0 of arr2 (the conventional across-the-columns, down-the-rows pairing). With 2d arrays the 'sum across ...' phrasing can be confusing; it's clearer when dealing with 1d or 3d arrays. Here the matching size-2 dimensions are summed, leaving the (4,4).
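To make the pairing concrete (a sketch using the arrays above), compute one output element by hand: ans[i, k] sums arr1 along axis 1 against arr2 along axis 0.

```python
import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)
ans = np.tensordot(arr1, arr2, axes=([1], [0]))

# one element by hand: sum over the paired size-2 axes
manual = sum(arr1[1, j] * arr2[j, 2] for j in range(2))
print(manual, ans[1, 2])  # both 42
```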
ans2 reverses them, summing on the 4's, producing a (2,2):
In [94]: np.dot(arr2,arr1)
Out[94]:
array([[ 76, 98],
[124, 162]])
tensordot has just transposed the 2 arrays and performed a regular dot:
In [95]: np.dot(arr1.T,arr2.T)
Out[95]:
array([[ 76, 124],
[ 98, 162]])
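The two dot expressions differ only by a transpose; a quick check (my own, assuming the arrays defined earlier) ties them together:

```python
import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)
ans2 = np.tensordot(arr1, arr2, axes=([0], [1]))

# ans2 matches dot of the transposed arrays ...
print(np.array_equal(ans2, np.dot(arr1.T, arr2.T)))  # True
# ... and is the transpose of dot(arr2, arr1)
print(np.array_equal(ans2, np.dot(arr2, arr1).T))    # True
```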
ans3 uses a transpose and reshape (ravel) to sum on both axes:
In [98]: np.dot(arr1.ravel(),arr2.T.ravel())
Out[98]: 238
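Equivalently (a sketch of my own), the double contraction is just an element-wise multiply-and-sum once the axes are lined up: axes=([1,0],[0,1]) pairs arr1[i,j] with arr2[j,i].

```python
import numpy as np

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)

# transpose arr2 so matching elements line up, multiply, sum everything
print((arr1 * arr2.T).sum())  # 238
```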
In general, tensordot uses a mix of transpose and reshape to reduce the problem to a 2d np.dot problem. It may then reshape and transpose the result.
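A minimal sketch of that reduction for a single contracted axis (tensordot_via_dot is a hypothetical helper of mine, not NumPy API, and is not tensordot's actual implementation):

```python
import numpy as np

def tensordot_via_dot(a, b, a_axis, b_axis):
    # Sketch of the strategy: move the summed axis of `a` to the end
    # and the summed axis of `b` to the front, flatten the rest to 2d,
    # hand the work to np.dot, then restore the leftover axes.
    a2 = np.moveaxis(a, a_axis, -1)
    b2 = np.moveaxis(b, b_axis, 0)
    out = np.dot(a2.reshape(-1, a2.shape[-1]), b2.reshape(b2.shape[0], -1))
    return out.reshape(a2.shape[:-1] + b2.shape[1:])

arr1 = np.arange(8).reshape(4, 2)
arr2 = np.arange(4, 12).reshape(2, 4)
print(np.array_equal(tensordot_via_dot(arr1, arr2, 1, 0),
                     np.tensordot(arr1, arr2, axes=([1], [0]))))  # True
```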
I find the dimensions control of einsum to be clearer:
In [99]: np.einsum('ij,jk->ik',arr1,arr2)
Out[99]:
array([[ 8, 9, 10, 11],
[ 32, 37, 42, 47],
[ 56, 65, 74, 83],
[ 80, 93, 106, 119]])
In [100]: np.einsum('ji,kj->ik',arr1,arr2)
Out[100]:
array([[ 76, 124],
[ 98, 162]])
In [101]: np.einsum('ij,ji',arr1,arr2)
Out[101]: 238
With the development of einsum and matmul/@, tensordot has become less necessary. It's harder to understand, and doesn't have any speed or flexibility advantages. Don't worry about understanding it.
ans3 is the trace (sum of the diagonal) of the other two results, ans and ans2:
In [103]: np.trace(ans)
Out[103]: 238
In [104]: np.trace(ans2)
Out[104]: 238