import numpy as np

a = np.arange(8).reshape(2, 2, 2)
b = np.arange(4).reshape(2, 2)
print(np.matmul(a, b))
The result is:
[[[ 2  3]
  [ 6 11]]

 [[10 19]
  [14 27]]]
I don't understand this result, can someone please explain it?
Short answer: it "broadcasts" the second 2D matrix to a 3D array, and then multiplies the submatrices pairwise: the i-th 2x2 submatrix of a is multiplied with the i-th 2x2 submatrix of the broadcast b, and the products become the submatrices of the result.
As the documentation on np.matmul [numpy-doc] says:
numpy.matmul(a, b, out=None)
Matrix product of two arrays.
The behavior depends on the arguments in the following way.
- If both arguments are 2-D they are multiplied like conventional matrices.
- If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
- If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.
- If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.
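The four cases quoted above can be checked by looking at the shapes np.matmul returns. This is only a small sketch; the arrays m, s and v are made up for illustration and are not part of the question.

```python
import numpy as np

m = np.arange(4).reshape(2, 2)      # 2-D matrix
s = np.arange(8).reshape(2, 2, 2)   # stack of two 2x2 matrices
v = np.array([1, 2])                # 1-D vector

shapes = {
    "2-D @ 2-D": np.matmul(m, m).shape,  # ordinary matrix product
    "3-D @ 2-D": np.matmul(s, m).shape,  # m is broadcast across the stack
    "1-D @ 2-D": np.matmul(v, m).shape,  # v acts as 1x2, then the 1 is removed
    "2-D @ 1-D": np.matmul(m, v).shape,  # v acts as 2x1, then the 1 is removed
}
for case, shape in shapes.items():
    print(case, shape)
```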
So here the second item applies. The second matrix is first "broadcast" to a 3D array as well, which means that we multiply:
array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])
with:
array([[[0, 1],
        [2, 3]],

       [[0, 1],
        [2, 3]]])
and we see these as stacked matrices. So first we multiply:
array([[0, 1],      array([[0, 1],
       [2, 3]])  x         [2, 3]])
which gives us:
array([[ 2,  3],
       [ 6, 11]])
and then the second pair of submatrices:
array([[4, 5],      array([[0, 1],
       [6, 7]])  x         [2, 3]])
and this gives us:
array([[10, 19],
       [14, 27]])
We then stack these two products to obtain the result:
>>> np.matmul(a, b)
array([[[ 2,  3],
        [ 6, 11]],

       [[10, 19],
        [14, 27]]])
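The walkthrough above can be reproduced by hand as a sanity check. This is a minimal sketch: it broadcasts b explicitly with np.broadcast_to, multiplies matching 2x2 slices, and stacks the products. The names b3 and manual are only illustrative.

```python
import numpy as np

a = np.arange(8).reshape(2, 2, 2)
b = np.arange(4).reshape(2, 2)

# Explicitly broadcast b to the shape of a (a read-only view, not a copy).
b3 = np.broadcast_to(b, a.shape)

# Multiply matching 2x2 slices and stack the products along a new first axis.
manual = np.stack([a[i] @ b3[i] for i in range(a.shape[0])])

print(np.array_equal(manual, np.matmul(a, b)))  # True
```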
Although the behavior is thus well defined, it is best to use this feature carefully: there are other sensible definitions of what a "matrix product" of a 3D array and a 2D matrix could mean, and those are not what np.matmul computes.
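To make that concrete, here is one possible alternative, chosen purely for illustration: contracting the first (stack) axis of a with the first axis of b using np.tensordot. It also returns a 2x2x2 array, but with different values than np.matmul. The name other is hypothetical.

```python
import numpy as np

a = np.arange(8).reshape(2, 2, 2)
b = np.arange(4).reshape(2, 2)

stacked = np.matmul(a, b)  # per-slice product: a[i] @ b

# A different contraction: other[j, k, l] = sum over i of a[i, j, k] * b[i, l]
other = np.tensordot(a, b, axes=([0], [0]))

print(stacked.shape, other.shape)
print(np.array_equal(stacked, other))  # False
```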