I am confused about the difference between multiplying two tensors using * and using matmul().
Below is my code:
import torch
torch.manual_seed(7)
features = torch.randn((2, 5))
weights = torch.randn_like(features)
Here, I want to multiply weights and features. One way to do it is as follows:
print(torch.sum(features * weights))
Output:
tensor(-2.6123)
Another way to do it is using matmul():
print(torch.matmul(features,weights.view((5,2))))
But here the output is:
tensor([[ 2.8089, 4.6439],
[-2.3988, -1.9238]])
What I don't understand is why matmul() and the usual multiplication give different outputs, when I expected both to be the same. Am I doing anything wrong here?
Edit: When I use features of shape (1,5), the outputs of * and matmul are the same, but they are not the same when the shape is (2,5).
When you use *, the multiplication is elementwise; when you use torch.mm, it is matrix multiplication.
Example:
a = torch.rand(2,5)
b = torch.rand(2,5)
result = a*b
Here result has the same shape as a or b, i.e. (2,5). Now consider the operation
result = torch.mm(a,b)
It gives a size mismatch error, because this is proper matrix multiplication (as studied in linear algebra) and a.shape[1] != b.shape[0]. When you apply the view operation to an argument of torch.mm, you are reshaping that operand so that the inner dimensions match.
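To make the shape behaviour concrete, here is a minimal sketch using the same a and b as above. The seed and the comparison with b.t() are my additions for illustration; they are not part of the original code. Note that view only reinterprets the same ten values in row-major order, so b.view(5, 2) is not the transpose of b.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
a = torch.rand(2, 5)
b = torch.rand(2, 5)

# Elementwise product: the result keeps the operands' shape.
elementwise = a * b
print(elementwise.shape)  # torch.Size([2, 5])

# Matrix product of (2, 5) by (2, 5) is undefined, so torch.mm raises.
try:
    torch.mm(a, b)
    mm_failed = False
except RuntimeError as err:
    mm_failed = True
    print("torch.mm failed:", err)

# Making the inner dimensions agree: (2, 5) @ (5, 2) -> (2, 2).
via_view = torch.mm(a, b.view(5, 2))  # same values, re-read row by row
via_transpose = torch.mm(a, b.t())    # rows and columns swapped
print(via_view.shape, via_transpose.shape)

# view and transpose arrange the elements differently.
same_layout = torch.equal(b.view(5, 2), b.t())
print(same_layout)
```

Both calls produce a (2,2) result, but in general the values differ, because view and t() put different elements in each position.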
In the special case where the first dimension has size 1, e.g. shape (1,5), the matrix product reduces to a dot product, and hence sum(a*b) has the same value as mm(a, b.view(5,1)).
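The special case can be checked numerically with a short sketch. The second half goes slightly beyond the answer above: it uses the standard linear algebra identity that the sum of an elementwise product equals the trace of A times B transposed, which is how the two operations relate for shape (2,5). The names A and B and the seed are chosen here for illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# Shape (1, 5): the matrix product is a single dot product.
a = torch.rand(1, 5)
b = torch.rand(1, 5)
elementwise_sum = torch.sum(a * b)          # 0-dim tensor
matrix_product = torch.mm(a, b.view(5, 1))  # shape (1, 1)
dot_case_matches = torch.allclose(elementwise_sum, matrix_product.squeeze())
print(dot_case_matches)

# Shape (2, 5): sum(A * B) equals the trace of A @ B^T, i.e. the sum of the
# diagonal entries of the (2, 2) matrix product, not the whole matrix.
A = torch.rand(2, 5)
B = torch.rand(2, 5)
product = torch.mm(A, B.t())
trace_case_matches = torch.allclose(torch.sum(A * B), torch.trace(product))
print(product.shape, trace_case_matches)
```

For shape (1,5) the (1,1) product has a single diagonal entry, which is why the two results coincide there.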