I am learning PyTorch and deep learning. The documentation for torch.max confuses me: it looks like we can compare two tensors with it, but I don't see where in the documentation I could have determined this.
I started with this code, where I wanted to compute ReLU values by taking the maximum against 0. I thought that 0 could be broadcast against h1, where h1.shape = torch.Size([10000, 128]).
h1 = torch.max(h1, 0)
y = h1 @ W2 + b2
However, I got this error:
TypeError: unsupported operand type(s) for @: 'torch.return_types.max' and 'Tensor'
I managed to fix this by changing the max call to use a tensor instead of 0.
h1 = torch.max(h1, torch.tensor(0))
y = h1 @ W2 + b2
1. Why does this fix the error?
This is when I checked the documentation again and realized that nothing mentions a collection like a tuple or list for passing multiple tensors, or even a *input parameter for iterable unpacking.
Here are the 2 versions:
1st version of torch.max:
torch.max(input) → Tensor
Returns the maximum value of all elements in the input tensor.
Warning
This function produces deterministic (sub)gradients unlike max(dim=0)
2nd version of torch.max
torch.max(input, dim, keepdim=False, *, out=None)
Returns a namedtuple (values, indices) where values is the maximum value of each row of the input tensor in the given dimension dim. And indices is the index location of each maximum value found (argmax).
If keepdim is True, the output tensors are of the same size as input except in the dimension dim where they are of size 1. Otherwise, dim is squeezed (see torch.squeeze()), resulting in the output tensors having 1 fewer dimension than input.
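A small example (not from the question, tensor values are made up) illustrating the two documented overloads quoted above:

```python
import torch

x = torch.tensor([[1.0, 5.0, 3.0],
                  [4.0, 2.0, 6.0]])

# 1st version: torch.max(input) -> a single 0-dim Tensor holding the
# global maximum over all elements.
global_max = torch.max(x)
print(global_max)        # tensor(6.)

# 2nd version: torch.max(input, dim) -> a namedtuple (values, indices)
# with the per-row maxima along the given dimension.
result = torch.max(x, dim=1)
print(result.values)     # tensor([5., 6.])
print(result.indices)    # tensor([1, 2])
```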
2. What is tensor(0) according to this documentation?
Check the bottom of the documentation page you linked where it says:
torch.max(input, other, *, out=None) → Tensor
See torch.maximum().
When the second argument is a Tensor, torch.max computes torch.maximum(input, other): the elementwise maximum of two broadcastable tensors, returning a Tensor. A plain Python int like 0, on the other hand, matches the (input, dim) overload, which returns the namedtuple (values, indices) — and that namedtuple is what caused the TypeError when it reached the @ operator.
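A minimal sketch of the difference, using a small made-up h1 (shape reduced from the question's [10000, 128] for readability):

```python
import torch

h1 = torch.tensor([[-1.0, 2.0],
                   [ 3.0, -4.0]])

# Plain int 0 matches the (input, dim) overload: dim=0, so the result
# is a namedtuple (values, indices), not a Tensor — hence the
# TypeError when it reaches `@`.
out = torch.max(h1, 0)
print(type(out))         # <class 'torch.return_types.max'>

# A scalar tensor matches the (input, other) overload, i.e.
# torch.maximum: an elementwise max with broadcasting, which is
# exactly the ReLU the question wanted.
relu = torch.max(h1, torch.tensor(0.0))
print(relu)              # tensor([[0., 2.], [3., 0.]])
```

For ReLU specifically, torch.relu(h1) or torch.clamp(h1, min=0) compute the same thing and avoid the overload ambiguity entirely.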