I have a tensor p of shape (B, 3, N) in PyTorch:
# 2 batches, 3 channels (x, y, z), 5 points
p = torch.rand(2, 3, 5, requires_grad=True)
"""
p: tensor([[[0.8365, 0.0505, 0.4208, 0.7465, 0.6843],
[0.9922, 0.2684, 0.6898, 0.3983, 0.4227],
[0.3188, 0.2471, 0.9552, 0.5181, 0.6877]],
[[0.1079, 0.7694, 0.2194, 0.7801, 0.8043],
[0.8554, 0.3505, 0.4622, 0.0339, 0.7909],
[0.5806, 0.7593, 0.0193, 0.5191, 0.1589]]], requires_grad=True)
"""
And then another z_shift of shape [B, 1]:
z_shift = torch.tensor([[1.0], [10.0]], requires_grad=True)
"""
z_shift: tensor([[1.],
[10.]], requires_grad=True)
"""
I want to apply the appropriate z-shift to all points in each batch, leaving x and y unchanged:
"""
p: tensor([[[0.8365, 0.0505, 0.4208, 0.7465, 0.6843],
[0.9922, 0.2684, 0.6898, 0.3983, 0.4227],
[1.3188, 1.2471, 1.9552, 1.5181, 1.6877]],
[[0.1079, 0.7694, 0.2194, 0.7801, 0.8043],
[0.8554, 0.3505, 0.4622, 0.0339, 0.7909],
[10.5806, 10.7593, 10.0193, 10.5191, 10.1589]]])
"""
I managed to do it like this:
p[:, 2, :] += z_shift
for the case where requires_grad=False, but this fails inside the forward of my nn.Module (which I assume is equivalent to requires_grad=True) with:
RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place operation.
In PyTorch, tensors created directly by the user are called leaf tensors, and slicing a tensor returns a view that shares the same underlying storage. An in-place assignment on a view of a leaf that requires grad would modify the leaf's storage in the middle of the computational graph, which autograd cannot track correctly, so PyTorch raises this error. In-place writes to such views should therefore be avoided.
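For illustration, you can see the distinction with the tensors defined above:
print(p.is_leaf)           # True: p was created directly with requires_grad=True
print(p[:, 2, :].is_leaf)  # False: slicing produces a view of the leaf
# An in-place update on that view would mutate p's storage, hence the RuntimeError.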
To achieve this safely, replace in-place operations with an out-of-place approach. For example:
p_shifted = torch.stack([
    p[:, 0, :],            # x row, unchanged
    p[:, 1, :],            # y row, unchanged
    p[:, 2, :] + z_shift,  # z row, shifted; (B, N) + (B, 1) broadcasts over the points
], dim=1)                  # stack the rows back into shape (B, 3, N)
This builds a new tensor via torch.stack instead of modifying the original storage in place, so the computational graph stays intact and gradients flow back to both p and z_shift.
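If you prefer to keep the in-place style, a minimal alternative sketch (assuming the same shapes as above) is to clone p first; the clone is not a leaf, so autograd can track an in-place write to it:
p_shifted = p.clone()          # non-leaf copy, safe to modify in place
p_shifted[:, 2, :] += z_shift  # (B, N) += (B, 1) broadcasts over the points
Both versions produce the same values and the same gradients; which one you use is mostly a matter of readability.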