I'm trying to get used to PyTorch syntax, so I wrote a simple linear regression program.
# weights and biases
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)
print(w.dtype, b.dtype)

# a linear function
def model(x):
    return x @ w.t() + b

# loss function
def mse(t1, t2):
    diff = (t1 - t2) ** 2
    return torch.mean(diff)

# do this in a loop
for _ in range(100):
    preds = model(inputs_t)
    loss = mse(targets_t, preds)
    loss.backward()
    # print(loss)
    print(w.grad, b.grad)
    print("==============")
    with torch.no_grad():
        w = w - w.grad * 1e-5
        b = b - b.grad * 1e-5
        w.grad = None
        b.grad = None
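(inputs_t and targets_t are just my training data; for a reproducible example, placeholders like these, with shapes inferred from w and b, would do:)

inputs_t = torch.randn(5, 3)     # 5 samples, 3 input features
targets_t = torch.randn(5, 2)    # 2 target values per sample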
I'm manually clearing the gradients of each variable (setting .grad to None) because I read that PyTorch accumulates past gradients in the .grad attribute. But I get this error:
element 0 of tensors does not require grad and does not have a grad_fn
This is the output; it runs for one iteration of the loop and then stops:
tensor([[  4958.0928,   6618.7686,   7047.0444],
        [-21244.5488, -32400.4414, -35345.9766]]) tensor([  58.3144, -298.9364])
==============
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-122-eef55c1b4837> in <cell line: 16>()
20 # b.grad.zero_()
21 loss = mse(targets_t, preds)
---> 22 loss.backward()
23 # print(loss)
24 print(w.grad, b.grad)
2 frames
/usr/local/lib/python3.10/dist-packages/torch/autograd/graph.py in _engine_run_backward(t_outputs, *args, **kwargs)
742 unregister_hooks = _register_logging_hooks_on_whole_graph(t_outputs)
743 try:
--> 744 return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
745 t_outputs, *args, **kwargs
746 ) # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
What should I change in this code to remove the error?

I tried searching for the error, but everyone else is either working on a more complex problem or hitting a version issue, and I don't really know how to map those posts onto my case since I'm using whatever PyTorch version Google Colab ships by default.
When you compute:

with torch.no_grad():
    w = w - w.grad * 1e-5
    b = b - b.grad * 1e-5
    w.grad = None
    b.grad = None

you are creating a new object w that is different from the old object w. The new w is created under torch.no_grad(), so it has requires_grad=False and no grad_fn. On the next iteration the loss is built from tensors that do not require grad, which is exactly the error you see.
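You can see this directly. A minimal sketch (assuming w already has a populated .grad, as in your loop):

import torch

w = torch.randn(2, 3, requires_grad=True)
loss = (w ** 2).sum()
loss.backward()              # populates w.grad

with torch.no_grad():
    w = w - w.grad * 1e-5    # rebinds the name w to a brand-new tensor

print(w.requires_grad)       # False: the rebound w no longer tracks gradients,
                             # so the next loss.backward() raises the error above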
The solution is to update w and b with an in-place operation. There are several ways to do this:
with torch.no_grad():
    for param in [w, b]:
        # Pick exactly one of the three options below:

        # Option 1: in-place subtraction
        param.sub_(param.grad * 1e-2)

        # Option 2: assign new values to the .data attribute
        # param_new = param - param.grad * 1e-2
        # param.data = param_new.data

        # Option 3: use an in-place copy
        # param_new = param - param.grad * 1e-2
        # param.copy_(param_new)

        # Finally, set grad to None
        param.grad = None
Any one of these options will work: they all update the existing w and b objects instead of rebinding the names to new tensors, so both stay leaf tensors with requires_grad=True and the next loss.backward() can populate their .grad again.
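Putting it together, here is a minimal corrected version of the loop from the question, using option 1 (the input and target tensors are placeholders with assumed shapes):

import torch

inputs_t = torch.randn(5, 3)    # placeholder data, shapes assumed
targets_t = torch.randn(5, 2)

w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)

def model(x):
    return x @ w.t() + b

def mse(t1, t2):
    return torch.mean((t1 - t2) ** 2)

for _ in range(100):
    preds = model(inputs_t)
    loss = mse(targets_t, preds)
    loss.backward()
    with torch.no_grad():
        w.sub_(w.grad * 1e-5)   # in-place update keeps w the same leaf tensor
        b.sub_(b.grad * 1e-5)
        w.grad = None           # clear the accumulated gradients
        b.grad = None

print(loss.item())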