Is there a way to update a subset of parameters in DyNet? For instance, in the following toy example, first update h1, then h2:
model = ParameterCollection()
h1 = model.add_parameters((hidden_units, dims))
h2 = model.add_parameters((hidden_units, dims))
...
for x in trainset:
    ...
    loss.scalar_value()
    loss.backward()
    trainer.update(h1)
    renew_cg()
for x in trainset:
    ...
    loss.scalar_value()
    loss.backward()
    trainer.update(h2)
    renew_cg()
I know that the update_subset interface exists for this and works on the given parameter indexes, but it is not documented anywhere how to get those parameter indexes in the DyNet Python API.
A solution is to use the flag update=False when creating expressions for parameters (including lookup parameters):
import dynet as dy
import numpy as np

model = dy.Model()
pW = model.add_parameters((2, 4))
pb = model.add_parameters(2)
trainer = dy.SimpleSGDTrainer(model)

def step(update_b):
    dy.renew_cg()
    x = dy.inputTensor(np.ones(4))
    W = pW.expr()
    # update b?
    b = pb.expr(update=update_b)
    loss = dy.pickneglogsoftmax(W * x + b, 0)
    loss.backward()
    trainer.update()
    # dy.renew_cg()

print(pb.as_array())
print(pW.as_array())
step(True)
print(pb.as_array())  # b updated
print(pW.as_array())
step(False)
print(pb.as_array())  # b not updated
print(pW.as_array())
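For lookup parameters, the same idea applies through the update argument of dy.lookup; a minimal sketch along the same lines (the lookup table size and the index 3 are arbitrary choices for illustration):

import dynet as dy

model = dy.Model()
pE = model.add_lookup_parameters((10, 4))  # 10 embeddings of dimension 4
pW = model.add_parameters((2, 4))
trainer = dy.SimpleSGDTrainer(model)

def lookup_step(update_e):
    dy.renew_cg()
    # update=False freezes the looked-up embedding for this step
    e = dy.lookup(pE, 3, update=update_e)
    loss = dy.pickneglogsoftmax(pW.expr() * e, 0)
    loss.backward()
    trainer.update()

lookup_step(True)   # embedding row 3 is updated
lookup_step(False)  # embedding row 3 stays fixed; pW is still updated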
Regarding update_subset, I would guess that the indices are the integers suffixed at the end of the parameter names (.name()). In the doc, we are supposed to use a get_index function.
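A hedged sketch of that route, assuming (as the question and the docs suggest, not verified here) that update_subset takes a list of parameter indices plus a list of lookup-parameter indices, and that get_index() returns the index of a Parameters object:

import dynet as dy
import numpy as np

model = dy.ParameterCollection()
h1 = model.add_parameters((8, 4))
h2 = model.add_parameters((8, 4))
trainer = dy.SimpleSGDTrainer(model)

dy.renew_cg()
x = dy.inputTensor(np.ones(4))
loss = dy.squared_norm(h1.expr() * x + h2.expr() * x)
loss.backward()
# Assumed signature: first list holds parameter indices, second list lookup-parameter indices.
trainer.update_subset([h1.get_index()], [])  # only h1 is updated, h2 stays fixed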
Other options are dy.nobackprop(), which prevents the gradient from propagating beyond a certain node in the graph, and zeroing the gradient of the parameters that should not be updated (.scale_gradient(0)). These methods are equivalent to zeroing the gradient before the update. So, the parameter will still be updated if the optimizer uses its momentum from previous training steps (MomentumSGDTrainer, AdamTrainer, ...).
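And a minimal sketch of the dy.nobackprop() variant on the same two-parameter model as above; the gradient is blocked at W's expression, so with a momentum-free trainer only b moves:

import dynet as dy
import numpy as np

model = dy.Model()
pW = model.add_parameters((2, 4))
pb = model.add_parameters(2)
trainer = dy.SimpleSGDTrainer(model)  # no momentum, so a zero gradient really means no update

dy.renew_cg()
x = dy.inputTensor(np.ones(4))
W = dy.nobackprop(pW.expr())  # the gradient stops here, so pW gets a zero gradient
b = pb.expr()
loss = dy.pickneglogsoftmax(W * x + b, 0)
loss.backward()
trainer.update()  # only pb changes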