I want to enforce symmetry in my model, i.e. f(x,y) = f(y,x), and would like to achieve it by defining two models model1(x,y) and model2(y,x) that have the same architecture and parameters. I could share their parameters with
for name, param1 in model1.named_parameters():
    if name in model2.state_dict() and param1.shape == model2.state_dict()[name].shape:
        model2.state_dict()[name].data = param1.data
Does it make sense to share parameters this way? Do I have to define two separate optimizers, one for each model? Or does

optimizer = optim.Adam(list(model1.parameters()) + list(model2.parameters()))  # combine parameters

make sense, given that the models share parameters, assuming one of my losses is

criterion(model1(x,y), model2(y,x))
I would appreciate a minimal working example supposing my models have dynamic layer configurations.
What is not clear in your question is whether you want symmetry on one model ("enforce symmetry in my model") or between two (implied when you introduce model1 and model2).
If you want to enforce symmetry on a single model, you can minimize some distance between model(x,y) and model(y,x). In that case you only need one model and one optimizer, and you enforce the property directly on model rather than on two distinct models.
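For illustration, here is a minimal sketch of that single-model approach. The MLP class, the hidden_sizes argument (my stand-in for your "dynamic layer configurations"), the penalty weight lam and the dummy data are all placeholders, not taken from your code:

```python
import torch
import torch.nn as nn
import torch.optim as optim

class MLP(nn.Module):
    # hidden_sizes plays the role of a "dynamic layer configuration"
    def __init__(self, hidden_sizes=(32, 32)):
        super().__init__()
        layers, in_dim = [], 2          # the input is the pair (x, y)
        for h in hidden_sizes:
            layers += [nn.Linear(in_dim, h), nn.ReLU()]
            in_dim = h
        layers.append(nn.Linear(in_dim, 1))
        self.net = nn.Sequential(*layers)

    def forward(self, x, y):
        return self.net(torch.stack([x, y], dim=-1))

model = MLP(hidden_sizes=(64, 64))
optimizer = optim.Adam(model.parameters())
criterion = nn.MSELoss()

# dummy data just to make the example runnable
x = torch.randn(128)
y = torch.randn(128)
target = torch.randn(128, 1)

lam = 1.0  # weight of the symmetry penalty, to be tuned
for step in range(100):
    optimizer.zero_grad()
    out_xy = model(x, y)
    out_yx = model(y, x)
    # task loss on one orientation + penalty pushing f(x,y) towards f(y,x)
    loss = criterion(out_xy, target) + lam * criterion(out_xy, out_yx)
    loss.backward()
    optimizer.step()
```

The lam * criterion(out_xy, out_yx) term is the "distance between model(x,y) and model(y,x)" mentioned above; note that it only encourages symmetry, it does not guarantee it exactly.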
Besides, I don't clearly see what you would gain by having model1 and model2 distinct: in what sense would model1 and model2 be "symmetric"? A model2 defined as x,y: model1(y,x) would suffice too, no? Maybe there is more to it and you indeed want two models?
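If you do prefer to have a model2 object to plug into your criterion, a minimal sketch under the same assumptions (reusing MLP, criterion, x and y from the previous snippet) could be:

```python
model1 = MLP(hidden_sizes=(64, 64))
# model2 is not a second network, just model1 with its arguments swapped,
# so the two "models" share parameters by construction
model2 = lambda a, b: model1(b, a)

# one optimizer over model1 is enough: model2 has no parameters of its own
optimizer = optim.Adam(model1.parameters())

# note: with this wrapper, model2(y, x) is literally model1(x, y), so the
# comparison you want is model1(x, y) vs model2(x, y) (i.e. model1(y, x))
sym_loss = criterion(model1(x, y), model2(x, y))
```

With this construction there is nothing to copy between state_dicts and no parameter lists to concatenate for the optimizer.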