I'm migrating my AllenNLP model from classes to config, and there's one last construct I'm having problems with. I'm using a feedforward projection layer in my LSTM-CRF decoder, i.e.:
```python
# Imports assume allennlp >= 1.0 with the allennlp-models package installed;
# in older releases CrfTagger lives under allennlp.models instead.
from allennlp.modules import FeedForward
from allennlp.nn import Activation
from allennlp_models.tagging import CrfTagger

vocab_size = vocab.get_vocab_size("tokens")
feedforward = FeedForward(
    input_dim=encoder.get_output_dim(),
    num_layers=2,
    hidden_dims=[text_field_embedder.get_output_dim(), vocab_size],
    activations=[Activation.by_name("relu")(), Activation.by_name("linear")()],
    dropout=[0.15, 0.15],
)
model = CrfTagger(
    vocab=vocab,
    text_field_embedder=text_field_embedder,
    encoder=encoder,
    feedforward=feedforward,
)
```
The issue I'm running into is how to express the last hidden dim (`vocab_size`) in the JSON config, since it depends on the runtime value of `vocab.get_vocab_size("tokens")`.
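For concreteness, this is roughly the config I'd like to write (the surrounding keys follow the standard `crf_tagger` config; the concrete dims and the `???` placeholder are illustrative, since that value only exists after the vocabulary has been built):

```jsonnet
"model": {
    "type": "crf_tagger",
    // ... text_field_embedder and encoder config ...
    "feedforward": {
        "input_dim": 400,
        "num_layers": 2,
        // the second hidden dim should be vocab.get_vocab_size("tokens"),
        // but that value isn't known until the vocabulary is built
        "hidden_dims": [300, "???"],
        "activations": ["relu", "linear"],
        "dropout": [0.15, 0.15]
    }
}
```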
It seems that I need to either construct the `FeedForward` inside `CrfTagger` (so I have access to `vocab` at runtime) or create my own `FeedForward` subclass. I'm wondering if there's a cleaner way: is there a way I can register a constructor for `FeedForward` (essentially a factory function)?
Great question! From my exploration (and the investigation of another person on the team), it doesn't look like there's currently a clean way to dynamically create a `FeedForward` directly from a config file with `output_dim = vocab_size`. It seems like the best option is to create a `FeedForward` subclass that is lazily constructed with `vocab_size`.
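For what it's worth, one concrete way to get that laziness without touching `FeedForward` itself is to register a thin `CrfTagger` subclass that builds the projection in its own constructor, where `vocab` is already available (your first option above). A minimal sketch, assuming `allennlp-models` is installed; the type name `crf_tagger_with_projection` is made up for illustration:

```python
from allennlp.data import Vocabulary
from allennlp.models import Model
from allennlp.modules import FeedForward, Seq2SeqEncoder, TextFieldEmbedder
from allennlp.nn import Activation
from allennlp_models.tagging import CrfTagger


@Model.register("crf_tagger_with_projection")
class CrfTaggerWithProjection(CrfTagger):
    """A CrfTagger that builds its feedforward projection internally,
    so the last hidden dim can depend on the runtime vocab size."""

    def __init__(
        self,
        vocab: Vocabulary,
        text_field_embedder: TextFieldEmbedder,
        encoder: Seq2SeqEncoder,
        **kwargs,
    ) -> None:
        feedforward = FeedForward(
            input_dim=encoder.get_output_dim(),
            num_layers=2,
            hidden_dims=[
                text_field_embedder.get_output_dim(),
                vocab.get_vocab_size("tokens"),  # resolved at construction time
            ],
            activations=[Activation.by_name("relu")(), Activation.by_name("linear")()],
            dropout=[0.15, 0.15],
        )
        super().__init__(
            vocab=vocab,
            text_field_embedder=text_field_embedder,
            encoder=encoder,
            feedforward=feedforward,
            **kwargs,
        )
```

The model section of the config then drops the `feedforward` block entirely and uses `"type": "crf_tagger_with_projection"`; any remaining `CrfTagger` arguments (`label_encoding`, etc.) pass through `**kwargs` unchanged.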