I am trying to follow this guide to implement a seq2seq machine translation model: https://www.tensorflow.org/tutorials/text/nmt_with_attention

The tutorial's Encoder has an initialize_hidden_state() function that generates an all-zero initial state for the encoder. However, I am a bit confused as to why this is necessary. As far as I can tell, every time the encoder is called (in train_step and evaluate), it is passed a state generated by initialize_hidden_state(). My questions are:

1.) What is the purpose of this initial state? Doesn't a Keras layer automatically initialize the LSTM states to begin with?
2.) Why not always just initialize the encoder with all-zero hidden states, if the encoder is always called with initial states generated by initialize_hidden_state()?
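
You are totally right. The code in the example is a little misleading. Keras recurrent layers such as LSTM and GRU automatically start from an all-zero state when no initial_state is passed, so you can just delete the initialize_hidden_state() function and call the encoder without an explicit initial state.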
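
If you want to check this yourself, here is a minimal sketch. It is not the tutorial's code: it uses a standalone tf.keras.layers.GRU (I believe the tutorial's encoder wraps a GRU, but treat that as an assumption), and the shapes and variable names are made up for illustration. It runs the same layer once without an initial state and once with explicit zeros, then compares the results.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes, chosen only for this illustration.
batch_size, timesteps, features, units = 4, 7, 5, 8

# A single recurrent layer that returns both the full sequence and the final state.
gru = tf.keras.layers.GRU(units, return_sequences=True, return_state=True)

# Random input standing in for a batch of embedded tokens.
x = tf.random.normal((batch_size, timesteps, features))

# Call 1: no initial state passed, so Keras picks its default.
out_default, state_default = gru(x)

# Call 2: explicit all-zero state, like the one initialize_hidden_state() produces.
zeros = tf.zeros((batch_size, units))
out_zeros, state_zeros = gru(x, initial_state=[zeros])

# Compare the two runs numerically.
outputs_match = np.allclose(out_default.numpy(), out_zeros.numpy(), atol=1e-5)
states_match = np.allclose(state_default.numpy(), state_zeros.numpy(), atol=1e-5)
print("outputs match:", outputs_match)
print("states match:", states_match)
```

Both comparisons should come out true, which means passing the zero state by hand changes nothing. An explicit initial_state only matters when you want something other than zeros, for example the decoder being started from the encoder's final state.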