I know how Bidirectional() works when return_sequences=True:
model.add(Bidirectional(LSTM(20, return_sequences=True)))
What confuses me is this: when return_sequences=False in the LSTM, there is no output at each timestep to be combined, so how does this line work?
model.add(Bidirectional(LSTM(20, return_sequences=False)))
The same thing appears, for example, in this code from the Keras team (note that LSTM has return_sequences=False by default):
model.add(Bidirectional(LSTM(20)))
My question is: is it correct to say that when return_sequences=False, Bidirectional() acts like this: the output of the forward pass of the LSTM + "a single time step" in the backward direction? That would effectively be a forward pass LSTM(x1...xn) + a single-step LSTM(xn). Am I right?
=========================================================
Update:
I think I found the answer, but I'm not sure. When return_sequences=False, there is no intermediate output at each timestep, so a "complete forward pass" + a "complete backward pass" should be combined, i.e. the last output of the forward pass + the last output of the backward pass: LSTM(x1...xn) + LSTM(xn...x1)
The answer in your update is correct.
When return_sequences=True, an output is generated for each timestep. So if the input sequence has 5 timesteps (the LSTM cell is unrolled 5 times), there will be 5 outputs in each direction, one per timestep.
When return_sequences=False, only the last output of the forward pass (located at timestep T-1, where T is the sequence length) and the last output of the backward pass (located at timestep 0, because the backward pass reads the sequence in reverse) are returned.
In both cases, the forward and backward outputs are merged according to the merge_mode argument of Bidirectional(), e.g. 'concat' (the default), 'sum', 'mul' or 'ave'.
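To make the two cases concrete, here is a minimal pure-Python sketch. It is not Keras code: run_rnn is a hypothetical one-unit tanh recurrence standing in for an LSTM, the weights are made-up constants, and bidirectional is an illustrative helper that only mimics how the wrapper pairs the two directions, leaving the actual merge (concat, sum, ...) out.

```python
import math

def run_rnn(xs, w_x, w_h):
    """Toy one-unit recurrent layer (a stand-in for an LSTM).

    Returns the output at every step, in the order the inputs were read.
    """
    h = 0.0
    outputs = []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        outputs.append(h)
    return outputs

def bidirectional(xs, return_sequences=False):
    """Pair a forward and a backward run (hypothetical helper, not the Keras API)."""
    # Each direction gets its own weights, as the two copies of the layer do in Keras.
    fwd = run_rnn(xs, w_x=0.5, w_h=0.8)        # reads x1 ... xn
    bwd = run_rnn(xs[::-1], w_x=0.3, w_h=0.6)  # reads xn ... x1

    if return_sequences:
        # Flip the backward outputs so index t refers to input timestep t.
        bwd_aligned = bwd[::-1]
        return list(zip(fwd, bwd_aligned))

    # Only the final state of each complete pass is kept.
    return (fwd[-1], bwd[-1])

xs = [1.0, 2.0, 3.0]
last = bidirectional(xs)                         # one (forward, backward) pair
seq = bidirectional(xs, return_sequences=True)   # one pair per timestep
```

Here last[0] equals seq[-1][0] (the forward output at timestep T-1) and last[1] equals seq[0][1] (the backward output at timestep 0). The backward value comes from a full pass over xn...x1, not from a single step on xn.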