Tags: keras, lstm, deep-residual-networks

Residual LSTM layers


I have trouble understanding tensor behaviour in LSTM layers in keras.

I have preprocessed numeric data that looks like [samples, time steps, features]: 10,000 samples, 24 time steps, and 10 predictors.

I want to stack LSTM layers with residual connections, but I am not sure I am doing it right:

    x <- layer_input(shape = c(24, 10))
    x <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)

Now the shape of x, which is a tensor, is [?, ?, 32]. I was expecting [?, 32, 10]. Should I reshape the data to be [samples, features, time steps]? Then I form res:

    y <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
    res <- layer_add(c(x, y))

Now I am not sure if this is correct, or whether I should go with this instead:

    x <- layer_input(shape = c(24, 10))
    y <- layer_lstm(x, units = 24, activation = "tanh", return_sequences = TRUE)  # units same as time_steps
    res <- layer_add(c(x, y))  # perhaps data reshaping is necessary here?

Any insight is much appreciated.

JJ


Solution

  • An LSTM layer with return_sequences = TRUE returns dims as (?, seq_length, out_dims), where out_dims is your units value. So the overall dims will be as follows (a sketch for your second variant follows this block):

    x <- layer_input(shape = c(24, 10))
    # dims of x: (?, 24, 10)
    x <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
    # dims of x after the LSTM layer: (?, 24, 32)

    y <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
    # dims of y: (?, 24, 32)
    res <- layer_add(c(x, y))
    # dims of res: (?, 24, 32) -- it is the addition of the outputs of both LSTM layers
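    As for your second variant: layer_add needs matching shapes, and there x is (?, 24, 10) while y is (?, 24, 24), so the addition would fail. No reshaping of the data is needed; the last dims just have to agree. Here is a minimal sketch of two ways to line them up, assuming you want the skip connection on the raw input (the layer_dense projection is an illustration of a common residual trick, not something from your code):

    # Option 1: match units to the number of features, not the time steps.
    x <- layer_input(shape = c(24, 10))                                           # (?, 24, 10)
    y <- layer_lstm(x, units = 10, activation = "tanh", return_sequences = TRUE)  # (?, 24, 10)
    res <- layer_add(c(x, y))                                                     # (?, 24, 10)

    # Option 2: project the input to 32 channels so it can be added to a
    # 32-unit LSTM output; layer_dense acts on the last axis of a 3D tensor.
    x_proj <- layer_dense(x, units = 32)                                          # (?, 24, 32)
    y2 <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE) # (?, 24, 32)
    res2 <- layer_add(c(x_proj, y2))                                              # (?, 24, 32)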

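    For completeness, a minimal end-to-end sketch that stacks one residual block and verifies the shapes. The pooling-plus-dense head is an assumption added only to make the model runnable, not something from your question:

    library(keras)

    inputs <- layer_input(shape = c(24, 10))

    # The first LSTM sets the channel width to 32; every later block keeps
    # units = 32 so the residual additions line up.
    x <- layer_lstm(inputs, units = 32, activation = "tanh", return_sequences = TRUE)

    # One residual block: LSTM plus skip connection.
    y <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
    res <- layer_add(c(x, y))  # (?, 24, 32)

    # Hypothetical head, just to close the graph.
    pooled <- layer_global_average_pooling_1d(res)
    outputs <- layer_dense(pooled, units = 1)

    model <- keras_model(inputs, outputs)
    summary(model)  # confirms the (None, 24, 32) shapes flow through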