Tags: keras, lstm, deep-residual-networks

Residual LSTM layers


I have trouble understanding tensor behaviour in LSTM layers in keras.

I have preprocessed numeric data that looks like [samples, time steps, features]: 10,000 samples, 24 time steps, and 10 predictors.

I want to stack LSTM layers with residual connections, but I am not sure I am doing it right:

    x <- layer_input(shape = c(24, 10))
    x <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)

Now the shape of x, which is a tensor, is [?, ?, 32]. I was expecting [?, 32, 10]. Should I reshape the data to be [samples, features, time steps]? Then I form res:

    y <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
    res <- layer_add(c(x, y))

Now I am not sure if this is correct, or whether I should go with this instead:

    x <- layer_input(shape = c(24, 10))
    y <- layer_lstm(x, units = 24, activation = "tanh", return_sequences = TRUE)  # units same as time_steps
    res <- layer_add(c(x, y))  # perhaps data reshaping is necessary here?

Any insight is much appreciated.

JJ


Solution

  • An LSTM layer with return_sequences = TRUE returns dims as (?, seq_length, out_dims), where out_dims is your units value. So the overall dims will be as follows (a sketch for your second variant follows this block):

    x <- layer_input(shape = c(24, 10))
    # dims of x: (?, 24, 10)
    x <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
    # dims of x after the LSTM layer: (?, 24, 32)

    y <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
    # dims of y: (?, 24, 32)
    res <- layer_add(c(x, y))
    # dims of res: (?, 24, 32) -- it is the addition of the outputs of both LSTM layers
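    As for your second variant: layer_add needs matching shapes, and there x is (?, 24, 10) while y is (?, 24, 24), so the addition would fail. No reshaping of the data is needed; the last dims just have to agree. Here is a minimal sketch of two ways to line them up, assuming you want the skip connection on the raw input (the layer_dense projection is an illustration of a common residual trick, not something from your code):

    # Option 1: match units to the number of features, not the time steps.
    x <- layer_input(shape = c(24, 10))                                           # (?, 24, 10)
    y <- layer_lstm(x, units = 10, activation = "tanh", return_sequences = TRUE)  # (?, 24, 10)
    res <- layer_add(c(x, y))                                                     # (?, 24, 10)

    # Option 2: project the input to 32 channels so it can be added to a
    # 32-unit LSTM output; layer_dense acts on the last axis of a 3D tensor.
    x_proj <- layer_dense(x, units = 32)                                          # (?, 24, 32)
    y2 <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE) # (?, 24, 32)
    res2 <- layer_add(c(x_proj, y2))                                              # (?, 24, 32)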

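    For completeness, a minimal end-to-end sketch that stacks one residual block and verifies the shapes. The pooling-plus-dense head is an assumption added only to make the model runnable, not something from your question:

    library(keras)

    inputs <- layer_input(shape = c(24, 10))

    # The first LSTM sets the channel width to 32; every later block keeps
    # units = 32 so the residual additions line up.
    x <- layer_lstm(inputs, units = 32, activation = "tanh", return_sequences = TRUE)

    # One residual block: LSTM plus skip connection.
    y <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
    res <- layer_add(c(x, y))  # (?, 24, 32)

    # Hypothetical head, just to close the graph.
    pooled <- layer_global_average_pooling_1d(res)
    outputs <- layer_dense(pooled, units = 1)

    model <- keras_model(inputs, outputs)
    summary(model)  # confirms the (None, 24, 32) shapes flow through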