Tags: python, tensorflow, keras, lstm, recurrent-neural-network

What is the rule for knowing how many LSTM cells, and how many units in each LSTM cell, you need in Keras?


I know that an LSTM cell has a number of ANNs inside it.

But when defining the hidden layers for the same problem, I have seen some people use only 1 LSTM cell while others use 2 or 3 LSTM cells, like this:

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

model = Sequential()
# input_shape is only needed on the first layer; return_sequences=True
# passes the full sequence on to the next LSTM layer.
model.add(LSTM(256, input_shape=(n_prev, 1), return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(64, return_sequences=False))  # last LSTM returns only the final output
model.add(Dropout(0.3))
model.add(Dense(1))
model.add(Activation('linear'))
  1. Is there any rule as to how many LSTM cells you should use, or is it just manual experimentation?
  2. Following on from this: how many units should you use in an LSTM cell? Some people use 256, others 64, for the same problem.

Solution

  • There are no "rules", but there are guidelines; in practice, you'd experiment with depth vs. width, each of which works differently:

    In general, width extracts more features, whereas depth extracts richer features - but if there aren't many features to extract from the given data, width should be reduced - and the "simpler" the data/problem is, the fewer layers are suitable. Ultimately, however, it may be best to spare extensive analysis and try different combinations of each (a minimal sketch of such a comparison follows this answer) -- see this SO for more info.

    Lastly, avoid Dropout between LSTM layers and use LSTM(recurrent_dropout=...) instead (see linked SO); a sketch of that substitution is also shown below.
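
As a rough illustration of the depth-vs-width experimentation above, here is a minimal sketch that parameterizes both; the build_model helper, the candidate layer sizes, and the look-back length of 10 are assumptions for illustration, not part of the original answer:

from keras.models import Sequential
from keras.layers import LSTM, Dense

def build_model(n_prev, layer_sizes):
    # One LSTM layer per entry in layer_sizes; each entry is that layer's width.
    model = Sequential()
    for i, units in enumerate(layer_sizes):
        kwargs = {'return_sequences': i < len(layer_sizes) - 1}
        if i == 0:
            kwargs['input_shape'] = (n_prev, 1)
        model.add(LSTM(units, **kwargs))
    model.add(Dense(1, activation='linear'))
    model.compile(loss='mse', optimizer='adam')
    return model

# Compare a wide-shallow configuration against a narrow-deep one,
# e.g. by fitting each and comparing validation loss.
candidates = {'wide': [256], 'deep': [64, 64, 64]}
models = {name: build_model(10, sizes) for name, sizes in candidates.items()}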
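
And a minimal sketch of the recurrent_dropout substitution, reusing the question's architecture; the 0.3 rate is carried over from the question's code, and n_prev = 10 is an assumed placeholder:

from keras.models import Sequential
from keras.layers import LSTM, Dense

n_prev = 10  # assumed look-back window length, for illustration only

model = Sequential()
# recurrent_dropout masks units of the recurrent state at each timestep,
# rather than dropping activations between layers as Dropout does.
model.add(LSTM(256, input_shape=(n_prev, 1), return_sequences=True,
               recurrent_dropout=0.3))
model.add(LSTM(128, return_sequences=True, recurrent_dropout=0.3))
model.add(LSTM(64, recurrent_dropout=0.3))
model.add(Dense(1, activation='linear'))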