tensorflow keras lstm multiple-input

Keras LSTM functional API with multiple inputs


I am trying to train an LSTM model on two inputs: price and sentiment. After normalizing the two datasets, trainX and trainS, I followed the Keras documentation to train the model.

print(trainX.shape)
print(trainS.shape)
(22234, 1, 51) --> 51 because the data is a time series and I look back over 51 hours of price history
(22285, 1)

The code basically follows the Keras multiple-inputs guide: https://keras.io/getting-started/functional-api-guide/#all-models-are-callable-just-like-layers but I get an error when I fit the model:

Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[[0., 0., 0., ..., 0., 0., 0.]],

       [[0., 0., 0., ..., 0., 0., 0.]],

       [[0., 0., 0., ..., 0., 0., 0.]],

       ...,

       [[0., 0., 0., ..., 0., 0., 0.]],

       [[0., 0., 0., ....

from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model

# Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# Note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(trainX.shape[0],), dtype='int32', name='main_input')

# This embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=trainX.shape[0])(main_input)

# A LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence
lstm_out = LSTM(32)(x)
auxiliary_output = Dense(2, activation='sigmoid', name='aux_output')(lstm_out)

import keras
auxiliary_input = Input(shape=(trainS.shape[0],), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])

# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

# And finally we add the main logistic regression layer
main_output = Dense(2, activation='sigmoid', name='main_output')(x)

auxiliary_output = Dense(2, activation='sigmoid', name='aux_output')(lstm_out)

auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])

# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

# And finally we add the main logistic regression layer
main_output = Dense(2, activation='sigmoid', name='main_output')(x)
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])

model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[1., 0.2])
model.fit(trainX, trainS, epochs=100, batch_size=1, verbose=2, shuffle=False)

Solution

  • The call to model.fit must pass lists of NumPy arrays that all have the same number of samples, and whose remaining dimensions match the shapes defined for the model's inputs and targets.

    That is, you need to call

    model.fit([input0, input1], [output0, output1])
    
    

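    All of these arrays need to have the same shape[0] (the number of samples). In your case trainX has 22234 samples and trainS has 22285, so they must be aligned to the same length first.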

    I also noticed the following in your code:

    main_input = Input(shape=(trainX.shape[0],), ...)

    This is incorrect. The input shape should be trainX.shape[1:]. You do not define the batch size (the number of samples), but you must define all the other dimensions. The same applies to auxiliary_input, which uses trainS.shape[0].
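
    The two points above can be put together in a minimal sketch. It is not your exact model: it assumes random stand-in data with a small sample count, a hypothetical target array trainY (the question does not show the targets), and single-unit sigmoid outputs. It also feeds the float price windows straight into the LSTM and leaves out the Embedding layer, since Embedding expects integer indices rather than normalized prices.

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense, concatenate
from tensorflow.keras.models import Model

# Hypothetical stand-ins for the question's arrays (random data, small sizes).
n_samples, look_back = 64, 51
trainX = np.random.rand(n_samples, 1, look_back).astype("float32")  # price windows
trainS = np.random.rand(n_samples, 1).astype("float32")             # sentiment
# Assumed binary targets; the question does not show them.
trainY = np.random.randint(0, 2, size=(n_samples, 1)).astype("float32")

# Every array passed to fit() must share the same number of samples.
assert trainX.shape[0] == trainS.shape[0] == trainY.shape[0]

# Input shapes exclude the batch dimension: use shape[1:], not shape[0].
main_input = Input(shape=trainX.shape[1:], name="main_input")  # (1, 51)
aux_input = Input(shape=trainS.shape[1:], name="aux_input")    # (1,)

# The LSTM reduces each price window to a single 32-dimensional vector.
lstm_out = LSTM(32)(main_input)
aux_output = Dense(1, activation="sigmoid", name="aux_output")(lstm_out)

# Merge the LSTM summary with the sentiment input, then classify.
x = concatenate([lstm_out, aux_input])
x = Dense(64, activation="relu")(x)
main_output = Dense(1, activation="sigmoid", name="main_output")(x)

model = Model(inputs=[main_input, aux_input],
              outputs=[main_output, aux_output])
model.compile(optimizer="rmsprop",
              loss=["binary_crossentropy", "binary_crossentropy"],
              loss_weights=[1.0, 0.2])

# Two inputs -> a list of two arrays; two outputs -> a list of two targets.
history = model.fit([trainX, trainS], [trainY, trainY],
                    epochs=1, batch_size=16, verbose=0)
```

    With your real data, trainX and trainS would first have to be trimmed to the same number of rows. How to line them up depends on how the look-back windows were built, which the question does not show.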