python, tensorflow, keras-layer, seq2seq, lstm-stateful

Input 0 of layer lstm_35 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1966, 7059, 256]


I am creating a seq2seq model with word-level embeddings for text summarisation, and I am running into a data shape issue. Please help, thanks.

        from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
        from tensorflow.keras.models import Model

        # Encoder
        encoder_input = Input(shape=(max_encoder_seq_length,))
        embed_layer = Embedding(num_encoder_tokens, 256, mask_zero=True)(encoder_input)
        encoder = LSTM(256, return_state=True, return_sequences=False)
        encoder_output, state_h, state_c = encoder(embed_layer)
        encoder_state = [state_h, state_c]

        # Decoder, initialised with the encoder states
        decoder_input = Input(shape=(max_decoder_seq_length,))
        de_embed = Embedding(num_decoder_tokens, 256)(decoder_input)
        decoder = LSTM(256, return_state=True, return_sequences=True)
        decoder_output, _, _ = decoder(de_embed, initial_state=encoder_state)
        decoder_dense = Dense(num_decoder_tokens, activation='softmax')
        decoder_output = decoder_dense(decoder_output)

        model = Model([encoder_input, decoder_input], decoder_output)
        model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=['accuracy'])

It gives an error when training because of the shape of the input. Please help me reshape my data; the current shapes are:

encoder data shape: (50, 1966, 7059), decoder data shape: (50, 69, 1183), decoder target shape: (50, 69, 1183)

    Epoch 1/35
    WARNING:tensorflow:Model was constructed with shape (None, 1966) for input Tensor("input_37:0", shape=(None, 1966), dtype=float32), but it was called on an input with incompatible shape (None, 1966, 7059).
    WARNING:tensorflow:Model was constructed with shape (None, 69) for input Tensor("input_38:0", shape=(None, 69), dtype=float32), but it was called on an input with incompatible shape (None, 69, 1183).
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-71-d02252f12e7f> in <module>()
          1 model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          2           batch_size=16,
    ----> 3           epochs=35)
    ValueError: Input 0 of layer lstm_35 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1966, 7059, 256]

This is the summary of my model.


Solution

  • I have tried to replicate your issue and was able to fit the model successfully. You can follow the code below, which uses the same architecture as yours; there were some minor issues with the shapes of the Embedding layers. I have supplied weights for the encoder embedding layer using GloVe embeddings, and the details for building the embedding matrix are given further below.

    from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
    from tensorflow.keras.models import Model

    # Encoder: embedding weights initialised from the GloVe matrix built below
    embedding_layer = Embedding(num_words, EMBEDDING_SIZE,
                                weights=[embedding_matrix],
                                input_length=max_encoder_seq_length)
    encoder_inputs_placeholder = Input(shape=(max_encoder_seq_length,))
    x = embedding_layer(encoder_inputs_placeholder)
    encoder = LSTM(LSTM_NODES, return_state=True)

    encoder_outputs, h, c = encoder(x)
    encoder_states = [h, c]

    # Decoder: trainable embedding, LSTM initialised with the encoder states
    decoder_inputs_placeholder = Input(shape=(max_decoder_seq_length,))
    decoder_embedding = Embedding(num_decoder_tokens, LSTM_NODES)
    decoder_inputs_x = decoder_embedding(decoder_inputs_placeholder)

    decoder_lstm = LSTM(LSTM_NODES, return_sequences=True, return_state=True)
    decoder_outputs, _, _ = decoder_lstm(decoder_inputs_x, initial_state=encoder_states)
    decoder_dense = Dense(num_decoder_tokens, activation='softmax')
    decoder_outputs = decoder_dense(decoder_outputs)

    model = Model([encoder_inputs_placeholder, decoder_inputs_placeholder],
                  decoder_outputs)
    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
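
    Note: since the encoder embedding is initialised from pre-trained GloVe vectors, you can optionally freeze it so the vectors are not updated during training. This is just a common variant, not something your setup requires:

    embedding_layer = Embedding(num_words, EMBEDDING_SIZE,
                                weights=[embedding_matrix],
                                input_length=max_encoder_seq_length,
                                trainable=False)  # keep the GloVe vectors fixed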
    

    For the embedding matrix:

    from numpy import asarray, zeros

    MAX_NUM_WORDS = 10000
    EMBEDDING_SIZE = 100  # you can also use 200 or 300 dimensions, depending on the embedding file

    # Load the pre-trained GloVe vectors into a word -> vector dictionary
    embeddings_dictionary = dict()
    glove_file = open(r'/content/drive/My Drive/datasets/glove.6B.100d.txt', encoding="utf8")
    for line in glove_file:
        records = line.split()
        word = records[0]
        vector_dimensions = asarray(records[1:], dtype='float32')
        embeddings_dictionary[word] = vector_dimensions
    glove_file.close()

    # word2idx_inputs is the word -> index mapping of the input (encoder) tokenizer
    num_words = min(MAX_NUM_WORDS, len(word2idx_inputs) + 1)
    embedding_matrix = zeros((num_words, EMBEDDING_SIZE))
    for word, index in word2idx_inputs.items():
        embedding_vector = embeddings_dictionary.get(word)
        if embedding_vector is not None:
            embedding_matrix[index] = embedding_vector
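
    The error in your traceback actually comes from the shape of your data rather than the architecture: an Embedding layer expects integer token indices of shape (batch_size, sequence_length), but your encoder array of shape (50, 1966, 7059) appears to be one-hot encoded, so the Embedding layer appends a 256-dimensional axis and hands the LSTM the 4-D tensor it complains about. Assuming your arrays really are one-hot, a minimal sketch to collapse them back into index sequences (the *_seq names are hypothetical):

    import numpy as np

    # Collapse the one-hot (vocabulary) axis back to integer word indices,
    # so each sample becomes a plain sequence of token ids.
    encoder_input_seq = np.argmax(encoder_input_data, axis=-1)  # (50, 1966)
    decoder_input_seq = np.argmax(decoder_input_data, axis=-1)  # (50, 69)

    # The targets can stay one-hot: shape (50, 69, 1183) matches the
    # softmax output used with categorical_crossentropy.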
    

    Model Summary:

    Model: "model_2"
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_5 (InputLayer)            (None, 16)           0                                            
    __________________________________________________________________________________________________
    input_6 (InputLayer)            (None, 59)           0                                            
    __________________________________________________________________________________________________
    embedding_5 (Embedding)         (None, 16, 100)      1000000     input_5[0][0]                    
    __________________________________________________________________________________________________
    embedding_6 (Embedding)         (None, 59, 64)       5824        input_6[0][0]                    
    __________________________________________________________________________________________________
    lstm_4 (LSTM)                   [(None, 64), (None,  42240       embedding_5[0][0]                
    __________________________________________________________________________________________________
    lstm_5 (LSTM)                   [(None, 59, 64), (No 33024       embedding_6[0][0]                
                                                                     lstm_4[0][1]                     
                                                                     lstm_4[0][2]                     
    __________________________________________________________________________________________________
    dense_2 (Dense)                 (None, 59, 91)       5915        lstm_5[0][0]                     
    ==================================================================================================
    Total params: 1,087,003
    Trainable params: 1,087,003
    Non-trainable params: 0
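
    With the inputs converted to integer sequences and the one-hot targets left as they are, the training call from your traceback would look like this (using the hypothetical arrays from the sketch above):

    model.fit([encoder_input_seq, decoder_input_seq],
              decoder_target_data,
              batch_size=16,
              epochs=35)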
    


    Hope this resolves your issue. Happy learning!