python keras nlp attention-model sequence-to-sequence

Concatenate layer shape error in sequence2sequence model with Keras attention

I'm trying to implement a simple word-level sequence-to-sequence model with Keras in Colab. I'm using the Keras Attention layer. Here is the definition of the model:

embedding_size=200
UNITS=128

encoder_inputs = Input(shape=(None,), name="encoder_inputs")

encoder_embs=Embedding(num_encoder_tokens, embedding_size, name="encoder_embs")(encoder_inputs)

#encoder lstm
encoder = LSTM(UNITS, return_state=True, name="encoder_LSTM") #(encoder_embs)
encoder_outputs, state_h, state_c = encoder(encoder_embs)

encoder_states = [state_h, state_c]

decoder_inputs = Input(shape=(None,), name="decoder_inputs")
decoder_embs = Embedding(num_decoder_tokens, embedding_size, name="decoder_embs")(decoder_inputs)

#decoder lstm
decoder_lstm = LSTM(UNITS, return_sequences=True, return_state=True, name="decoder_LSTM")
decoder_outputs, _, _ = decoder_lstm(decoder_embs, initial_state=encoder_states)

attention=Attention(name="attention_layer")
attention_out=attention([encoder_outputs, decoder_outputs])

decoder_concatenate=Concatenate(axis=-1, name="concat_layer")([decoder_outputs, attention_out])
decoder_outputs = TimeDistributed(Dense(units=num_decoder_tokens, 
                                  activation='softmax', name="decoder_denseoutput"))(decoder_concatenate)

model=Model([encoder_inputs, decoder_inputs], decoder_outputs, name="s2s_model")
model.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['accuracy'])

model.summary()

Model compiling is fine, no problems whatsoever. The encoder and decoder input and output shapes are:

Encoder training input shape:  (4000, 21)
Decoder training input shape:  (4000, 12)
Decoder training target shape:  (4000, 12, 3106)
--
Encoder test input shape:  (385, 21)

This is the model.fit code:

model.fit([encoder_training_input, decoder_training_input], decoder_training_target,
      epochs=100,
      batch_size=32,
      validation_split=0.2,)

When I run the fit phase, I get this error from the Concatenate layer:

ValueError: Dimension 1 in both shapes must be equal, but are 12 and 32. 
Shapes are [32,12] and [32,32]. for '{{node s2s_model/concat_layer/concat}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](s2s_model/decoder_LSTM/PartitionedCall:1,
s2s_model/attention_layer/MatMul_1, s2s_model/concat_layer/concat/axis)' with input shapes: [32,12,128], [32,32,128], [] and with computed input tensors: input[2] = <2>.

So, the first 32 are batch_size, 128 are output shape from decoder_outputs and attention_out, 12 is the number of tokens of decoder inputs. I can't understand how to solve this error, I can't change the number of input tokens I think, any suggestions for me?

Solution

Solved this thanks to @Majitsima. I swapped the inputs to the Attention layer, so instead of

attention=Attention(name="attention_layer")
attention_out=attention([encoder_outputs, decoder_outputs])

the input is

attention=Attention(name="attention_layer")
attention_out=attention([decoder_outputs, encoder_outputs])

with

decoder_concatenate=Concatenate(axis=-1, name="concat_layer")([decoder_outputs, attention_out])

Everything seems to work now, so thank you again @Majitsima, hope this can help!