I want to add a `tf.keras.layers.MultiHeadAttention` layer between the two Dense layers of my neural network. However, I am getting an `IndexError`. The detailed code is as follows:
x1 = Dense(58, activation='relu')(x1)
x1 = Dropout(0.1)(x1)
print(x1.shape)
attention = tf.keras.layers.MultiHeadAttention(
    num_heads=2, key_dim=58, dropout=0.1, output_shape=x1.shape)(x1, x1)
x1 = Dropout(0.2)(attention)
x1 = Dense(59, activation='relu')(x1)
output = Dense(1, activation='linear')(x1)
model = tf.keras.models.Model(inputs=input1, outputs=output)
When I run the above code, I get the following error:
IndexError: Exception encountered when calling layer 'softmax' (type Softmax).
tuple index out of range
Call arguments received by layer 'softmax' (type Softmax):
• inputs=tf.Tensor(shape=(None, 2), dtype=float32)
• mask=None
Note that `x1.shape` is (None, 58).
The problem is solved now. The MultiHeadAttention layer in TensorFlow expects a 3D input tensor of shape (batch, sequence, features), whereas the output of a Dense layer here is 2D, (None, 58). Therefore, to introduce an attention block into an ordinary feed-forward network, the inputs and outputs of that block need to be reshaped accordingly: expand the 2D activations to 3D before the attention layer and squeeze them back to 2D afterwards. The updated code is as follows:
x1 = Dense(58, activation='relu')(x1)
x1 = Dropout(0.1)(x1)
x1 = tf.expand_dims(x1, axis=1)  # (None, 58) -> (None, 1, 58): add a sequence axis
print(x1.shape)
attention = tf.keras.layers.MultiHeadAttention(
    num_heads=3, key_dim=x1.shape[2], dropout=0.2)(x1, x1)
x1 = Dropout(0.2)(attention)
x1 = tf.keras.layers.LayerNormalization()(x1)
x1 = tf.squeeze(x1, axis=1)  # (None, 1, 58) -> (None, 58): drop the sequence axis again
x1 = Dense(10, activation='relu')(x1)
output = Dense(1, activation='linear')(x1)
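For completeness, here is a self-contained sketch of the fixed model. The original post does not show the input layer, so the `input1` definition and its 20-feature width are assumptions; substitute your own input shape:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout, LayerNormalization

input1 = tf.keras.Input(shape=(20,))  # hypothetical feature count; use your own
x1 = Dense(58, activation='relu')(input1)
x1 = Dropout(0.1)(x1)
x1 = tf.expand_dims(x1, axis=1)  # (None, 58) -> (None, 1, 58)
attention = tf.keras.layers.MultiHeadAttention(
    num_heads=3, key_dim=x1.shape[2], dropout=0.2)(x1, x1)
x1 = Dropout(0.2)(attention)
x1 = LayerNormalization()(x1)
x1 = tf.squeeze(x1, axis=1)  # (None, 1, 58) -> (None, 58)
x1 = Dense(10, activation='relu')(x1)
output = Dense(1, activation='linear')(x1)
model = tf.keras.models.Model(inputs=input1, outputs=output)
model.summary()

If you prefer to stay with Keras layers instead of raw TensorFlow ops, `tf.keras.layers.Reshape((1, 58))` before the attention block and `tf.keras.layers.Reshape((58,))` after it achieve the same expand/squeeze effect.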