I have an embedding layer and a GRU layer in Keras, defined as follows:
embedding_layer = tf.keras.layers.Embedding(5000, 256, mask_zero=True)
gru_layer = tf.keras.layers.GRU(256, return_sequences=True, recurrent_initializer='glorot_uniform')
When I pass the following inputs through them
A1 = np.random.random((64, 29))
A2 = embedding_layer(A1)
A3 = gru_layer(A2)
print(A1.shape, A2.shape, A3.shape)
everything is fine and I get
(64, 29) (64, 29, 256) (64, 29, 256)
But when I do
y2 = tf.keras.Input(shape=(64,29))
print(y2.shape)
y3 = embedding_layer(y2)
print(y3.shape)
y4 = gru_layer(y3)
print(y4.shape)
The first two print statements are fine and I get
(None, 64, 29)
(None, 64, 29, 256)
but then I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[125], line 5
3 y3 = embedding_layer(y2)
4 print(y3.shape)
----> 5 y4 = gru_layer(y3)
6 print(y4.shape)
File /opt/conda/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py:123, in filter_traceback.<locals>.error_handler(*args, **kwargs)
120 filtered_tb = _process_traceback_frames(e.__traceback__)
121 # To get the full stack trace, call:
122 # `keras.config.disable_traceback_filtering()`
--> 123 raise e.with_traceback(filtered_tb) from None
124 finally:
125 del filtered_tb
File /opt/conda/lib/python3.10/site-packages/keras/src/layers/input_spec.py:186, in assert_input_compatibility(input_spec, inputs, layer_name)
184 if spec.ndim is not None and not spec.allow_last_axis_squeeze:
185 if ndim != spec.ndim:
--> 186 raise ValueError(
187 f'Input {input_index} of layer "{layer_name}" '
188 "is incompatible with the layer: "
189 f"expected ndim={spec.ndim}, found ndim={ndim}. "
190 f"Full shape received: {shape}"
191 )
192 if spec.max_ndim is not None:
193 if ndim is not None and ndim > spec.max_ndim:
ValueError: Input 0 of layer "gru_17" is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 64, 29, 256)
Why does a Keras Input behave differently from a regular tensor, and why do I get this error? Also, why is the shape of these tensors printed as (None, 64, 29) as opposed to (64, 29)?
keras.Input expects the shape as the first argument and the batch size as the second argument:
keras.Input(
shape=None,
batch_size=None,
...
)
shape: A shape tuple (tuple of integers or None objects), not including the batch size.
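The leading None in the printed shapes is the batch dimension, which Keras prepends automatically. With shape=(64, 29), each sample is declared as a 64 x 29 array, so the embedding output is 4-D (None, 64, 29, 256), while the GRU expects 3-D input (batch, timesteps, features). In your NumPy example the 64 was already acting as the batch size. So initialize the input with the per-sample shape only: keras.Input(shape=(29,)).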
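As a minimal sketch of the fix, assuming the same two layers as in the question: the input is declared with the per-sample shape only, and the symbolic shapes then line up with what the GRU expects. The dtype="int32" argument, the batch_size=64 variant, and the random integer token ids are additions for illustration, not part of the original code.

```python
import numpy as np
import tensorflow as tf

# Same layers as in the question.
embedding_layer = tf.keras.layers.Embedding(5000, 256, mask_zero=True)
gru_layer = tf.keras.layers.GRU(
    256, return_sequences=True, recurrent_initializer="glorot_uniform"
)

# shape excludes the batch axis: each sample is a sequence of 29 token ids.
# dtype="int32" is an illustrative choice, since embeddings index by integer.
y2 = tf.keras.Input(shape=(29,), dtype="int32")
y3 = embedding_layer(y2)
y4 = gru_layer(y3)

shapes = [tuple(t.shape) for t in (y2, y3, y4)]
print(shapes)  # [(None, 29), (None, 29, 256), (None, 29, 256)]

# If a fixed batch size is wanted, pass it separately instead of putting it in shape.
y_fixed = tf.keras.Input(shape=(29,), batch_size=64, dtype="int32")
fixed_shape = tuple(y_fixed.shape)
print(fixed_shape)  # (64, 29)

# The None batch axis accepts any batch size at call time, e.g. 64 sequences.
model = tf.keras.Model(y2, y4)
token_ids = np.random.randint(1, 5000, size=(64, 29))  # made-up token ids
out = model(token_ids)
print(tuple(out.shape))  # (64, 29, 256)
```

The None in the symbolic shapes only means "batch size not fixed yet"; it is filled in with the actual batch size (64 here) once real data is passed through the model.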