keras, tf.keras, gru

Why does Keras Input behave differently from a regular tensor?


I have an embedding layer and a GRU layer in Keras as follows:

import numpy as np
import tensorflow as tf
embedding_layer = tf.keras.layers.Embedding(5000, 256, mask_zero=True)
gru_layer = tf.keras.layers.GRU(256, return_sequences=True, recurrent_initializer='glorot_uniform')

When I give the following inputs

A1 = np.random.random((64, 29))
A2 = embedding_layer(A1)
A3 = gru_layer(A2)
print(A1.shape, A2.shape, A3.shape)

everything is fine and I get

(64, 29) (64, 29, 256) (64, 29, 256)

But when I do

y2 = tf.keras.Input(shape=(64,29))
print(y2.shape)
y3 = embedding_layer(y2)
print(y3.shape)
y4 = gru_layer(y3)
print(y4.shape)

The first two print statements are fine and I get

(None, 64, 29)
(None, 64, 29, 256)

but then I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[125], line 5
      3 y3 = embedding_layer(y2)
      4 print(y3.shape)
----> 5 y4 = gru_layer(y3)
      6 print(y4.shape)

File /opt/conda/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py:123, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    120     filtered_tb = _process_traceback_frames(e.__traceback__)
    121     # To get the full stack trace, call:
    122     # `keras.config.disable_traceback_filtering()`
--> 123     raise e.with_traceback(filtered_tb) from None
    124 finally:
    125     del filtered_tb

File /opt/conda/lib/python3.10/site-packages/keras/src/layers/input_spec.py:186, in assert_input_compatibility(input_spec, inputs, layer_name)
    184 if spec.ndim is not None and not spec.allow_last_axis_squeeze:
    185     if ndim != spec.ndim:
--> 186         raise ValueError(
    187             f'Input {input_index} of layer "{layer_name}" '
    188             "is incompatible with the layer: "
    189             f"expected ndim={spec.ndim}, found ndim={ndim}. "
    190             f"Full shape received: {shape}"
    191         )
    192 if spec.max_ndim is not None:
    193     if ndim is not None and ndim > spec.max_ndim:

ValueError: Input 0 of layer "gru_17" is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 64, 29, 256)

Why does a Keras Input behave differently from a regular tensor, and why do I get this error? Also, why is the shape of these tensors printed as (None, 64, 29) rather than (64, 29)?


Solution

  • keras.Input expects the shape as the first argument and the batch size as the second argument:

    keras.Input(
        shape=None,
        batch_size=None,
        ...
    )
    

    shape: A shape tuple (tuple of integers or None objects), not including the batch size.

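    So pass only the per-sample shape: keras.Input(shape=(29,)). With shape=(64, 29), Keras treats each sample as a 64 x 29 array and prepends its own batch dimension, so the Embedding output has four dimensions and the GRU, which expects (batch, timesteps, features), rejects it. The leading None in the printed shapes is that batch dimension; it stays unspecified unless batch_size is given.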
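
As a minimal sketch of the fix, the snippet below rebuilds the two layers from the question and feeds them a symbolic input whose shape leaves out the batch size. The names inputs, embedded, outputs and fixed are illustrative, and dtype="int32" is an assumption added here because the Embedding layer looks up integer token ids. The batch dimension should be reported as None unless batch_size is passed explicitly.

```python
import tensorflow as tf

# Same layers as in the question.
embedding_layer = tf.keras.layers.Embedding(5000, 256, mask_zero=True)
gru_layer = tf.keras.layers.GRU(
    256, return_sequences=True, recurrent_initializer="glorot_uniform"
)

# shape describes ONE sample (a sequence of 29 token ids); Keras prepends
# the batch dimension itself. dtype="int32" is an assumption for this sketch.
inputs = tf.keras.Input(shape=(29,), dtype="int32")
embedded = embedding_layer(inputs)   # rank 3: (batch, timesteps, features)
outputs = gru_layer(embedded)        # rank 3, so the GRU accepts it

print(tuple(inputs.shape), tuple(embedded.shape), tuple(outputs.shape))

# To pin the batch size instead of leaving it unspecified, pass batch_size.
fixed = tf.keras.Input(shape=(29,), batch_size=64, dtype="int32")
print(tuple(fixed.shape))
```

Leaving the batch size unspecified is usually the more flexible choice, since the same model can then be called on batches of any size.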