I have an embedding layer and a GRU layer in Keras as follows:
embedding_layer = tf.keras.layers.Embedding(5000, 256, mask_zero=True)
gru_layer = tf.keras.layers.GRU(256, return_sequences=True, recurrent_initializer='glorot_uniform')
When I give the following inputs
A1 = np.random.random((64, 29))
A2 = embedding_layer(A1)
A3 = gru_layer(A2)
print(A1.shape, A2.shape, A3.shape)
everything is fine and I get
(64, 29) (64, 29, 256) (64, 29, 256)
But when I do
y2 = tf.keras.Input(shape=(64,29))
print(y2.shape)
y3 = embedding_layer(y2)
print(y3.shape)
y4 = gru_layer(y3)
print(y4.shape)
The first two print statements are fine and I get
(None, 64, 29)
(None, 64, 29, 256)
but then I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[125], line 5
3 y3 = embedding_layer(y2)
4 print(y3.shape)
----> 5 y4 = gru_layer(y3)
6 print(y4.shape)
File /opt/conda/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py:123, in filter_traceback.<locals>.error_handler(*args, **kwargs)
120 filtered_tb = _process_traceback_frames(e.__traceback__)
121 # To get the full stack trace, call:
122 # `keras.config.disable_traceback_filtering()`
--> 123 raise e.with_traceback(filtered_tb) from None
124 finally:
125 del filtered_tb
File /opt/conda/lib/python3.10/site-packages/keras/src/layers/input_spec.py:186, in assert_input_compatibility(input_spec, inputs, layer_name)
184 if spec.ndim is not None and not spec.allow_last_axis_squeeze:
185 if ndim != spec.ndim:
--> 186 raise ValueError(
187 f'Input {input_index} of layer "{layer_name}" '
188 "is incompatible with the layer: "
189 f"expected ndim={spec.ndim}, found ndim={ndim}. "
190 f"Full shape received: {shape}"
191 )
192 if spec.max_ndim is not None:
193 if ndim is not None and ndim > spec.max_ndim:
ValueError: Input 0 of layer "gru_17" is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 64, 29, 256)
Why does a Keras Input behave differently from a regular tensor, and why do I get this error? Also, why is the shape of these tensors printed as (None, 64, 29) rather than (64, 29)?
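keras.Input expects the shape of a single sample as the first argument, without the batch dimension; the batch size is a separate, optional second argument. With shape=(64, 29), Keras treats every sample as a 64x29 array and prepends a batch dimension of unknown size, which is the None you see in (None, 64, 29). The embedding layer then adds a feature axis, giving a 4D tensor (None, 64, 29, 256), while the GRU expects 3D input of the form (batch, timesteps, features). That mismatch is what the error reports.
So only initialize it with
keras.Input(shape=(29,)).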
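As a minimal sketch of the fix, the snippet below reuses the two layers from the question and passes only the per-sample shape to Input. The dtype="int32" argument, the names model, batch and y_fixed, and the random integer token ids are assumptions added for illustration; they are not part of the original code.

```python
import numpy as np
import tensorflow as tf

# Same layers as in the question.
embedding_layer = tf.keras.layers.Embedding(5000, 256, mask_zero=True)
gru_layer = tf.keras.layers.GRU(
    256, return_sequences=True, recurrent_initializer='glorot_uniform'
)

# shape describes ONE sample (29 token ids); Keras prepends the batch axis.
# dtype="int32" is an assumption: embedding inputs are integer indices.
y2 = tf.keras.Input(shape=(29,), dtype="int32")
y3 = embedding_layer(y2)   # symbolic shape (None, 29, 256)
y4 = gru_layer(y3)         # symbolic shape (None, 29, 256)
print(y2.shape, y3.shape, y4.shape)

# None is a placeholder for the batch size, so any batch size works at call time.
model = tf.keras.Model(y2, y4)          # hypothetical wrapper model
batch = np.random.randint(1, 5000, size=(64, 29))  # made-up token ids
out = model(batch)
print(out.shape)           # batch axis is now concrete: 64

# If the batch size really must be fixed, pass it separately from shape.
y_fixed = tf.keras.Input(shape=(29,), batch_size=64, dtype="int32")
print(y_fixed.shape)
```

The symbolic tensors keep None in the first position because the graph is built before any data is seen; the concrete 64 only shows up once real data is passed through the model.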