I am currently following the TensorFlow guide/tutorial on seq2seq NMT models (https://www.tensorflow.org/text/tutorials/nmt_with_attention) using a Jupyter Notebook. Upon running the following code,
# Setup the loop variables.
next_token, done, state = decoder.get_initial_state(ex_context)
tokens = []
for n in range(10):
  # Run one step.
  next_token, done, state = decoder.get_next_token(
      ex_context, next_token, done, state, temperature=1.0)
  # Add the token to the output.
  tokens.append(next_token)
# Stack all the tokens together.
tokens = tf.concat(tokens, axis=-1) # (batch, t)
# Convert the tokens back to a string
result = decoder.tokens_to_text(tokens)
result[:3].numpy()
I receive an InvalidArgumentError as follows:
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
Cell In[31], line 2
1 # Setup the loop variables.
----> 2 next_token, done, state = decoder.get_initial_state(ex_context)
3 tokens = []
5 for n in range(10):
6 # Run one step.
Cell In[28], line 8
6 embedded = self.embedding(start_tokens)
7 print(embedded)
----> 8 return start_tokens, done, self.rnn.get_initial_state(embedded)[0]
File ~/Library/Python/3.11/lib/python/site-packages/keras/src/layers/rnn/rnn.py:309, in RNN.get_initial_state(self, batch_size)
307 get_initial_state_fn = getattr(self.cell, "get_initial_state", None)
308 if get_initial_state_fn:
--> 309 init_state = get_initial_state_fn(batch_size=batch_size)
310 else:
311 return [
312 ops.zeros((batch_size, d), dtype=self.cell.compute_dtype)
313 for d in self.state_size
314 ]
File ~/Library/Python/3.11/lib/python/site-packages/keras/src/layers/rnn/gru.py:326, in GRUCell.get_initial_state(self, batch_size)
324 def get_initial_state(self, batch_size=None):
325 return [
--> 326 ops.zeros((batch_size, self.state_size), dtype=self.compute_dtype)
327 ]
File ~/Library/Python/3.11/lib/python/site-packages/keras/src/ops/numpy.py:5968, in zeros(shape, dtype)
5957 @keras_export(["keras.ops.zeros", "keras.ops.numpy.zeros"])
5958 def zeros(shape, dtype=None):
5959 """Return a new tensor of given shape and type, filled with zeros.
5960
5961 Args:
(...)
5966 Tensor of zeros with the given shape and dtype.
5967 """
-> 5968 return backend.numpy.zeros(shape, dtype=dtype)
File ~/Library/Python/3.11/lib/python/site-packages/keras/src/backend/tensorflow/numpy.py:619, in zeros(shape, dtype)
--> 617 return tf.zeros(shape, dtype=dtype)
File ~/Library/Python/3.11/lib/python/site-packages/tensorflow/python/util/traceback_utils.py:153, in filter_traceback.<locals>.error_handler(*args, **kwargs)
151 except Exception as e:
152 filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153 raise e.with_traceback(filtered_tb) from None
154 finally:
155 del filtered_tb
File ~/Library/Python/3.11/lib/python/site-packages/tensorflow/python/framework/ops.py:5983, in raise_from_not_ok_status(e, name)
5981 def raise_from_not_ok_status(e, name) -> NoReturn:
5982 e.message += (" name: " + str(name if name is not None else ""))
-> 5983 raise core._status_to_exception(e) from None
InvalidArgumentError: {{function_node __wrapped__Pack_N_2_device_/job:localhost/replica:0/task:0/device:CPU:0}} Shapes of all inputs must match: values[0].shape = [64,1,256] != values[1].shape = [] [Op:Pack] name:
Any ideas? I'm pretty sure I'm following the guide to the letter.
It turned out my only problem was versioning: use Python 3.10 with tensorflow 2.11 and tensorflow-text 2.11. Creating a virtual environment with pyenv solved the problem. The wheels for those tensorflow and tensorflow-text versions can be found on PyPI, since pip didn't have those versions.
Looking at the traceback, the likely cause is that in the installed Keras, RNN.get_initial_state takes a batch_size, while the tutorial code passes it the embedded tensor of shape [64, 1, 256]. tf.zeros then tries to pack that tensor together with the scalar state size, and the shapes do not match.
The other packages I used were:
einops==0.6.0
matplotlib==3.6.1
numpy==1.23.3
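Before re-running the notebook, it can help to confirm that the kernel is really using the pinned environment. Below is a minimal sketch, assuming the pins listed above. The helper names `version_matches` and `check_pins` are made up for illustration, and the comparison is a plain prefix match on the version string, not a full version-specifier check.

```python
import sys
from importlib import metadata

# Pins taken from the answer above; adjust them if your setup differs.
PINS = {
    "tensorflow": "2.11",
    "tensorflow-text": "2.11",
    "einops": "0.6.0",
    "matplotlib": "3.6.1",
    "numpy": "1.23.3",
}


def version_matches(installed, pinned):
    """True if `installed` equals `pinned` or extends it (2.11.0 matches 2.11)."""
    return installed == pinned or installed.startswith(pinned + ".")


def check_pins(pins):
    """Return {package: problem} for every pin the current environment misses."""
    problems = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not version_matches(installed, pinned):
            problems[name] = f"found {installed}, expected {pinned}"
    return problems


if __name__ == "__main__":
    # The answer above used Python 3.10; print what this kernel runs on.
    print("Python", ".".join(map(str, sys.version_info[:3])))
    for name, problem in check_pins(PINS).items():
        print(f"{name}: {problem}")
```

Run it in a notebook cell of the same kernel. If it prints only the Python line, every pin is satisfied. Any other line names a package that is missing or at a different version.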