After switching laptops, training my transformer (a simple translator) fails at the end of the first epoch.
The code is pretty long, but in short I call model.fit() on a model with an overridden train_step().
h = model.fit( ds.ds_tra, epochs=epochs, validation_data=ds.ds_val )
The error is:
15/15 ━━━━━━━━━━━━━━━━━━━━ 0s 2s/step - masked_acc: 0.2338 - loss: 7.9628
Traceback (most recent call last):
  File "C:\_P\dev\ai\AI1\tfe\translators\seq2seq_11\seq2seq_tst_train1.py", line 113, in <module>
    h = model.fit(
        ^^^^^^^^^^
  File "C:\Users\u2gil\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\src\utils\traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\u2gil\AppData\Local\Programs\Python\Python311\Lib\site-packages\keras\src\utils\progbar.py", line 92, in update
    self._values[k] = [v * value_base, value_base]
                       ~~^~~~~~~~~~~~
TypeError: unsupported operand type(s) for *: 'dict' and 'int'
The strange thing is that exactly the same code runs fine on my previous Windows 10 laptop.
The config of my new PC is:
core ultra 7 on Windows 11
Python 3.11.9
keras 3.6.0
numpy 1.26.4
tensorflow 2.17.0
Follow the instructions on the Keras main page to downgrade your Keras version to 2.
The progbar.py file expects the returned metric values to be in a tuple format such as ("metric name", metric value).
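However, in Keras 3, it seems that at the end of each epoch the framework returns two separate tuples: one for ("loss", loss value) and another for ("compile_metrics", {dictionary of all metrics}). The progress bar then tries to multiply that dictionary by an integer, which raises the TypeError shown in the question.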
This behavior does not occur during intermediate calls; it is specific to the end of an epoch.
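To make the failure concrete, here is a minimal pure-Python sketch of what appears to happen. It does not call Keras at all: update_running_average only imitates the arithmetic on the failing progbar.py line, flatten_logs is a hypothetical helper, and the log values are copied from the progress bar in the question purely for illustration.

```python
def update_running_average(values, key, value, value_base=1):
    # Simplified imitation of the failing progbar.py line:
    #   self._values[k] = [v * value_base, value_base]
    values[key] = [value * value_base, value_base]


def flatten_logs(logs):
    # Hypothetical helper: lift metrics nested in a dict to the top level.
    flat = {}
    for name, value in logs.items():
        if isinstance(value, dict):
            flat.update(value)
        else:
            flat[name] = value
    return flat


# Assumed shape of the end-of-epoch logs described above.
end_of_epoch_logs = {
    "loss": 7.9628,
    "compile_metrics": {"masked_acc": 0.2338},
}

# Feeding the nested logs to the progress-bar arithmetic fails.
values = {}
error_message = None
try:
    for k, v in end_of_epoch_logs.items():
        update_running_average(values, k, v)
except TypeError as err:
    error_message = str(err)

# With flat logs, every value is a plain number and the update works.
flat_logs = flatten_logs(end_of_epoch_logs)
flat_values = {}
for k, v in flat_logs.items():
    update_running_average(flat_values, k, v)

print(error_message)
print(flat_logs)
```

If the nested dictionary comes from the return value of an overridden train_step(), returning a flat mapping of metric names to scalar values may be enough to avoid the error without downgrading, although I have not verified this against the code in the question.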
I resolved the issue by downgrading to Keras 2 and making sure TensorFlow actually used the installed legacy version, as described on the Keras main page.
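For reference, a minimal sketch of that switch, assuming the legacy package has already been installed with pip install tf_keras. The TF_USE_LEGACY_KERAS variable has to be set before TensorFlow is imported for the first time; the import is left as a comment here so the snippet runs even where TensorFlow is not installed.

```python
import os

# Ask TensorFlow to resolve tf.keras to the legacy Keras 2 package (tf_keras).
# This must run before the first "import tensorflow" in the process.
os.environ["TF_USE_LEGACY_KERAS"] = "1"

# import tensorflow as tf   # import only after the variable is set
# model = tf.keras.Sequential(...)  # tf.keras should now be Keras 2

print("TF_USE_LEGACY_KERAS =", os.environ["TF_USE_LEGACY_KERAS"])
```

Setting the variable in the shell or in the system environment instead of in the script should work as well, as long as it is in place before Python starts importing TensorFlow.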