I'm jumping back into a project I mostly stopped working on last year. I had already run into this issue, and this answer solved it at the time. I'm currently running essentially the exact script from that answer, but now during training I'm back to getting all zeros for the validation losses, both KL and reconstruction:
history = var_autoencoder.fit(
    x_train, x_train,
    epochs=1000,
    shuffle=True,
    validation_data=(x_test, x_test),
    callbacks=[kcb.EarlyStopping(monitor="val_loss", patience=30, restore_best_weights=True)],
)
Epoch 1/1000
107/107 [==============================] - 3s 30ms/step - loss: 118.1165 - reconstruction_loss: 117.0647 - kl_loss: 1.0518 - val_loss: 0.0000e+00 - val_reconstruction_loss: 0.0000e+00 - val_kl_loss: 0.0000e+00
Epoch 2/1000
107/107 [==============================] - 3s 30ms/step - loss: 104.4190 - reconstruction_loss: 103.7018 - kl_loss: 0.7172 - val_loss: 0.0000e+00 - val_reconstruction_loss: 0.0000e+00 - val_kl_loss: 0.0000e+00
Epoch 3/1000
107/107 [==============================] - 3s 30ms/step - loss: 103.2905 - reconstruction_loss: 102.5077 - kl_loss: 0.7828 - val_loss: 0.0000e+00 - val_reconstruction_loss: 0.0000e+00 - val_kl_loss: 0.0000e+00
Epoch 4/1000
107/107 [==============================] - 3s 31ms/step - loss: 101.7333 - reconstruction_loss: 100.8803 - kl_loss: 0.8530 - val_loss: 0.0000e+00 - val_reconstruction_loss: 0.0000e+00 - val_kl_loss: 0.0000e+00
The sample dataset and full script can be found here.
In this example from the TensorFlow docs, the metrics are also updated in the test_step function:
# Update the metrics.
for metric in self.metrics:
    if metric.name != "loss":
        metric.update_state(y, y_pred)
Without this update, the evaluation metrics stay at their initialized value of zero.