python numpy tensorflow machine-learning keras

numpy eager execution problem after loading a CNN model


I want to save and load a CNN model for further training. I have developed a model and saved it as an .h5 file. Creating, training, and saving the model on the first run works without problems.

The problem occurs when I load an existing .h5 model and attempt to train it. The following code shows the implementation and the issue.

import os.path
import tensorflow as tf

# Enable eager execution
tf.compat.v1.enable_eager_execution()

... # removed for readability

def train_model(model_name: str, model_handler: TensorModel, visualiser: Visualiser, logger: Logger, x_train, y_train, x_test, y_test, batch_size) -> tuple:
    logger.info(f"Eager enabled: {tf.executing_eagerly()}")

    # Check to see if the model has already been trained
    if USE_EXISTING_MODELS and os.path.exists(f"models/{model_name}.h5"):
        model = tf.keras.models.load_model(f"models/{model_name}.h5")
        (x_train, y_train), (x_test, y_test) = model_handler.load_data()
    else:
        if model_name == "base_model":
            model = model_handler.create_cnn()
        else:
            model = model_handler.create_cnn(batch_normalisation=True)

    history = model.fit(x_train, y_train, epochs=NUM_EPOCHS, batch_size=batch_size, validation_data=(x_test, y_test))
    test_loss, test_acc = model.evaluate(x_test, y_test)
    model.save(f"models/{model_name}.h5")
    
    logger.info(f"Model accuracy: {test_acc * 100:.2f}%")
    visualiser.plot_training_history(history, model_name)
    return (model, model_name), (test_acc)

Training the loaded model raises the following error.

File "/home/arief/development/Machine_Learning/Machine-Learning-Real-Time-Object-Classification/Train.py", line 40, in train_model
    history = model.fit(x_train, y_train, epochs=NUM_EPOCHS, batch_size=batch_size, validation_data=(x_test, y_test))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arief/development/Machine_Learning/Machine-Learning-Real-Time-Object-Classification/.venv/lib/python3.12/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/arief/development/Machine_Learning/Machine-Learning-Real-Time-Object-Classification/.venv/lib/python3.12/site-packages/keras/src/backend/tensorflow/core.py", line 155, in convert_to_numpy
    return np.array(x)
           ^^^^^^^^^^^
NotImplementedError: numpy() is only available when eager execution is enabled.

The log nevertheless shows that eager execution is enabled:

2025-02-21 13:46:42,505 - INFO - Eager enabled: True

What am I missing?


Solution

  • After some rubber-ducking, I realised that the model was being loaded but never compiled or built. Once the loaded model is compiled and built, it can be trained again successfully.

    The necessary modifications are as follows.

    # Check to see if the model has already been trained
    if USE_EXISTING_MODELS and os.path.exists(f"models/{model_name}.h5"):
        model = tf.keras.models.load_model(f"models/{model_name}.h5")
        # Recompile and build the loaded model before resuming training
        model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
        model.build()
        model.summary()
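
The load-then-recompile step above could also be pulled into a small helper, so that both branches of train_model go through one function. This is only a sketch under some assumptions: load_or_create_model, create_fn, load_fn and compile_kwargs are hypothetical names, not part of the original code, and the default compile settings are simply copied from the fix above. When no load_fn is passed, it falls back to tf.keras.models.load_model with compile=False, so the model is compiled explicitly afterwards.

```python
import os
from typing import Any, Callable, Optional, Tuple


def load_or_create_model(
    path: str,
    create_fn: Callable[[], Any],
    load_fn: Optional[Callable[[str], Any]] = None,
    compile_kwargs: Optional[dict] = None,
) -> Tuple[Any, bool]:
    """Return (model, was_loaded); a loaded model is recompiled before use."""
    if compile_kwargs is None:
        # Same settings as in the fix above; adjust them to match your model.
        compile_kwargs = {
            "optimizer": "adam",
            "loss": "categorical_crossentropy",
            "metrics": ["accuracy"],
        }

    if not os.path.exists(path):
        # No saved model yet: create a fresh one (assumed to be compiled
        # already by create_fn, as with create_cnn in the question).
        return create_fn(), False

    if load_fn is None:
        # Imported lazily so the helper can be exercised without TensorFlow.
        import tensorflow as tf

        # compile=False skips restoring the saved training configuration;
        # the model is compiled explicitly below instead.
        load_fn = lambda p: tf.keras.models.load_model(p, compile=False)

    model = load_fn(path)
    model.compile(**compile_kwargs)
    return model, True
```

In train_model this might be called as `model, was_loaded = load_or_create_model(f"models/{model_name}.h5", model_handler.create_cnn)`, with was_loaded deciding whether to reload the data. Note that recompiling with a fresh "adam" optimizer presumably starts with new optimizer state rather than the state from the earlier run.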