I want to save and load a CNN model so that I can continue training it. I have developed a model and saved it as an .h5 file. Creating, training, and saving the model on the first run works without problems.
The problem occurs when I load an existing .h5 model and attempt to train it. The following code shows the implementation and the issue.
import os.path
import tensorflow as tf
# Enable eager execution
tf.compat.v1.enable_eager_execution()
... # removed for readability
def train_model(model_name: str, model_handler: TensorModel, visualiser: Visualiser, logger: Logger, x_train, y_train, x_test, y_test, batch_size) -> tuple:
    logger.info(f"Eager enabled: {tf.executing_eagerly()}")
    # Check to see if the model has already been trained
    if USE_EXISTING_MODELS and os.path.exists(f"models/{model_name}.h5"):
        model = tf.keras.models.load_model(f"models/{model_name}.h5")
        (x_train, y_train), (x_test, y_test) = model_handler.load_data()
    else:
        if model_name == "base_model":
            model = model_handler.create_cnn()
        else:
            model = model_handler.create_cnn(batch_normalisation=True)
    history = model.fit(x_train, y_train, epochs=NUM_EPOCHS, batch_size=batch_size, validation_data=(x_test, y_test))
    test_loss, test_acc = model.evaluate(x_test, y_test)
    model.save(f"models/{model_name}.h5")
    logger.info(f"Model accuracy: {test_acc * 100:.2f}%")
    visualiser.plot_training_history(history, model_name)
    return (model, model_name), (test_acc)
The following is the error produced.
  File "/home/arief/development/Machine_Learning/Machine-Learning-Real-Time-Object-Classification/Train.py", line 40, in train_model
    history = model.fit(x_train, y_train, epochs=NUM_EPOCHS, batch_size=batch_size, validation_data=(x_test, y_test))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arief/development/Machine_Learning/Machine-Learning-Real-Time-Object-Classification/.venv/lib/python3.12/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/arief/development/Machine_Learning/Machine-Learning-Real-Time-Object-Classification/.venv/lib/python3.12/site-packages/keras/src/backend/tensorflow/core.py", line 155, in convert_to_numpy
    return np.array(x)
           ^^^^^^^^^^^
NotImplementedError: numpy() is only available when eager execution is enabled.
The log output confirms that eager execution is enabled:
2025-02-21 13:46:42,505 - INFO - Eager enabled: True
What am I missing?
After some rubber-ducking, I realised that the model is loaded but not compiled or built. After implementing the compilation and building, the model can successfully be trained again.
The following are the necessary modifications.
    # Check to see if the model has already been trained
    if USE_EXISTING_MODELS and os.path.exists(f"models/{model_name}.h5"):
        model = tf.keras.models.load_model(f"models/{model_name}.h5")
        # Re-compile and build the loaded model before training it again
        model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
        model.build()
        model.summary()
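If the same load-then-compile step is needed in more than one place, it could be pulled into a small helper. The following is only a minimal sketch: `load_or_create_model` and its parameters `load_fn`, `create_fn`, and `compile_fn` are hypothetical names introduced here, not part of Keras or of the code above, and the sketch assumes that `create_fn` returns a model that is already compiled.

```python
import os
from typing import Any, Callable, Tuple


def load_or_create_model(
    path: str,
    load_fn: Callable[[str], Any],
    create_fn: Callable[[], Any],
    compile_fn: Callable[[Any], None],
) -> Tuple[Any, bool]:
    """Return (model, was_loaded).

    A model restored from disk is passed through compile_fn before it is
    returned, so the caller can go straight to training.
    """
    if os.path.exists(path):
        model = load_fn(path)
        # Re-attach optimizer, loss and metrics to the restored model.
        compile_fn(model)
        return model, True
    # No saved file yet: build a fresh model (assumed to be compiled already).
    return create_fn(), False
```

With the code from this post, `load_fn` would be `tf.keras.models.load_model`, `create_fn` would be `model_handler.create_cnn`, and `compile_fn` would be something like `lambda m: m.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])`. Passing the callables in keeps the helper free of any TensorFlow import, so the branching can be checked with plain stubs.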