Recently I trained two MLP models and saved their weights for future work. I wrote one module that loads the models and another module that uses them.
The loading module contains this code to build the models:
import logging

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout, LayerNormalization

logger = logging.getLogger(__name__)


def creat_model_extractor(model_path, feature_count):
    """
    This function creates the model and sets its weights.
    :param model_path: path of the weights file
    :param feature_count: number of nodes in the input layer
    """
    try:
        tf.keras.backend.clear_session()
        node_list = [1024, 512, 256, 128, 64, 32]
        model = Sequential()
        model.add(Input(shape=(feature_count,)))
        for node in node_list:
            model.add(Dense(node, activation='relu'))
            model.add(Dropout(0.2))
            model.add(LayerNormalization())
        model.add(Dense(16, activation='relu'))
        model.add(LayerNormalization())
        model.add(Dense(1, activation='sigmoid'))

        @tf.function
        def inference_step(inputs):
            return tf.stop_gradient(model(inputs, training=False))

        model.inference_step = inference_step

        model.load_weights(model_path)
        model.trainable = False
        for layer in model.layers:
            layer.trainable = False
    except Exception as error:
        logger.warning(error, exc_info=True)
        return None
    return model
And this is the prediction code:
SMALL_MODEL = creat_model_extractor(MODEL_PATH_SMALL, small_blocks_count)
labels = (SMALL_MODEL.inference_step(small_blocks_normal) > 0.5).numpy().astype(int)
Problem: after predicting labels, the size of 'SMALL_MODEL' changes. It keeps growing, and after a while the RAM fills up. What should I do to prevent the RAM from filling up?
This happens even when everything runs in a single module.
Every time you call model.inference_step(), TensorFlow creates a new computation graph, because your @tf.function is dynamically bound inside your model object. TensorFlow tries to trace and cache the @tf.function, but it cannot reuse the existing trace properly, because model.inference_step is reattached dynamically and is not handled like a standard attribute.
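If you want to confirm that retracing is the cause, note that a Python-side print inside a @tf.function only executes while the function is being traced, not on ordinary calls, so repeated messages mean repeated graph building. A minimal illustrative sketch (not from your code):

import numpy as np
import tensorflow as tf

@tf.function
def traced(x):
    # This Python print runs only during tracing, not on cached calls.
    print("tracing a new graph")
    return tf.reduce_sum(x)

traced(np.ones((2, 3)))   # prints: a first graph is traced
traced(np.ones((2, 3)))   # silent: the cached trace is reused
traced(np.ones((4, 3)))   # prints again: a new input shape forces a new trace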
You should not dynamically attach inference_step inside the function. Instead, move the @tf.function outside the model and use the model directly for prediction:
@tf.function
def inference(model, inputs):
    return tf.stop_gradient(model(inputs, training=False))
and call:
SMALL_MODEL = creat_model_extractor(MODEL_PATH_SMALL, small_blocks_count)
predictions = inference(SMALL_MODEL, small_blocks_normal)
labels = (predictions > 0.5).numpy().astype(int)
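If your batches vary in size and you still see extra traces, another option (a sketch using the standard tf.function input_signature argument, not something from your original code) is to pin the input signature so that only a single graph is ever built:

import tensorflow as tf

# Hypothetical sketch: one trace is kept for any batch size, assuming each row
# has small_blocks_count float32 features.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, small_blocks_count], dtype=tf.float32)])
def inference_fixed(inputs):
    return tf.stop_gradient(SMALL_MODEL(inputs, training=False))

inputs = tf.cast(small_blocks_normal, tf.float32)  # match the declared dtype
labels = (inference_fixed(inputs) > 0.5).numpy().astype(int)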