python · tensorflow · tensorflow-lite · tinyml

Load TensorFlow Lite models in Python


I'm working on a TinyML project using TensorFlow Lite with both quantized and float models. In my pipeline, I train a model with the tf.keras API, convert it to a TFLite model, and finally quantize the TFLite model to int8.
I can save and load the "normal" TensorFlow model with model.save and tf.keras.models.load_model.

Is it possible to do the same with the converted TFLite models? Going through the quantization process every time is quite time-consuming.


Solution

  • You can use the TFLite interpreter (tf.lite.Interpreter) to run inference on TFLite models directly, for example in a notebook.

    Here is an example for an image-classification model. Say we have a TFLite model file:

    tflite_model_file = 'converted_model.tflite'
    

    Then we can load and test it like this:

    import tensorflow as tf
    from tqdm import tqdm
    
    # Load the TFLite model and allocate tensors.
    with open(tflite_model_file, 'rb') as fid:
        tflite_model = fid.read()
    
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    
    # Gather results for ten sampled test batches; `test_batches` is a
    # tf.data.Dataset yielding (image, label) pairs with batch size 1.
    predictions = []
    
    test_labels, test_imgs = [], []
    for img, label in tqdm(test_batches.take(10)):
        interpreter.set_tensor(input_index, img)
        interpreter.invoke()
        predictions.append(interpreter.get_tensor(output_index))
        
        test_labels.append(label.numpy()[0])
        test_imgs.append(img)
    

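    Since the question involves fully int8-quantized models, note that such models expect int8 input tensors, so float images have to be mapped through the model's quantization parameters first. Here is a minimal sketch of that mapping; the scale and zero point below are illustrative only, and in practice you would read the real values from `interpreter.get_input_details()[0]['quantization']`:

```python
import numpy as np

# Illustrative quantization parameters -- in practice, read them from
# interpreter.get_input_details()[0]['quantization'] and
# interpreter.get_output_details()[0]['quantization'].
scale, zero_point = 0.1, -128

def quantize(float_img, scale, zero_point):
    # Map float values onto the int8 grid used by the quantized model.
    q = np.round(float_img / scale + zero_point)
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(int8_out, scale, zero_point):
    # Map int8 model outputs back to float.
    return (int8_out.astype(np.float32) - zero_point) * scale

q = quantize(np.array([0.0, 1.0], dtype=np.float32), scale, zero_point)
# q lands on the int8 grid as [-128, -118]; dequantize(q, ...) recovers [0.0, 1.0]
```

    Call `quantize` on the image before `interpreter.set_tensor`, and `dequantize` on the result of `interpreter.get_tensor`, when the input/output dtype is np.int8.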
    Note that you can only run inference with TFLite models. You cannot inspect or modify the architecture and layers the way you can after reloading a Keras model. If you want to change the architecture, save the Keras model, iterate on it until the results are satisfactory, and only then convert it to TFLite.
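    As for avoiding the repeated quantization step: the converted model is just a byte string, so you can write it to disk once and reload the file in later sessions without converting again. A sketch of that round trip, using a tiny stand-in model (substitute your trained tf.keras model; this example uses dynamic-range quantization, while full-int8 conversion would additionally need a representative dataset):

```python
import tensorflow as tf

# Minimal stand-in model -- replace with your trained tf.keras model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1),
])

# One-off: convert (and quantize) the Keras model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # dynamic-range quantization
tflite_quant_model = converter.convert()

# Persist the serialized FlatBuffer so quantization runs only once.
with open('converted_model.tflite', 'wb') as f:
    f.write(tflite_quant_model)

# In later sessions, skip conversion entirely and load the saved file.
interpreter = tf.lite.Interpreter(model_path='converted_model.tflite')
interpreter.allocate_tensors()
```

    This is the TFLite analogue of model.save / load_model: the expensive convert-and-quantize step happens once, and every later run only reads the .tflite file.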