I am exporting my TensorFlow model to TFLite. I want to be able to run the TFLite model on a mobile device GPU with different input shapes.
For that, I was using the following code:
import tensorflow as tf

# tf model class
class MyModel(tf.keras.models.Model):
    ...

# Util functions
def build_graph(model, input_shape):
    # Wrap the subclassed model in a functional model with an explicit Input spec
    x = tf.keras.layers.Input(shape=input_shape)
    model = tf.keras.models.Model(inputs=x, outputs=model(x))
    model.compile(optimizer='adam', loss='binary_crossentropy')
    return model

def save_tflite_model(output_model_path, tflite_model):
    with open(output_model_path, 'wb') as f:
        f.write(tflite_model)

def convert_model_from_concrete(model_path, output_model_path, input_shape=(1, None, None, 3)):
    model = tf.saved_model.load(model_path)
    concrete_func = model.signatures[
        tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
    # Constrain the signature's input shape (batch size 1 by default)
    concrete_func.inputs[0].set_shape(input_shape)
    converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
    converter.experimental_new_converter = True
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,
        tf.lite.OpsSet.SELECT_TF_OPS
    ]
    tflite_model = converter.convert()
    # Report which ops are compatible with the GPU delegate
    print(tf.lite.experimental.Analyzer.analyze(model_content=tflite_model, gpu_compatibility=True))
    save_tflite_model(output_model_path, tflite_model)

#Code for exporting my model to TFLite
model = MyModel()
temp_input_shape = (256, 256, 3)
model = build_graph(model, temp_input_shape)
model.save("my_model_256")
convert_model_from_concrete("my_model_256","my_model_256.tflite")
Everything works fine, and the model is exported. I can then locally load the model and run it with any valid shape:
import numpy as np

interpreter = tf.lite.Interpreter("my_model_256.tflite")
custom_shape = [1, 512, 512, 3]
input_details = interpreter.get_input_details()
# Resize the input tensor before allocating buffers
interpreter.resize_tensor_input(input_details[0]['index'], custom_shape)
interpreter.allocate_tensors()
input_data = np.random.rand(*custom_shape).astype(np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
Is that the proper way of exporting TensorFlow models to TFLite and then running them with different shapes? I ask because https://github.com/tensorflow/tensorflow/issues/41807 suggests exporting the TF model with a dynamic input size.
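To make "any valid shape" concrete, here is a minimal sketch of checking a requested shape against a model's input signature before resizing. The helper name `shape_matches_signature` is hypothetical and not part of TensorFlow. It assumes dynamic dimensions are written as `None` (as in the Keras `Input` spec) or `-1` (as in the `shape_signature` entry returned by `get_input_details()`). It only illustrates the idea and is not what the interpreter or the GPU delegate does internally.

```python
from typing import Optional, Sequence


def shape_matches_signature(signature: Sequence[Optional[int]],
                            shape: Sequence[int]) -> bool:
    """Return True if a concrete shape fits a signature with dynamic dims.

    Hypothetical helper: dynamic dimensions are assumed to be None or -1.
    """
    if len(signature) != len(shape):
        return False
    for sig_dim, dim in zip(signature, shape):
        if dim <= 0:
            # A concrete shape must have positive sizes everywhere.
            return False
        if sig_dim is None or sig_dim == -1:
            # Dynamic dimension: any positive size is accepted.
            continue
        if sig_dim != dim:
            return False
    return True


print(shape_matches_signature((1, None, None, 3), (1, 512, 512, 3)))  # True
print(shape_matches_signature((1, -1, -1, 3), (1, 512, 512, 1)))      # False
```

A check like this could run before `resize_tensor_input` to reject a shape with the wrong rank or channel count early.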
I tried the dynamic input size approach too:
#Code for exporting my model to TFLite with dynamic input shape
model = MyModel()
temp_input_shape = (None, None, 3)
model = build_graph(model, temp_input_shape)
model.save("my_model_dynamic")
convert_model_from_concrete("my_model_dynamic","my_model_dynamic.tflite")
With the Python code above, everything again works well.
However, when I use the TFLite benchmark model tool (https://www.tensorflow.org/lite/performance/measurement) with the GPU delegate and a custom input shape:
--use_gpu=true --input_layer=input_8 --input_layer_shape=1, 512, 512, 3
the model created with the dynamic input size approach fails with the following error:
INFO: STARTING!
INFO: Log parameter values verbosely: [0]
INFO: Min num runs: [1]
INFO: Num threads: [8]
INFO: Graph: [./my_model_dynamic.tflite]
INFO: Input layers: [input_1]
INFO: Input shapes: [1, 512, 512, 3]
INFO: #threads used for CPU inference: [8]
INFO: Use gpu: [1]
INFO: Loaded model ./my_model_dynamic.tflite
INFO: Initialized TensorFlow Lite runtime.
INFO: Created TensorFlow Lite delegate for GPU.
INFO: GPU delegate created.
VERBOSE: Replacing 343 out of 343 node(s) with delegate (TfLiteGpuDelegateV2) node, yielding 1 partitions for the whole graph.
ERROR: Failed to allocate device memory (clCreateSubBuffer): Invalid buffer size
ERROR: Falling back to OpenGL
INFO: Initialized OpenGL-based API.
ERROR: TfLiteGpuDelegate Init: Shapes are not equal
INFO: Created 0 GPU delegate kernels.
ERROR: TfLiteGpuDelegate Prepare: delegate is not initialized
ERROR: Node number 343 (TfLiteGpuDelegateV2) failed to prepare.
ERROR: Restored original execution plan after delegate application failure.
ERROR: Failed to apply GPU delegate
The model created with the first approach does not give any error and seems to execute well with different valid shapes. Can anyone explain which approach is actually the supported one? I managed to modify the TFLite benchmark model tool code so that it also works with the dynamic input size approach. However, I am not sure whether these modifications are valid.
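For reproducing the benchmark run with several shapes, the flags can be assembled programmatically. The sketch below uses a hypothetical helper, `benchmark_flags`, that only builds the argument strings and does not run the tool. The flag names `--use_gpu`, `--input_layer` and `--input_layer_shape` are the ones used above, and `--graph` is the tool's model path flag. Writing the shape as a comma-separated list without spaces is a choice made here so that a shell does not split it into separate arguments. The input layer name must match the model's actual input name.

```python
from typing import List, Sequence


def benchmark_flags(graph_path: str,
                    input_layer: str,
                    input_shape: Sequence[int],
                    use_gpu: bool = True) -> List[str]:
    """Build benchmark_model flags for a single-input model (hypothetical helper)."""
    # Join dimensions without spaces, e.g. (1, 512, 512, 3) -> "1,512,512,3".
    shape = ",".join(str(dim) for dim in input_shape)
    return [
        f"--graph={graph_path}",
        f"--use_gpu={'true' if use_gpu else 'false'}",
        f"--input_layer={input_layer}",
        f"--input_layer_shape={shape}",
    ]


flags = benchmark_flags("./my_model_dynamic.tflite", "input_1", (1, 512, 512, 3))
print(" ".join(flags))
```

The resulting list can be appended to the path of the benchmark binary and passed to `subprocess.run` without going through a shell.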
I opened an issue on GitHub, where my problem was resolved.
It was a bug in the TFLite benchmark tool. It was fixed with this commit.