I am looking into running image segmentation with the pre-trained DeepLab xception65_coco_voc_trainval model.
The frozen model is ~161 MB, and after converting it to TFLite it is still ~160 MB. Running the converted model on my PC's CPU takes ~25 seconds.
Is that expected, or is there something I can do better?
The conversion to TFLite is as follows:
tflite_convert \
--graph_def_file="deeplabv3_pascal_trainval/frozen_inference_graph.pb" \
--output_file="deeplab_xception_pascal.tflite" \
--output_format=TFLITE \
--input_shape=1,513,513,3 \
--input_arrays="sub_7" \
--output_arrays="ArgMax" \
--inference_type=FLOAT \
--allow_custom_ops
Thanks!
According to the model zoo (https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md), xception65_coco_voc_trainval with 3 eval scales takes about 223 seconds. The frozen graph uses a single eval scale, so ~25 seconds sounds about right to me.
To speed up TFLite inference I would normally suggest the GPU delegate, but since you are running on a PC, you will need to find a smaller model instead. Maybe try one of the MobileNet-based models? The EdgeTPU models will also run in TFLite without an Edge TPU and should be quite fast, although they are trained on Cityscapes.
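If you try a smaller model, it helps to time both models the same way before comparing them. Below is a minimal sketch of such a timing harness, not a definitive benchmark. The names time_inference, summarize and fake_invoke are hypothetical helpers made up for this example, and the sleep call only stands in for a real inference call.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_inference(run_once: Callable[[], object],
                   warmup: int = 1,
                   runs: int = 5) -> List[float]:
    """Call run_once `warmup` times untimed, then return the
    wall-clock seconds of each of `runs` timed calls."""
    for _ in range(warmup):
        run_once()  # the first call often pays one-off setup costs
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings: List[float]) -> Dict[str, float]:
    """Reduce a list of timings to median, fastest and slowest run."""
    return {
        "median_s": statistics.median(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }


if __name__ == "__main__":
    # Hypothetical stand-in workload; replace it with a function
    # that runs one inference on your model.
    def fake_invoke() -> None:
        time.sleep(0.01)

    print(summarize(time_inference(fake_invoke)))
```

With TensorFlow installed, run_once could be a small function that calls invoke() on a tf.lite.Interpreter created with model_path pointing at your .tflite file, after allocate_tensors() and set_tensor() have been called once. Recent TensorFlow versions also accept a num_threads argument on the interpreter, which may be worth varying when running on a desktop CPU.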