python tensorflow keras tensorflow-lite quantization-aware-training

ValueError: Quantizing a tf.keras Model inside another tf.keras Model is not supported

I've just started with Keras/TensorFlow, and I am trying to retrain a MobileNetV2 and quantize it to int8, but I am getting this error:

ValueError: Quantizing a tf.keras Model inside another tf.keras Model is not supported.

I was following this guide for the quantization steps, but I am not sure what I am doing differently.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

IMG_SHAPE = (224, 224, 3)
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False
model = tf.keras.Sequential([
  base_model,
  tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'),
  tf.keras.layers.Dropout(0.5),
  tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(units=2, activation='softmax')
])

quantize_model = tfmot.quantization.keras.quantize_model
q_aware_model = quantize_model(model)

Stack Trace:

ValueError                                Traceback (most recent call last)

<ipython-input-34-b724ad4872a5> in <module>()
      9 
     10 quantize_model = tfmot.quantization.keras.quantize_model
---> 11 q_aware_model = quantize_model(model)

4 frames

/usr/local/lib/python3.7/dist-packages/tensorflow_model_optimization/python/core/quantization/keras/quantize.py in _add_quant_wrapper(layer)
    217     if isinstance(layer, tf.keras.Model):
    218       raise ValueError(
--> 219           'Quantizing a tf.keras Model inside another tf.keras Model is not supported.'
    220       )
    221

Solution

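When you pass base_model to Sequential, it is added as a single nested layer, and (as the stack trace shows) quantize_model rejects any layer that is itself a tf.keras.Model. To have MobileNetV2's layers appear individually in the new model, build it with the Functional API instead of the Sequential API: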

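import tensorflow as tf
import tensorflow_model_optimization as tfmot

IMG_SHAPE = (224, 224, 3)
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False  # freeze the pretrained backbone

# Chain the new head onto base_model.output so the backbone's layers
# become part of one flat graph instead of a single nested Model layer.
x = tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu')(base_model.output)
x = tf.keras.layers.Dropout(0.5)(x)
x = tf.keras.layers.MaxPool2D(pool_size=(2, 2))(x)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(units=2, activation='softmax')(x)

# Build the model from the backbone's input to the new head's output.
model = tf.keras.Model(base_model.input, x)
model.summary()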

Notice that model.summary() now lists every layer individually, including those from base_model. You can then apply:

quantize_model = tfmot.quantization.keras.quantize_model
q_aware_model = quantize_model(model)
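
If you want to check a model before calling quantize_model, the stack trace suggests the failing condition is simply isinstance(layer, tf.keras.Model) on one of the model's layers. The helper below is a minimal sketch of that check. The name find_nested_models is hypothetical (it is not part of tfmot), and it only inspects the top-level model.layers, which is an assumption about what is enough for this case:

```python
def find_nested_models(model, model_cls):
    """Return the names of top-level layers that are themselves models.

    Hypothetical helper, not part of tfmot. `model` is any object with a
    `.layers` list (e.g. a tf.keras model); `model_cls` is the class to
    test against (pass tf.keras.Model for the case in this question).
    """
    return [layer.name for layer in model.layers
            if isinstance(layer, model_cls)]


# Intended usage (requires TensorFlow, so it is left as a comment):
#
#   import tensorflow as tf
#   nested = find_nested_models(model, tf.keras.Model)
#   if nested:
#       print("Nested models that would block quantization:", nested)
```

For the Sequential version in the question this should report the MobileNetV2 base model, and for the Functional version it should return an empty list, since the base model's layers are then listed directly.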