tensorflow, keras, tf.keras

Quantizing a `Sequential` TensorFlow model throws `ValueError`


I have the following TensorFlow model that I want to quantize:

model = Sequential([
    Input(shape=input_shape),
    LSTM(lstm_units_1, return_sequences=True),
    Dropout(dropout_rate),
    LSTM(lstm_units_2, return_sequences=False),
    Dropout(dropout_rate),
    Dense(4, activation='softmax')
])

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
model_checkpoint = ModelCheckpoint(model_path, monitor='val_loss', save_best_only=True, save_weights_only=False, mode='min')

history = model.fit(X_train, y_train, 
                    epochs=epochs, 
                    batch_size=batch_size, 
                    validation_split=0.2, 
                    callbacks=[early_stopping],
                    verbose=1)

model.save(model_path)

I am trying to perform the quantization like this:

import tensorflow_model_optimization as tfmot

annotated_model = tfmot.quantization.keras.quantize_annotate_model(model)

with tfmot.quantization.keras.quantize_scope():
    quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)

quant_aware_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

But I am receiving this error:

ValueError: `to_annotate` can only be a `keras.Model` instance. Use the `quantize_annotate_layer` API to handle individual layers. You passed an instance of type: Sequential.

Quantizing each layer individually, as the error suggests, did not work either: it raised another `ValueError` saying that LSTM layers are not accepted as inputs.

annotated_model = tf.keras.Sequential([
    tfmot.quantization.keras.quantize_annotate_layer(layer)
    for layer in model.layers
])

What is the correct way to quantize the particular model I am using here?


Solution

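  • Build the model with a single API: either pure TensorFlow or Keras, but not a mix of the two, since mixing them can lead to numerous errors. The error message hints at this: a `Sequential` model is normally a `keras.Model`, so the type check most likely fails because the model was created with a different Keras package than the one the quantization library uses. I recommend using Keras for everything.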

    Example in Keras (a small dense model, followed by post-training float16 quantization):

    
    import keras
    from keras import layers
    import numpy as np
    
    # Define Sequential model with 3 layers
    model = keras.Sequential(
        [
            layers.Dense(2, activation="relu", name="layer1"),
            layers.Dense(3, activation="relu", name="layer2"),
            layers.Dense(4, name="layer3"),
        ]
    )
    
    # Compile the model
    model.compile(
        optimizer='adam',
        loss='mean_squared_error',
        metrics=['mae']  # accuracy is not meaningful for these random regression targets
    )
    
    # Example data for training
    x_train = np.random.random((100, 3)).astype('float32')
    y_train = np.random.random((100, 4)).astype('float32')
    
    # Train the model
    model.fit(x_train, y_train, epochs=10, batch_size=32)
    
    # Post-training quantization: convert the model to TensorFlow Lite with float16 weights
    import tensorflow as tf
    
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # Optimize.DEFAULT turns on quantization; listing tf.float16 selects float16 weight storage
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    tflite_model = converter.convert()
    
    # Save the quantized model
    with open("model_quantized_f16.tflite", "wb") as f:
        f.write(tflite_model)
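
To give a feel for what float16 quantization does to the weights, here is a minimal NumPy sketch. It is only an illustration of the numeric effect, not what the converter does internally (the converter also rewrites the graph and the file format). The weight matrix is a hypothetical stand-in, not taken from the model above.

```python
import numpy as np

# Hypothetical stand-in for one layer's trained weights (not from the model above).
rng = np.random.default_rng(0)
weights_f32 = rng.normal(loc=0.0, scale=0.5, size=(64, 32)).astype(np.float32)

# Store the weights as float16, then read them back as float32 for computation.
weights_f16 = weights_f32.astype(np.float16)
restored_f32 = weights_f16.astype(np.float32)

# Compare storage size and the rounding error introduced by the narrower type.
size_ratio = weights_f16.nbytes / weights_f32.nbytes
max_abs_error = float(np.max(np.abs(weights_f32 - restored_f32)))

print(f"float32 size: {weights_f32.nbytes} bytes")
print(f"float16 size: {weights_f16.nbytes} bytes (ratio {size_ratio})")
print(f"largest rounding error: {max_abs_error:.6f}")
```

The stored weights take half the space, at the cost of a small rounding error per weight. Whether that error matters for a given model is best checked by evaluating the converted model on held-out data.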