
Quantizing Batch Normalization in TensorFlow 1.x: layer "does not have MinMax information"


A layer (....) which is an input to the Conv operator producing the output array model/re_lu_1/Relu, is lacking min/max data, which is necessary for quantization. If accuracy matters, either target a non-quantized output format, or run quantized training with your model from a floating point checkpoint to change the input graph to contain min/max information. If you don't care about accuracy, you can pass --default_ranges_min= and --default_ranges_max= for easy experimentation.


Solution

  • In TensorFlow 1.x, quantizing a model requires inserting fake quantization nodes into the graph so that min/max ranges are recorded. Quantization has three phases:

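    1. Training: build the model in a graph => create the training graph with tf.contrib.quantize.create_training_graph => train and save the weights to a checkpoint (ckpt)
    2. Eval: build the model in a fresh graph without weights => create the eval graph with tf.contrib.quantize.create_eval_graph => restore the weights from the checkpoint => export a frozen model
    3. Conversion: use TOCO / the TFLite converter to turn the frozen model into a quantized model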
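
The three phases map onto a handful of TensorFlow 1.x calls. Below is a minimal sketch, assuming TensorFlow 1.15 (tf.contrib was removed in 2.x). The helper names, `pb_path`, `input_name`, `output_name` and the mean/std defaults are placeholders for illustration and are not from the original answer. Building, training and freezing the model itself are left out.

```python
def add_training_fake_quant(graph, quant_delay=0):
    """Phase 1: call after the forward pass is built and before optimizer.minimize()."""
    # Imported inside the function so the helpers can be defined without
    # TensorFlow loaded; assumes TensorFlow 1.x, where tf.contrib exists.
    import tensorflow as tf
    # Rewrites the graph in place: inserts FakeQuant nodes and folds batch norms.
    tf.contrib.quantize.create_training_graph(input_graph=graph, quant_delay=quant_delay)


def add_eval_fake_quant(graph):
    """Phase 2: call on a freshly built inference graph, before restoring the checkpoint."""
    import tensorflow as tf
    tf.contrib.quantize.create_eval_graph(input_graph=graph)


def convert_frozen_graph(pb_path, input_name, output_name, mean=0.0, std=1.0):
    """Phase 3: convert the frozen eval graph into a uint8 TFLite flatbuffer."""
    import tensorflow as tf
    converter = tf.lite.TFLiteConverter.from_frozen_graph(
        pb_path, [input_name], [output_name])
    converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
    # (mean, std) describe how real input values map onto uint8 inputs;
    # the defaults here are placeholders and must match your preprocessing.
    converter.quantized_input_stats = {input_name: (mean, std)}
    return converter.convert()
```

The bytes returned by `convert_frozen_graph` can be written to a `.tflite` file. If the converter still reports missing min/max data at this point, some tensor in the eval graph did not receive a fake quantization node, which is what the layer configuration discussed next is meant to address.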

    However, the most important factor is how batch normalization is configured in the model. After trying multiple configurations, the one that worked best was tensorflow.keras.layers.BatchNormalization with fused=False. The reason is that TensorFlow avoids quantizing the result of batch-norm folding, so an activation placed after the batch norm will not work. Details [here][1].
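
For readers unfamiliar with the term, "folding" usually means merging the batch-norm scale and shift into the weights and bias of the linear operation directly before it. Below is a minimal numeric sketch with a single scalar weight standing in for a convolution kernel. All values are made up, and the function names are illustrative rather than TensorFlow APIs.

```python
import math


def batch_norm(y, gamma, beta, mean, var, eps=1e-3):
    """Inference-time batch norm applied to a pre-activation value y."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fold_batch_norm(w, b, gamma, beta, mean, var, eps=1e-3):
    """Return (w_folded, b_folded) so that w_folded * x + b_folded == BN(w * x + b)."""
    scale = gamma / math.sqrt(var + eps)
    return w * scale, (b - mean) * scale + beta


# Made-up parameters for one output channel.
w, b = 0.5, 0.1
gamma, beta, mean, var = 1.5, -0.2, 0.3, 4.0
x = 2.0

unfolded = batch_norm(w * x + b, gamma, beta, mean, var)
w_f, b_f = fold_batch_norm(w, b, gamma, beta, mean, var)
folded = w_f * x + b_f

# The two paths agree up to floating-point rounding.
print(math.isclose(unfolded, folded))
```

After folding, only the folded weights and the final output need quantization ranges, which is why the rewrite treats the convolution and its batch norm as one unit.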

    In short, the batch-norm layer should be placed only directly after a tensorflow.keras.layers.Conv2D whose activation is passed through its activation parameter, which should be one of Relu/Relu6/Identity.

    If you follow the ordering above, Conv2D => Activation => BatchNorm, the layer will no longer yield the "does not have MinMax information" error.
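
The recommended ordering can be sketched as a small helper. This assumes TensorFlow 1.x with tensorflow.keras available. The function name `conv_act_bn` and its defaults (3x3 kernel, "same" padding) are illustrative choices, not part of the original answer.

```python
def conv_act_bn(x, filters, kernel_size=3, activation="relu"):
    """Conv2D with its activation passed as a parameter, then a non-fused BatchNormalization."""
    # Imported inside the function so the helper can be defined without
    # TensorFlow loaded; assumes TensorFlow 1.x.
    import tensorflow as tf
    # The activation is given to Conv2D itself rather than added as a separate layer.
    x = tf.keras.layers.Conv2D(
        filters, kernel_size, padding="same", activation=activation)(x)
    # fused=False selects the non-fused batch-norm implementation.
    return tf.keras.layers.BatchNormalization(fused=False)(x)
```

For Relu6 you could pass `activation=tf.nn.relu6`, and `activation=None` leaves the convolution output unchanged, which is presumably what Identity refers to above.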