I am working on an image classification problem using a Keras MobileNetV3 pre-trained on ImageNet: the first 90% of the layers are frozen, the remaining 10% are trainable, and a dropout rate of 0.2 is applied. I was wondering how the dropout is handled in the backend.
import tensorflow as tf

base_model = tf.keras.applications.MobileNetV3Small(
    input_shape=(IMG_HEIGHT, IMG_WIDTH, DEPTH),
    alpha=1.0,
    minimalistic=False,
    include_top=False,
    weights='imagenet',
    input_tensor=None,
    pooling='max',
    dropout_rate=0.2)
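(For completeness, the 90/10 split is done by toggling each layer's trainable flag, roughly like the sketch below; the split index and loop are illustrative, not my exact code.)

# Freeze the first 90% of the layers, leave the last 10% trainable.
split = int(len(base_model.layers) * 0.9)
for layer in base_model.layers[:split]:
    layer.trainable = False
for layer in base_model.layers[split:]:
    layer.trainable = True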
If the layer is called with training=False, as it is when you predict, nothing will happen. Let's start with some input:
import tensorflow as tf

rate = 0.4
dropout = tf.keras.layers.Dropout(rate)
x = tf.cast(tf.reshape(tf.range(1, 10), (3, 3)), tf.float32)
x
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 2., 3.],
[4., 5., 6.],
[7., 8., 9.]], dtype=float32)>
Now, let's call the dropout layer in training mode:
dropout(x, training=True)
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[ 0. , 3.3333333, 0. ],
[ 6.6666665, 8.333333 , 0. ],
[11.666666 , 13.333333 , 15. ]], dtype=float32)>
As you can see, roughly a fraction rate of the values are zeroed out, and the surviving values are multiplied by 1/(1 - rate), so that the expected value of each element is unchanged; this is known as inverted dropout (a numerical check is at the end of this answer). Now let's call the layer with training=False:
dropout(x, training=False)
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 2., 3.],
[4., 5., 6.],
[7., 8., 9.]], dtype=float32)>
Nothing happens: the input passes through unchanged. Inside a model you normally don't set this flag yourself; Keras passes training=True when you call fit() and training=False when you call predict() or evaluate().
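To verify the inverted-dropout claim above, you can average many stochastic forward passes; the zeroing and the 1/(1 - rate) scaling cancel in expectation, so the mean converges back to the input (the run count of 10,000 is arbitrary):

import tensorflow as tf

rate = 0.4
dropout = tf.keras.layers.Dropout(rate)
x = tf.cast(tf.reshape(tf.range(1, 10), (3, 3)), tf.float32)

# Average many stochastic passes; the result is approximately x.
runs = tf.stack([dropout(x, training=True) for _ in range(10_000)])
print(tf.reduce_mean(runs, axis=0))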