Consider this TensorFlow Python code that loads a pretrained model:
import tensorflow as tf
from tensorflow import keras

conv_model = keras.applications.vgg16.VGG16(
    weights='imagenet',
    include_top=False)

conv_model.trainable = False
print("Number of trainable weights after freezing:", len(conv_model.trainable_weights))

conv_model.trainable = True
print("Number of trainable weights after unfreezing:", len(conv_model.trainable_weights))
and it printed:
Number of trainable weights after freezing: 0
Number of trainable weights after unfreezing: 26
However, if I do
conv_model.trainable = True
conv_model.summary()
I get:
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
and if I freeze the model I get 0 trainable parameters.
Why is there this discrepancy between model.summary() and the other method?
The length of the weights list doesn't give the total number of parameters. You should use:
from keras.utils.layer_utils import count_params

# count_params takes the whole list of weight tensors
count_params(conv_model.trainable_weights)
#14714688
instead of
len(conv_model.trainable_weights)
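If the layer_utils import fails (count_params has moved around between Keras versions), the same total can be computed with plain TensorFlow ops. A minimal sketch, assuming TF 2.x eager execution:

import tensorflow as tf

# sum the element counts of all trainable weight tensors
sum(int(tf.size(w)) for w in conv_model.trainable_weights)
#14714688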
The length gives the number of weight tensors (the kernels and biases), not the number of scalar parameters. Each tensor can be inspected with:
import numpy as np

# count_params expects a list of weights, so wrap each tensor in one
for p in conv_model.trainable_weights:
    print(p.name, p.shape, np.prod(p.shape), count_params([p]))
# prints 26 rows (13 conv layers, each with a kernel and a bias): name, shape, params, params
block1_conv1/kernel:0 (3, 3, 3, 64) 1728 1728
block1_conv1/bias:0 (64,) 64 64
block1_conv2/kernel:0 (3, 3, 64, 64) 36864 36864
...
block5_conv3/kernel:0 (3, 3, 512, 512) 2359296 2359296
block5_conv3/bias:0 (512,) 512 512
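As a cross-check that avoids the layer_utils import entirely, tf.keras layers and models expose a public count_params() method; note that it counts all parameters, trainable or not (the two coincide here because every VGG16 weight lives in a conv layer):

# per-layer and total parameter counts via the public Keras API
for layer in conv_model.layers:
    if layer.weights:  # skip layers with no weights, e.g. InputLayer and pooling
        print(layer.name, layer.count_params())

print("Total:", conv_model.count_params())
# Total: 14714688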