Sorry for the very basic question (I'm new to Keras). I was wondering how Keras can calculate the number of parameters for each layer at an early stage (before fit), even though model.summary shows dimensions that still have None values at that stage. Are these values already determined in some way and, if so, why not show them in the summary?
I ask because I'm having a hard time figuring out my "tensor shape bug" (I'm trying to determine the output dimensions of the C5 block of my resnet50 model, but I cannot see them in model.summary even though I can see the number of parameters).
Below is an example based on the C5_reduced layer in RetinaNet, which is fed by the C5 layer of Resnet50. The C5_reduced layer is
Conv2D(256,kernel_size=1,strides=1,pad=1)
Based on model.summary for this particular layer:
C5_reduced (Conv2D) (None, None, None, 256) 524544
I guessed that C5 is (None,1,1,2048) because 2048*256+256 = 524544 (I don't know how to confirm or refute that hypothesis). So if it's already known, why not show it in the summary? If dimensions 2 and 3 had been different, the number of parameters would have been different too, right?
If you pass the exact input shape to the very first layer (or the input layer) of your network, the summary will show the output shapes you want. For instance, I used an input layer here:
input_1 (InputLayer) [(None, 224, 224, 3)] 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 224, 224, 64) 36928
I passed the input shape as (224,224,3); the 3 is the depth (number of channels). Note that the parameter calculation for convolutional layers differs from the one for Dense layers.
If you do the following:
tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(150, 150, 3))
You will see:
conv2d (Conv2D) ---> (None, 148, 148, 16)
The dimensions are reduced to 148x148 because in Keras padding is 'valid' by default and strides is 1, so the output shape is 148 x 148. (You can search for the formula.)
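The formula mentioned above can be sketched in a few lines. This is only an illustration for one spatial axis, assuming no dilation; conv_output_size is a made-up helper name, not a Keras function.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial axis of a convolution (no dilation).

    Hypothetical helper for illustration; it is not part of Keras.
    """
    if input_size is None:
        # An unknown input size stays unknown, which is what the summary shows as None.
        return None
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # Input is padded so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    raise ValueError("padding must be 'valid' or 'same'")


print(conv_output_size(150, 3))                  # 148
print(conv_output_size(224, 3, padding="same"))  # 224
print(conv_output_size(None, 3))                 # None
```

With padding 'same' and stride 1 the size is unchanged, which matches the 224 x 224 outputs in the summary above.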
So then what are the None values?
Edit:
tf.keras.layers.Input(shape = (None, None, 3)),
tf.keras.layers.Conv2D(16, (3,3), activation='relu')
Produces:
conv2d_21 (Conv2D) (None, None, None, 16) 448
Regarding your question: how are the parameters calculated even though we passed the image height and width as None?
Convolution parameters are calculated according to:
(filter_height * filter_width * input_image_channels + 1) * number_of_filters
When we put the values into the formula,
filter_height = 3
filter_width = 3
input_image_channel = 3
number_of_filters = 16
Parameters = (3 x 3 x 3 + 1) * 16 = 28 * 16 = 448
Notice that we only needed the input image's channel count, which is 3 here, meaning it is an RGB image. The height and width do not appear in the formula at all.
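The same arithmetic as a small function, as a sketch only. conv2d_param_count is a made-up name rather than a Keras API; the point is that its arguments include no image height or width.

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Trainable parameters of a 2D convolution layer.

    Hypothetical helper for illustration; it is not part of Keras.
    """
    # Each filter holds one weight per kernel position and input channel.
    weights = kernel_h * kernel_w * in_channels * filters
    # One bias per filter, if the layer uses a bias.
    biases = filters if use_bias else 0
    return weights + biases


print(conv2d_param_count(3, 3, 3, 16))  # 448
```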
If you want to calculate the params for later convolutions, keep in mind that the number of filters of the previous layer becomes the number of input channels of the current layer.
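Here is a rough sketch of that channel bookkeeping, checked against the numbers in the summaries above. summarize_conv_stack is a made-up helper, it assumes square kernels and layers with a bias, and the 2048 input channels for C5_reduced are the guess from the question, not something read from the model.

```python
def summarize_conv_stack(input_channels, layers):
    """Return (name, param_count) for a chain of conv layers.

    Hypothetical helper for illustration; `layers` is a list of
    (name, kernel_size, filters) tuples with square kernels and biases.
    """
    rows = []
    channels = input_channels
    for name, kernel_size, filters in layers:
        params = (kernel_size * kernel_size * channels + 1) * filters
        rows.append((name, params))
        # The filters of this layer are the input channels of the next one.
        channels = filters
    return rows


vgg_rows = summarize_conv_stack(3, [("block1_conv1", 3, 64), ("block1_conv2", 3, 64)])
print(vgg_rows)  # [('block1_conv1', 1792), ('block1_conv2', 36928)]

# Assumption from the question: C5 has 2048 channels.
c5_rows = summarize_conv_stack(2048, [("C5_reduced", 1, 256)])
print(c5_rows)   # [('C5_reduced', 524544)]
```

The match with 524544 supports the guess that C5 has 2048 channels, but it says nothing about its height and width: any spatial size gives the same parameter count.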
That's how you can end up with None in dimensions other than the batch size. In that case Keras only needs to know the number of channels (for example, whether your image is RGB). You can leave the spatial dimensions unspecified while creating the model and pass them later, when fitting the model with the dataset.