How can Keras calculate the number of parameters at an early stage when there are still None dimensions?

Sorry for the very basic question (I'm new to Keras). I was wondering how Keras can calculate the number of parameters for each layer at an early stage (before fit) even though model.summary shows dimensions that still have None values at this stage. Are these values already determined in some way, and if so, why not show them in the summary?

I ask because I'm having a hard time figuring out my "tensor shape bug" (I'm trying to determine the output dimensions of the C5 block of my resnet50 model, but I cannot see them in model.summary even though I can see the number of parameters).

Below is an example based on the C5_reduced layer in RetinaNet, which is fed by the C5 layer of ResNet50. The C5_reduced layer is

Conv2D(256,kernel_size=1,strides=1,pad=1)

Based on model.summary for this particular layer:

C5_reduced (Conv2D)    (None, None, None, 256)          524544

I've guessed that C5 is (None,1,1,2048) because 2048*256+256 = 524544 (I don't know how to confirm or refute that hypothesis). So if it's already known, why not show it in the summary? If dimensions 2 and 3 had been different, the number of parameters would have been different too, right?

Solution

If you pass an exact input shape to the very first layer (or the input layer) of your network, you will get the output you want. For instance, I used an input layer here:

input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928

I passed the input shape as (224,224,3); the 3 is the depth (the number of channels). Note that the parameter count of a convolutional layer is calculated differently from that of a Dense layer.

If you do the following:

tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(150, 150, 3))

You will see:

conv2d (Conv2D) ---> (None, 148, 148, 16)

The spatial dimensions are reduced to 148 x 148 because, in Keras, padding defaults to 'valid' and strides defaults to 1. With no padding, each output dimension is (input - kernel) / stride + 1, rounded down, so here (150 - 3) / 1 + 1 = 148.
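The output-size rule can be sketched in plain Python, without Keras. The helper name conv_output_size is hypothetical (it is not a Keras function); it only reproduces the usual 'valid' and 'same' padding arithmetic for one spatial axis.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output size of a convolution along one spatial axis (hypothetical helper)."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # The input is padded so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    raise ValueError(f"unsupported padding: {padding!r}")


print(conv_output_size(150, 3))                  # 148
print(conv_output_size(224, 3, padding="same"))  # 224
```

The second call shows why a layer can keep its 224 x 224 size, as in the summary above, if it uses 'same' padding.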

So then what are None values?

The first None is the batch size. In Keras, the first dimension is always the batch size. You can fix it when you define the model, or leave it open and let it be determined when you fit the model or predict.
For a 2D convolution, the expected input is (batch_size, height, width, channels). You can also have shapes such as (None, None, None, 3), which means that images of varying size are allowed.

Edit:

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, None, 3)),
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu'),
])

Produces:

conv2d_21 (Conv2D)           (None, None, None, 16)    448
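As a rough illustration of why the summary shows None for height and width but a concrete channel count, here is a plain-Python sketch of shape propagation. conv2d_output_shape is a hypothetical helper, not part of Keras, and it assumes 'valid' padding and a square kernel.

```python
def conv2d_output_shape(input_shape, filters, kernel_size, stride=1):
    """Output shape of a 'valid' Conv2D for a (batch, height, width, channels) input.

    Hypothetical helper: unknown (None) dimensions stay None.
    """
    batch, height, width, _channels = input_shape

    def spatial(size):
        # An unknown size cannot be computed yet, so it is passed through as None.
        return None if size is None else (size - kernel_size) // stride + 1

    # The output channel count is always the number of filters.
    return (batch, spatial(height), spatial(width), filters)


print(conv2d_output_shape((None, None, None, 3), filters=16, kernel_size=3))
# (None, None, None, 16)
print(conv2d_output_shape((None, 150, 150, 3), filters=16, kernel_size=3))
# (None, 148, 148, 16)
```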

Regarding your question: how are the parameters calculated even though we passed the image height and width as None?

The parameters of a convolution are calculated as:

(filter_height * filter_width * input_image_channels + 1) * number_of_filters

Plugging in the values:

filter_height = 3
filter_width = 3
input_image_channels = 3
number_of_filters = 16

Parameters = (3 * 3 * 3 + 1) * 16 = 28 * 16 = 448

Notice that the only property of the input image we needed was its number of channels, which is 3 for an RGB image. The height and width do not appear in the formula at all.
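The formula is easy to check in plain Python. conv2d_param_count is a hypothetical helper name, and the use_bias flag is an added assumption that mirrors the "+ 1" bias term in the formula.

```python
def conv2d_param_count(filter_height, filter_width, input_channels,
                       num_filters, use_bias=True):
    """Trainable parameters of a 2D convolution (hypothetical helper)."""
    # Each filter has one weight per kernel position per input channel.
    weights_per_filter = filter_height * filter_width * input_channels
    # Each filter has one bias term, the "+ 1" in the formula.
    bias_per_filter = 1 if use_bias else 0
    return (weights_per_filter + bias_per_filter) * num_filters


print(conv2d_param_count(3, 3, 3, 16))  # 448
```

No image height or width is passed in, which is why the count is available even when those dimensions are None.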

If you want to calculate the parameters for later convolutions, keep in mind that the number of filters in the previous layer becomes the number of input channels of the current layer.
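A small sketch of that chaining, again in plain Python with a hypothetical helper name. It assumes every layer uses a bias, and it assumes 3 x 3 kernels for the block1 layers from the summary above (the summary itself does not state the kernel size).

```python
def chained_conv_param_counts(input_channels, layers):
    """Parameter counts for stacked convolutions (hypothetical helper).

    layers is a list of (kernel_height, kernel_width, num_filters) tuples.
    """
    counts = []
    channels = input_channels
    for kernel_height, kernel_width, num_filters in layers:
        counts.append((kernel_height * kernel_width * channels + 1) * num_filters)
        # The filters of this layer are the input channels of the next one.
        channels = num_filters
    return counts


# block1_conv1 and block1_conv2 on an RGB input (3 x 3 kernels assumed)
print(chained_conv_param_counts(3, [(3, 3, 64), (3, 3, 64)]))  # [1792, 36928]

# A 1 x 1 convolution with 256 filters on a 2048-channel input
print(chained_conv_param_counts(2048, [(1, 1, 256)]))  # [524544]
```

The last result matches the C5_reduced count from the question. If C5 does have 2048 channels, the 1 x 1 in that arithmetic is the kernel size of C5_reduced, so the count says nothing about the height and width of C5.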

This is why the summary can show None for dimensions other than the batch size. In that case Keras only needs to know the number of channels (for example, whether your image is RGB or not). You can leave the height and width unspecified when creating the model, and they will be determined by the dataset you pass when fitting the model.