tensorflow keras parameters resnet

How can Keras calculate the number of parameters at an early stage when there are still None dimensions?


Sorry for the very basic question (I'm new to Keras). I was wondering how Keras can calculate the number of parameters of each layer at an early stage (before fit), even though model.summary shows dimensions that still have None values at this stage. Are these values already determined in some way and, if so, why not show them in the summary?

I ask because I'm having a hard time figuring out my "tensor shape bug" (I'm trying to determine the output dimensions of the C5 block of my ResNet50 model, but I cannot see them in model.summary even though I can see the number of parameters).

Below is an example based on the C5_reduced layer in RetinaNet, which is fed by the C5 layer of ResNet50. C5_reduced is defined as:

Conv2D(256,kernel_size=1,strides=1,pad=1)

Based on model.summary for this particular layer:

C5_reduced (Conv2D)    (None, None, None, 256)          524544 

My guess is that C5 is (None,1,1,2048) because 2048*256+256 = 524544 (I don't know how to confirm or refute that hypothesis). So if it's already known, why not show it in the summary? If dimensions 2 and 3 had been different, the number of parameters would have been different too, right?


Solution

  • If you pass an exact input shape to the first layer (or the input layer) of your network, the summary will show the output shapes you are looking for. For instance, I used an input layer here:

    input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
    _________________________________________________________________
    block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
    _________________________________________________________________
    block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
    

    The input was passed as (224, 224, 3), where 3 is the depth (the number of channels). Note that parameters are calculated differently for convolutional layers than for Dense layers.

    If you do the following:

    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(150, 150, 3))
    

    You will see:

    conv2d (Conv2D) ---> (None, 148, 148, 16)    
    

    The dimensions are reduced to 148 x 148 because in Keras the padding is 'valid' by default and the stride is 1. With valid padding the output size along each axis is (input_size - kernel_size) / stride + 1, rounded down, so here (150 - 3) / 1 + 1 = 148.

    So then what are the None values?
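
The output-size rule for a convolution can be sketched in plain Python. The helper name `conv_output_size` is hypothetical (it is not a Keras function), and the sketch assumes no dilation and the usual meaning of 'valid' and 'same' padding.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial axis of a convolution (no dilation)."""
    if padding == "valid":
        # No padding: the kernel has to fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # The input is padded so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    raise ValueError(f"unknown padding: {padding!r}")


# The Conv2D(16, (3,3)) example with a 150 x 150 input and default padding.
print(conv_output_size(150, 3))                  # 148
# With 'same' padding a 3 x 3 kernel leaves a 224 x 224 input unchanged.
print(conv_output_size(224, 3, padding="same"))  # 224
```

Note that this rule needs the input height and width, whereas the parameter count does not.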

    Edit:

    tf.keras.layers.Input(shape = (None, None, 3)),
    tf.keras.layers.Conv2D(16, (3,3), activation='relu')
    

    Produces:

    conv2d_21 (Conv2D)           (None, None, None, 16)    448       
    

    Regarding your question: how are the parameters calculated even when we pass the image height and width as None?

    Convolution parameters are calculated according to:

    (filter_height * filter_width * input_image_channels + 1) * number_of_filters
    

    When we put the values into the formula,

    filter_height = 3
    filter_width = 3
    input_image_channels = 3
    number_of_filters = 16
    

    Parameters = (3 x 3 x 3 + 1) * 16 = 28 * 16 = 448

    Notice that we only needed the input image's channel count, which is 3 here because it is an RGB image. The height and width do not appear in the formula at all.

    If you want to calculate the parameters for later convolutions, keep in mind that the number of filters of the previous layer becomes the number of input channels of the current layer.
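
As a minimal sketch, the formula can be written as a small Python function and checked against the summaries shown earlier. The function name `conv2d_param_count` is hypothetical, and the 3 x 3 kernel size for block1_conv1 and block1_conv2 is an assumption that happens to reproduce the counts in the summary.

```python
def conv2d_param_count(kernel_height, kernel_width, in_channels, filters, use_bias=True):
    """Trainable parameters of a 2D convolution; height and width of the input are not needed."""
    weights = kernel_height * kernel_width * in_channels * filters
    biases = filters if use_bias else 0  # one bias per filter
    return weights + biases


# Conv2D(16, (3,3)) on a 3-channel input.
print(conv2d_param_count(3, 3, 3, 16))   # 448
# block1_conv1: 64 filters on the 3-channel input (3 x 3 kernel assumed).
print(conv2d_param_count(3, 3, 3, 64))   # 1792
# block1_conv2: its input channels are the 64 filters of block1_conv1.
print(conv2d_param_count(3, 3, 64, 64))  # 36928
```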

    That is how you can end up with None for the height and width, and not only for the batch size: Keras only needs to know the number of channels (for example, whether your image is RGB or not) to create the weights. In that case you do not specify the spatial dimensions while creating the model; they come from the dataset you pass when fitting the model.
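
Applied to the C5_reduced layer from the question, the same formula can be inverted to recover the number of input channels from the parameter count. This is only a sketch: `infer_in_channels` is a hypothetical helper, and it assumes the layer has a bias term and the 1 x 1 kernel given in the question.

```python
def infer_in_channels(param_count, filters, kernel_height, kernel_width, use_bias=True):
    """Invert (kh * kw * in_channels + 1) * filters = param_count for in_channels."""
    per_filter, remainder = divmod(param_count, filters)
    if remainder:
        raise ValueError("param_count is not a multiple of filters")
    if use_bias:
        per_filter -= 1  # remove the single bias of each filter
    in_channels, remainder = divmod(per_filter, kernel_height * kernel_width)
    if remainder:
        raise ValueError("param_count does not match this kernel size")
    return in_channels


# C5_reduced: 256 filters, 1 x 1 kernel, 524544 parameters in model.summary.
print(infer_in_channels(524544, 256, 1, 1))  # 2048
```

So the count only tells you that C5 has 2048 channels. It is equally consistent with (None, None, None, 2048) and says nothing about the height and width, which is why a guess such as (None,1,1,2048) cannot be confirmed from the parameter count alone.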