
How to calculate the number of parameters for a convolutional neural network?


I'm using Lasagne to create a CNN for the MNIST dataset. I'm closely following this example: Convolutional Neural Networks and Feature Extraction with Python.

The CNN architecture I have at the moment, which doesn't include any dropout layers, is:

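import lasagne
from lasagne import layers
from lasagne.updates import momentum
from nolearn.lasagne import NeuralNet

NeuralNet(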
    layers=[('input', layers.InputLayer),        # Input Layer
            ('conv2d1', layers.Conv2DLayer),     # Convolutional Layer
            ('maxpool1', layers.MaxPool2DLayer), # 2D Max Pooling Layer
            ('conv2d2', layers.Conv2DLayer),     # Convolutional Layer
            ('maxpool2', layers.MaxPool2DLayer), # 2D Max Pooling Layer
            ('dense', layers.DenseLayer),        # Fully connected layer
            ('output', layers.DenseLayer),       # Output Layer
            ],
    # input layer
    input_shape=(None, 1, 28, 28),

    # layer conv2d1
    conv2d1_num_filters=32,
    conv2d1_filter_size=(5, 5),
    conv2d1_nonlinearity=lasagne.nonlinearities.rectify,

    # layer maxpool1
    maxpool1_pool_size=(2, 2),

    # layer conv2d2
    conv2d2_num_filters=32,
    conv2d2_filter_size=(3, 3),
    conv2d2_nonlinearity=lasagne.nonlinearities.rectify,

    # layer maxpool2
    maxpool2_pool_size=(2, 2),


    # Fully Connected Layer
    dense_num_units=256,
    dense_nonlinearity=lasagne.nonlinearities.rectify,

    # output layer
    output_nonlinearity=lasagne.nonlinearities.softmax,
    output_num_units=10,

    # optimization method params
    update=momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,
    max_epochs=10,
    verbose=1,
    )

This outputs the following Layer Information:

  #  name      size
---  --------  --------
  0  input     1x28x28
  1  conv2d1   32x24x24
  2  maxpool1  32x12x12
  3  conv2d2   32x10x10
  4  maxpool2  32x5x5
  5  dense     256
  6  output    10

and reports the number of learnable parameters as 217,706.

How is this number calculated? I've read a number of resources, including this StackOverflow question, but none clearly generalizes the calculation.

If possible, can the calculation of the learnable parameters per layer be generalised?

For example, convolutional layer: number of filters x filter width x filter height.


Solution

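  • Let's first look at how the number of learnable parameters is calculated for each type of layer you have, and then calculate the number of parameters in your example.

      • Input layer: it only reads the input image, so it has no learnable parameters.
      • Convolutional layers: a layer that takes l feature maps as input and produces k feature maps with filters of size n x m has n*m*l weights plus one bias per output feature map, i.e. (n*m*l+1)*k parameters.
      • Pooling layers: they only replace each neighbourhood with its maximum, so they have no learnable parameters.
      • Fully-connected layers: each of the n inputs has a weight to each of the m outputs, and each output has a bias, i.e. (n+1)*m parameters.
      • Output layer: a normal fully-connected layer, so again (n+1)*m parameters.

    The final difficulty is the first fully-connected layer: we do not know the dimensionality of its input, because that input comes from the convolutional and pooling layers before it. To calculate it, we have to start with the size of the input image and work out the output size of each convolutional and pooling layer in turn. In your case, Lasagne already calculates this for you and reports the sizes, which makes it easy for us. If you have to calculate the size of each layer yourself, it's a bit more complicated: with no padding and a stride of 1, as in your network, a convolution with filter size f shrinks each spatial dimension from n to n-(f-1), and a 2x2 max pooling halves it.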
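
As a rough sketch of that size calculation, the following plain-Python helpers trace the spatial size through the network. The function names are made up for this illustration and are not part of Lasagne; the sketch assumes no padding, a stride of 1 for the convolutions, and non-overlapping pooling that drops any leftover border, which matches the sizes Lasagne reports above.

```python
def conv_output_size(size, filter_size, stride=1, pad=0):
    """Spatial size after a convolution (hypothetical helper, not a Lasagne API)."""
    return (size + 2 * pad - filter_size) // stride + 1


def pool_output_size(size, pool_size):
    """Spatial size after non-overlapping pooling; a leftover border is dropped."""
    return size // pool_size


# Trace the 28x28 MNIST input through the layers from the question.
size = 28
trace = {}
size = conv_output_size(size, 5)   # conv2d1: 5x5 filters
trace["conv2d1"] = size
size = pool_output_size(size, 2)   # maxpool1: 2x2 pooling
trace["maxpool1"] = size
size = conv_output_size(size, 3)   # conv2d2: 3x3 filters
trace["conv2d2"] = size
size = pool_output_size(size, 2)   # maxpool2: 2x2 pooling
trace["maxpool2"] = size

# Number of inputs to the dense layer: 32 feature maps of size x size.
dense_inputs = 32 * size * size
print(trace, dense_inputs)
```

The last value, `dense_inputs`, is the n that goes into the fully-connected layer's parameter count.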

    In your case, the full calculations are:

      #  name                           size                 parameters
    ---  --------  -------------------------    ------------------------
      0  input                       1x28x28                           0
      1  conv2d1   (28-(5-1))=24 -> 32x24x24    (5*5*1+1)*32   =     832
      2  maxpool1                   32x12x12                           0
      3  conv2d2   (12-(3-1))=10 -> 32x10x10    (3*3*32+1)*32  =   9'248
      4  maxpool2                     32x5x5                           0
      5  dense                           256    (32*5*5+1)*256 = 205'056
      6  output                           10    (256+1)*10     =   2'570
    

    So in your network, you have a total of 832 + 9'248 + 205'056 + 2'570 = 217'706 learnable parameters, which is exactly what Lasagne reports.
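
The same arithmetic can be checked with a few lines of plain Python. This is only a sketch of the per-layer formulas used in the table, with helper names invented for the example; it assumes every convolutional and dense layer has one bias per output unit or feature map, as the table does.

```python
def conv_params(in_channels, filter_h, filter_w, num_filters):
    """Weights plus one bias per filter: (h * w * in_channels + 1) * num_filters."""
    return (filter_h * filter_w * in_channels + 1) * num_filters


def dense_params(num_inputs, num_units):
    """Weights plus one bias per unit: (num_inputs + 1) * num_units."""
    return (num_inputs + 1) * num_units


# Layers with learnable parameters; input and pooling layers contribute none.
counts = {
    "conv2d1": conv_params(1, 5, 5, 32),        # 1 input channel, 5x5 filters
    "conv2d2": conv_params(32, 3, 3, 32),       # 32 input maps, 3x3 filters
    "dense": dense_params(32 * 5 * 5, 256),     # flattened 32x5x5 input
    "output": dense_params(256, 10),            # 256 inputs, 10 classes
}
total = sum(counts.values())

for name, count in counts.items():
    print(name, count)
print("total", total)  # 217706
```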