neural-network theano deep-learning lasagne

How to properly add and use BatchNormLayer?


Introduction

According to the Lasagne docs: "This layer should be inserted between a linear transformation (such as a DenseLayer, or Conv2DLayer) and its nonlinearity. The convenience function batch_norm() modifies an existing layer to insert batch normalization in front of its nonlinearity."

However, Lasagne also has the utility function:

lasagne.layers.batch_norm
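
which is applied by wrapping an existing layer, for example:

    network = lasagne.layers.batch_norm(
        lasagne.layers.DenseLayer(network, num_units=32))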

However, due to implementation details on my end, I can't use that function.

My question is: how and where should I add the BatchNormLayer?

class lasagne.layers.BatchNormLayer(incoming, axes='auto', epsilon=1e-4, alpha=0.1, beta=lasagne.init.Constant(0), gamma=lasagne.init.Constant(1), mean=lasagne.init.Constant(0), inv_std=lasagne.init.Constant(1), **kwargs)
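
Note that axes='auto' normalizes over every axis except the second one (the channel axis). A minimal illustration (conv here stands in for any preceding layer):

    # For 4D convolutional output of shape (batch, channels, rows, cols),
    # axes='auto' is equivalent to axes=(0, 2, 3):
    network = lasagne.layers.BatchNormLayer(conv, axes=(0, 2, 3))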

Can I add it after a convolution layer, or should I add it after the max-pooling layer? Do I have to manually remove the bias of those layers?

Approach used: so far I have only used it like this, directly after the input layer:

def build_network(height, width):  # illustrative wrapper; height/width are the input image dimensions
        import lasagne
        import theano
        import theano.tensor as T

        input_var = T.tensor4('inputs')
        target_var = T.fmatrix('targets')

        network = lasagne.layers.InputLayer(shape=(None, 1, height, width), input_var=input_var)

        from lasagne.layers import BatchNormLayer

        network = BatchNormLayer(network,
                                 axes='auto',
                                 epsilon=1e-4,
                                 alpha=0.1,
                                 beta=lasagne.init.Constant(0),
                                 gamma=lasagne.init.Constant(1),
                                 mean=lasagne.init.Constant(0),
                                 inv_std=lasagne.init.Constant(1))

        network = lasagne.layers.Conv2DLayer(
            network, num_filters=60, filter_size=(3, 3), stride=1, pad=2,
            nonlinearity=lasagne.nonlinearities.rectify,
            W=lasagne.init.GlorotUniform())

        network = lasagne.layers.Conv2DLayer(
            network, num_filters=60, filter_size=(3, 3), stride=1, pad=1,
            nonlinearity=lasagne.nonlinearities.rectify,
            W=lasagne.init.GlorotUniform())


        network = lasagne.layers.MaxPool2DLayer(incoming=network, pool_size=(2, 2), stride=None, pad=(0, 0),
                                                ignore_border=True)


        network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=0.5),
            num_units=32,
            nonlinearity=lasagne.nonlinearities.rectify)


        network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=0.5),
            num_units=1,
            nonlinearity=lasagne.nonlinearities.sigmoid)


        return network, input_var, target_var

References:

https://github.com/Lasagne/Lasagne/blob/master/lasagne/layers/normalization.py#L120-L320

http://lasagne.readthedocs.io/en/latest/modules/layers/normalization.html


Solution

  • If you are unable to use batch_norm directly:

    Please test the code below and let us know if it works for what you are trying to accomplish. It shows the intended placement using the batch_norm convenience function; if you cannot call it, you can adapt what batch_norm does internally (see the sketch after the code).

    import lasagne
    import theano
    import theano.tensor as T
    from lasagne.layers import batch_norm
    
    height, width = 64, 64  # example input dimensions; replace with your own
    input_var = T.tensor4('inputs')
    target_var = T.fmatrix('targets')
    
    network = lasagne.layers.InputLayer(shape=(None, 1, height, width), input_var=input_var)
    
    network = lasagne.layers.Conv2DLayer(
        network, num_filters=60, filter_size=(3, 3), stride=1, pad=2,
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())
    network = batch_norm(network)
    
    network = lasagne.layers.Conv2DLayer(
        network, num_filters=60, filter_size=(3, 3), stride=1, pad=1,
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())
    network = batch_norm(network)
    
    network = lasagne.layers.MaxPool2DLayer(incoming=network, pool_size=(2, 2), stride=None, pad=(0, 0),
                                            ignore_border=True)
    
    network = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(network, p=0.5),
        num_units=32,
        nonlinearity=lasagne.nonlinearities.rectify)
    network = batch_norm(network)
    
    network = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(network, p=0.5),
        num_units=1,
        nonlinearity=lasagne.nonlinearities.sigmoid)
    network = batch_norm(network)
    
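    Adapting it by hand looks roughly like this (a sketch of what the convenience function does internally, per the normalization.py source linked above, shown for the first convolution; the same pattern applies to the other layers):

    from lasagne.layers import BatchNormLayer, NonlinearityLayer
    
    # batch_norm() essentially does three things:
    # 1. builds the wrapped layer with a linear output (nonlinearity=None),
    # 2. drops its bias (b=None), since BatchNormLayer's beta replaces it,
    # 3. inserts BatchNormLayer, then re-applies the nonlinearity on top.
    network = lasagne.layers.Conv2DLayer(
        network, num_filters=60, filter_size=(3, 3), stride=1, pad=2,
        nonlinearity=None,  # keep the layer's own output linear
        b=None,             # bias is redundant with BatchNormLayer's beta
        W=lasagne.init.GlorotUniform())
    network = BatchNormLayer(network)
    network = NonlinearityLayer(network, nonlinearity=lasagne.nonlinearities.rectify)
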

    When getting the params to create the graph for your update method, remember to set trainable to True:

    params = lasagne.layers.get_all_params(network, trainable=True)
    updates = lasagne.updates.adadelta($YOUR_LOSS_HERE, params)
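
    Also note that BatchNormLayer behaves differently at training and test time: when compiling your evaluation function, pass deterministic=True to get_output so the stored running mean/inv_std are used instead of the current batch statistics:

    prediction = lasagne.layers.get_output(network)  # training: batch statistics
    test_prediction = lasagne.layers.get_output(network, deterministic=True)  # evaluation: stored averages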