python pytorch conv-neural-network pruning

How to update a pretrained model after pruning filters in its conv layers in PyTorch?


I have a pretrained LeNet5 model that I defined from scratch, and I am pruning filters in its convolution layers. The model is shown below.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LeNet5(nn.Module):

    def __init__(self, n_classes):
        super(LeNet5, self).__init__()
        self.feature_extractor = nn.Sequential(            
            nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(in_channels=20, out_channels=50, kernel_size=5, stride=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.classifier = nn.Sequential(
            nn.Linear(in_features=800, out_features=500),
            nn.ReLU(),
            nn.Linear(in_features=500, out_features=10), # 10 - possible classes
        )
    
    def forward(self, x):
        #x = x.view(x.size(0), -1) 
        x = self.feature_extractor(x)
        x = torch.flatten(x, 1)
        logits = self.classifier(x)
        probs = F.softmax(logits, dim=1)
        return logits, probs

I have successfully removed 2 of the 20 filters in the first conv layer (now 18 filters) and 5 of the 50 filters in the second conv layer (now 45 filters). I now need to update the model to reflect these changes.

However, I'm unable to run the model because it raises a dimension error:

RuntimeError: mat1 and mat2 shapes cannot be multiplied (32x720 and 800x500)

How can I update the number of filters in the model's layers in PyTorch after pruning? Is there a library I can use for this?


Solution

  • Assuming you do not want the model to change structure automatically at runtime, you can update its structure by changing the arguments passed to the layer constructors. For instance:

    nn.Conv2d(in_channels=1, out_channels=18, kernel_size=5, stride=1),
    nn.Conv2d(in_channels=18, out_channels=45, kernel_size=5, stride=1),
    

    and so on. In particular, the first nn.Linear now needs in_features=720 (45 channels × 4 × 4), which is the 720 that appears in your error message.

    If you are retraining from scratch every time you change the model structure, that's all you need to do. However, if you would like to keep some of the already learned parameters when you change the model, you'll need to select the relevant values and reassign them to the model parameters. For instance, consider the parameters of the first convolutional layer: 1 input channel, 20 output channels, and a kernel size of 5. PyTorch stores Conv2d weights as [out_channels, in_channels, kH, kW], so the weights and biases for this layer have size [20,1,5,5] and [20]. You need to modify them so that they have size [18,1,5,5] and [18]. You thus need the indices of the kernels/filters you want to keep and of those you want to prune. The second convolutional layer must also drop the matching input channels (dimension 1 of its weight), and the first linear layer the matching input columns. The code for doing this is roughly:

    params = net.state_dict()
    # state_dict() is a flat dict; layers inside nn.Sequential are keyed by index
    params["feature_extractor.0.weight"] = params["feature_extractor.0.weight"][:18]
    params["feature_extractor.0.bias"] = params["feature_extractor.0.bias"][:18]
    # and so on for the other layers

    # load into a model built with the new layer sizes (pruned_net is an assumed name);
    # loading back into the original net would fail with a size mismatch
    pruned_net.load_state_dict(params)
    

    Here, I simply drop the last two kernels/bias values for the first convolutional layer. (Note that the actual dictionary key names may differ slightly; I didn't code this up to check because, as indicated in the comments above, you included a picture of code rather than real, copyable code, so please post the latter in the future.)
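
For completeness, here is a minimal end-to-end sketch of the weight transfer described above. It is not tested against your actual model, and several things in it are assumptions: the helper names (build_features, build_classifier, keep_by_l1) are made up for illustration, the filters to keep are chosen by largest L1 norm (your own selection criterion may differ), and the input is taken to be 28x28, which is what in_features=800 = 50 × 4 × 4 implies. A freshly initialised network stands in for your pretrained one.

```python
import torch
import torch.nn as nn


def build_features(c1, c2):
    # Same layout as the question's feature_extractor, with configurable filter counts.
    return nn.Sequential(
        nn.Conv2d(1, c1, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(c1, c2, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
    )


def build_classifier(c2, spatial=4 * 4):
    # Assumes 28x28 inputs, which give 4x4 feature maps after both conv/pool stages.
    return nn.Sequential(nn.Linear(c2 * spatial, 500), nn.ReLU(), nn.Linear(500, 10))


def keep_by_l1(conv, n_keep):
    # Assumed criterion: keep the n_keep filters with the largest L1 norm.
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    return torch.topk(norms, n_keep).indices.sort().values


torch.manual_seed(0)
old_f, old_c = build_features(20, 50), build_classifier(50)  # stand-in for the pretrained model
new_f, new_c = build_features(18, 45), build_classifier(45)  # pruned structure

keep1 = keep_by_l1(old_f[0], 18)  # indices of conv1 filters to keep
keep2 = keep_by_l1(old_f[3], 45)  # indices of conv2 filters to keep

with torch.no_grad():
    # conv1: select output filters along dim 0
    new_f[0].weight.copy_(old_f[0].weight[keep1])
    new_f[0].bias.copy_(old_f[0].bias[keep1])
    # conv2: select output filters (dim 0) and the surviving conv1 channels (dim 1)
    new_f[3].weight.copy_(old_f[3].weight[keep2][:, keep1])
    new_f[3].bias.copy_(old_f[3].bias[keep2])
    # first Linear: flatten order is (channel, h, w), so view the columns as [500, 50, 16]
    w = old_c[0].weight.view(500, 50, 16)[:, keep2]
    new_c[0].weight.copy_(w.reshape(500, 45 * 16))
    new_c[0].bias.copy_(old_c[0].bias)
    # last Linear is unaffected by the pruning
    new_c[2].load_state_dict(old_c[2].state_dict())

x = torch.randn(32, 1, 28, 28)
logits = new_c(torch.flatten(new_f(x), 1))
print(logits.shape)  # torch.Size([32, 10])
```

The part that is easy to miss is the first linear layer: removing a conv2 filter removes a whole 4x4 block of its input columns, not a single column, which is why the weight is viewed per channel before indexing.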