python pytorch neural-network save pruning

Pruned model size is the same as the non-pruned model [PyTorch]


I'm trying to implement model pruning in PyTorch with a ResNet18. Given an instance of ResNet18, I run the following code to load a pre-trained model, prune it, and save the pruned model:

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# retrieve_file, ResNet18 and model_saving_path are defined elsewhere (not shown)
def random_unstructured_pruning(pruning_rate: float, device, log_file):
    # Load the pre-trained weights into a fresh ResNet18
    trained_model = retrieve_file(folder="./models", file_name='trained_model.pth')
    model = ResNet18(num_classes=10, input_channels=1).to(device)
    model.load_state_dict(torch.load(trained_model))

    # Collect (module, 'weight') pairs for every conv, linear and batch-norm layer
    modules_list = tuple(
        (module, 'weight')
        for _, module in model.named_modules()
        if isinstance(module, (nn.Conv2d, nn.Linear, nn.BatchNorm2d))
    )

    # Globally prune 80% of the weights by L1 magnitude, then make the pruning permanent
    prune.global_unstructured(modules_list, pruning_method=prune.L1Unstructured, amount=0.8)
    for module, name in modules_list:
        prune.remove(module, name)

    # Save the pruned model
    pruning_rate_str = "{:02d}".format(int(pruning_rate * 10))
    path = f"{model_saving_path}pruned_{pruning_rate_str}.pth"
    torch.save(model.state_dict(), path)

At the end of the function above, the saved .pth file is the same size as the file I loaded at the beginning, whereas I expected it to be smaller since I'm pruning 80% of the weights.

Can somebody explain why this happens? What am I doing wrong? Thank you!

I think the problem is in the saving part of the function: it seems that I always save the same model that I loaded at the beginning, and the pruning has no effect.


Solution

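  • The pruning utility in PyTorch acts as a masking wrapper on the layer being pruned. This means you still have access to the original model weights, and the network size remains unchanged, or even grows, because a mask is created for each pruned tensor. Calling prune.remove afterwards makes the pruning permanent by dropping the mask and the _orig parameter, but the weight is still stored as a dense tensor of the original shape with zeros in the pruned positions, so the saved file does not shrink.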

    If you look at the documentation page for prune.global_unstructured:

    Modifies modules in place by:

    • adding a named buffer called name+'_mask' corresponding to the binary mask applied to the parameter name by the pruning method.

    • replacing the parameter name by its pruned version, while the original (unpruned) parameter is stored in a new parameter named name+'_orig'.
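
To see the effect on the saved file, here is a rough sketch that compares the state_dict keys and the serialized size before pruning, after pruning, and after prune.remove. The 256x256 layer and the serialized_size helper are made up for illustration and are not taken from the question:

```python
import io
from collections import OrderedDict

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune


def serialized_size(state_dict) -> int:
    """Hypothetical helper: bytes written by torch.save for this state dict."""
    buffer = io.BytesIO()
    torch.save(state_dict, buffer)
    return buffer.getbuffer().nbytes


# Toy layer standing in for the real network (sizes are arbitrary).
net = nn.Sequential(OrderedDict(f1=nn.Linear(256, 256)))

# 1) Before pruning: only 'f1.weight' and 'f1.bias'.
keys_before = set(net.state_dict().keys())
size_before = serialized_size(net.state_dict())

# 2) After pruning: 'weight' is replaced by 'weight_orig' plus a 'weight_mask' buffer.
prune.global_unstructured(
    ((net.f1, "weight"),), pruning_method=prune.L1Unstructured, amount=0.8
)
keys_masked = set(net.state_dict().keys())
size_masked = serialized_size(net.state_dict())

# 3) After prune.remove: back to a single dense 'weight', now containing zeros.
prune.remove(net.f1, "weight")
keys_removed = set(net.state_dict().keys())
size_removed = serialized_size(net.state_dict())

print(sorted(keys_before), size_before)
print(sorted(keys_masked), size_masked)
print(sorted(keys_removed), size_removed)
```

The masked state dict should be the largest of the three because it carries both weight_orig and weight_mask. After prune.remove the size falls back to roughly the original, since the zeros are stored like any other float value.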

    Here is a minimal example to show that the unpruned weights are still accessible:

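    from collections import OrderedDict

    import torch.nn as nn
    import torch.nn.utils.prune as prune

    # f1.weight has shape (2, 5), matching the tensors printed below
    net = nn.Sequential(OrderedDict(
        f1=nn.Linear(5, 2),
        f2=nn.Linear(2, 1)))

    pruned = ((net.f1, 'weight'),)
    prune.global_unstructured(pruned, prune.L1Unstructured, amount=0.8)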
    

    Then you can access the pruned weights:

    >>> net.f1.weight
    tensor([[0.0000, 0.0000, 0.0000, -0.0000, 0.0000],
            [0.0000, 0.3599, 0.0000, -0.0000, 0.4034]])
    

    The original unpruned weight:

    >>> net.f1.weight_orig
    Parameter containing:
    tensor([[ 0.1312,  0.1105,  0.0910, -0.2650,  0.3439],
            [ 0.0412,  0.3599,  0.2040, -0.2672,  0.4034]])
    

    And the pruning mask:

    >>> net.f1.weight_mask
    tensor([[0., 0., 0., 0., 0.],
            [0., 1., 0., 0., 1.]])
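
The relationship between these three tensors can be checked directly. This is a small sketch on a standalone nn.Linear(5, 2) layer (chosen here only to mirror the shapes printed above), using prune.l1_unstructured on a single module instead of the global variant:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)  # arbitrary seed, only for repeatability

layer = nn.Linear(5, 2)
prune.l1_unstructured(layer, name="weight", amount=0.8)

# The pruned weight is the original weight multiplied elementwise by the mask.
recomputed = layer.weight_orig * layer.weight_mask
matches = torch.equal(layer.weight, recomputed)

# The pruned tensor keeps its dense shape; the pruned entries are just zeros.
same_shape = layer.weight.shape == layer.weight_orig.shape
sparsity = float((layer.weight == 0).sum()) / layer.weight.numel()

print(matches, same_shape, sparsity)
```

Because the pruned weight has the same shape and dtype as the original, it takes the same number of bytes on disk regardless of how many entries are zero.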