pytorch, huggingface-transformers

Fine-tuning a model's classifier layer with a new label


I would like to further fine-tune an already fine-tuned BertForSequenceClassification model with a new dataset containing just 1 additional label that the model hasn't seen before.

In other words, I would like to add 1 new label to the set of labels that the model is currently able to classify properly.

Moreover, I don't want the classifier weights to be randomly initialized; I'd like to keep them intact and just update them according to the dataset examples while increasing the size of the classifier layer by 1.

The dataset used for further fine-tuning could look like this:

sentence,label
intent example 1,new_label
intent example 2,new_label
...
intent example 10,new_label

My model's current classifier layer looks like this:

Linear(in_features=768, out_features=135, bias=True)

How could I achieve it?
Is it even a good approach?


Solution

  • You can just extend the weights and bias of your model with new values. Please have a look at the commented example below:

    #This is the section that loads your model
    #I will just use a pretrained model for this example
    import torch
    from torch import nn
    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained("jpcorb20/toxic-detector-distilroberta")
    model = AutoModelForSequenceClassification.from_pretrained("jpcorb20/toxic-detector-distilroberta")
    #We check the output of one sample so we can compare it later with the extended layer
    #and verify that we kept the previously learned "knowledge"
    f = tokenizer.encode_plus("This is an example", return_tensors='pt')
    print(model(**f).logits)
    
    #Now we need to find out the name of the linear layer you want to extend
    #The layers on top of distilroberta are wrapped inside a classifier section
    #This name can differ for your model, since every architecture/author names it differently
    #If that is the case, inspect model.parameters to locate the classification layer
    print(model.classifier)
    
    #The output shows us that the classification layer is called `out_proj`
    #We can now extend the weights by creating a new tensor that consists of the
    #old weights and a randomly initialized tensor for the new label 
    model.classifier.out_proj.weight = nn.Parameter(torch.cat((model.classifier.out_proj.weight, torch.randn(1,768)),0))
    
    #We do the same for the bias:
    model.classifier.out_proj.bias = nn.Parameter(torch.cat((model.classifier.out_proj.bias, torch.randn(1)),0))
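
    #Optionally keep the config and the model attribute in sync with the 7-way head,
    #so that save_pretrained() and the loss computed when you pass labels also
    #see the extra class ("new_label" is just a placeholder name here)
    model.config.id2label[6] = "new_label"
    model.config.label2id["new_label"] = 6
    model.num_labels = 7    #the forward pass caches num_labels at init time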
    
    #and be happy when we compare the output with our expectation 
    print(model(**f).logits)
    

    Output:

    tensor([[-7.3604, -9.4899, -8.4170, -9.7688, -8.4067, -9.3895]],
           grad_fn=<AddmmBackward>)
    RobertaClassificationHead(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (dropout): Dropout(p=0.1, inplace=False)
      (out_proj): Linear(in_features=768, out_features=6, bias=True)
    )
    tensor([[-7.3604, -9.4899, -8.4170, -9.7688, -8.4067, -9.3895,  2.2124]],
           grad_fn=<AddmmBackward>)
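
    For a BertForSequenceClassification model like the one in your question, the head is the classifier Linear layer itself (there is no out_proj wrapper), so the same idea looks roughly like the sketch below. The model path and the "new_label" name are placeholders, since I don't know your setup:

    import torch
    from torch import nn
    from transformers import BertForSequenceClassification, BertTokenizer

    #placeholder path to your already fine-tuned 135-label model
    model = BertForSequenceClassification.from_pretrained("path/to/your-finetuned-bert")
    tokenizer = BertTokenizer.from_pretrained("path/to/your-finetuned-bert")

    #here the classification layer is simply model.classifier, a Linear(768, 135):
    #append one row to its weights and one entry to its bias for the new label
    hidden_size = model.config.hidden_size
    model.classifier.weight = nn.Parameter(torch.cat((model.classifier.weight, torch.randn(1, hidden_size)), 0))
    model.classifier.bias = nn.Parameter(torch.cat((model.classifier.bias, torch.randn(1)), 0))

    #the same id2label/label2id/num_labels bookkeeping as above applies here,
    #with the new label getting index 135
    model.config.id2label[135] = "new_label"
    model.config.label2id["new_label"] = 135
    model.num_labels = 136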
    

    Please note that you should still fine-tune your model afterwards: the weights for the new label are randomly initialized and will otherwise negatively impact performance.
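
    A minimal training sketch on the new examples could then look like this (the sentences, label index, and hyperparameters are illustrative only, and model/tokenizer are the extended ones from the sketch above):

    import torch
    from torch import nn
    from torch.optim import AdamW

    #examples for the new label, e.g. read from the csv in your question
    new_sentences = ["intent example 1", "intent example 2", "intent example 10"]
    new_label_id = 135                   #index of the row we appended to the classifier

    optimizer = AdamW(model.parameters(), lr=2e-5)
    loss_fct = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(3):
        for sentence in new_sentences:
            inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
            labels = torch.tensor([new_label_id])
            logits = model(**inputs).logits        #shape (1, 136) after the extension
            loss = loss_fct(logits, labels)        #computed by hand, independent of config.num_labels
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    Keep in mind that training only on sentences of the new label will push the model towards predicting it for everything; in practice you would mix in examples of the existing 135 labels as well so the old "knowledge" is not forgotten.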