I'm using Hugging Face Transformers 4.19.0.
I want to pretrain a BART model using my custom dataset.
To be clear, I'm not asking about fine-tuning BART on a downstream task, but about "pre-training BART" itself.
However, I can't find a method or class for this on the Hugging Face docs page (https://huggingface.co/docs/transformers/model_doc/bart).
Is it impossible to pretrain BART using the transformers package?
Do I have to build the BART model layer by layer from scratch?
If anyone knows how to pretrain a BART model using custom data, please help me...
You need to initialize a random model with the architecture of your choice:
from transformers import BartConfig, BartModel
configuration = BartConfig()      # default BART config (matches the facebook/bart-large architecture)
model = BartModel(configuration)  # BART with randomly initialised weights
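One caveat: the bare BartModel has no language-modelling head and returns no loss, so for pretraining with the Trainer you will most likely want BartForConditionalGeneration instead (my suggestion, not something the docs mandate):

from transformers import BartConfig, BartForConditionalGeneration
configuration = BartConfig()
model = BartForConditionalGeneration(configuration)  # same random BART, plus the LM head that yields a loss when labels are passed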
Then you need to train that model. The easiest way is with a Trainer (doc), to which you provide your model, training set, evaluation set, and so on; see the sketch below.
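Here is a minimal sketch of that wiring, assuming you have already built train_dataset and eval_dataset as tokenized datasets with input_ids, attention_mask and labels columns. The output directory and hyperparameters below are placeholders, not recommendations. Also note that BART is pretrained as a denoising autoencoder (text infilling and sentence permutation in the original paper), and transformers 4.19.0 does not ship a data collator for that objective, so corrupting the input_ids is something you have to implement yourself.

from transformers import (
    BartConfig,
    BartForConditionalGeneration,
    Trainer,
    TrainingArguments,
)

model = BartForConditionalGeneration(BartConfig())  # random weights, as above

training_args = TrainingArguments(
    output_dir="./bart-pretrain",   # placeholder path
    per_device_train_batch_size=8,  # placeholder hyperparameters
    num_train_epochs=3,
    evaluation_strategy="steps",
    eval_steps=10_000,
    save_steps=10_000,
    logging_steps=500,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,    # your tokenized (and noised) training set
    eval_dataset=eval_dataset,      # your tokenized evaluation set
)

trainer.train()

Checkpoints land in output_dir, and you can reload the pretrained weights later with BartForConditionalGeneration.from_pretrained(...).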