So, I would like to create a small proof of concept using roughly 4,000 legal texts (already extracted into txt files), divided into:
P.S.: all the text files are in Brazilian Portuguese (pt-BR).
So how can I use these txt files to train a transformer (based on Flan-T5) that can generate new summaries?
I wrote a post and published a Colab notebook about how to do this, if you want all of the details and the code: (Post), (Colab Notebook)
The basic steps that I would recommend are:
Another way of doing it would be to fine-tune all of the model weights without using adapter methods, but that takes longer and uses more VRAM, without improving performance noticeably.
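As a concrete illustration of the adapter route, here is a minimal, hedged sketch using Hugging Face transformers + peft (LoRA) to fine-tune Flan-T5 for summarization. The model size, hyperparameters, column names, and output paths are my assumptions, not taken from the linked post; in practice you would build the dataset from your own document/summary pairs read out of the txt files.

```python
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)
from peft import LoraConfig, get_peft_model, TaskType

model_name = "google/flan-t5-base"  # assumption: base size; pick the size your GPU allows
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Wrap the base model with LoRA adapters so only a small set of extra
# weights is trained (much less VRAM than full fine-tuning).
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q", "v"],  # T5 attention query/value projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Hypothetical toy data: in practice, pair each full legal text with its summary
# loaded from your txt files.
data = {
    "text": ["Texto integral da decisão..."],
    "summary": ["Resumo da decisão..."],
}
dataset = Dataset.from_dict(data)

def preprocess(batch):
    # T5-style task prefix, then tokenize inputs and targets
    inputs = ["summarize: " + t for t in batch["text"]]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-legal-sum",  # hypothetical output directory
    per_device_train_batch_size=4,
    learning_rate=1e-3,
    num_train_epochs=3,
    logging_steps=50,
    save_strategy="epoch",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
model.save_pretrained("flan-t5-legal-sum-lora")  # saves only the small adapter weights
```

LoRA only trains small low-rank matrices injected into the attention projections, which is why it fits in far less VRAM than updating every weight of the model.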
Note: Flan-T5 was trained mostly on English text, so it won't perform as well in other languages, including Portuguese.
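Once the adapter is trained, generating a new summary looks roughly like the sketch below; the paths and generation settings are carried over from the training sketch above, not from the original post.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

# Load the base model and attach the saved LoRA adapter
base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
model = PeftModel.from_pretrained(base, "flan-t5-legal-sum-lora")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

text = "summarize: " + "Texto integral de um novo documento jurídico..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```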