I am working on a computer vision project and need to apply data augmentation. I have three classes: two with 500 images each and one with 1000 images. I plan to generate multiple versions of the images with data augmentation. Should I apply, for example, three random transformations to each image in the first two classes, giving 2000 images per class, and just one transformation to the last class so that it also has 2000 images? Also, should the augmentation be applied to the whole dataset before splitting it into train and test sets, or should I split first and then augment only the training set? Thank you.
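Data augmentation is applied only to the training set, so split first and then augment. Don't touch the test set: if you augment before splitting, augmented copies of the same image can land in both sets and inflate your test score.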
Apply augmentation randomly during training, so a particular image may or may not be augmented in a given epoch.
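To make the two points above concrete, here is a minimal sketch in plain Python. It is only an illustration: images are stood in for by small nested lists, and the helper names (`split_train_test`, `horizontal_flip`, `random_augment`, `training_epoch`) and the 20% test fraction are assumptions of mine, not part of any library. In a real project you would use your framework's own split and transform utilities.

```python
import random

def split_train_test(samples, test_fraction=0.2, seed=0):
    """Shuffle and split BEFORE any augmentation so the test set stays untouched."""
    rng = random.Random(seed)
    shuffled = samples[:]          # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def horizontal_flip(image):
    """Mirror each row of a 2D image (hypothetical stand-in for a real transform)."""
    return [row[::-1] for row in image]

def random_augment(image, rng, p=0.5):
    """With probability p return a flipped copy, otherwise the original image."""
    if rng.random() < p:
        return horizontal_flip(image)
    return image

def training_epoch(train_samples, rng):
    """Build one epoch of data: each image is augmented (or not) on the fly."""
    return [(random_augment(img, rng), label) for img, label in train_samples]

# Toy dataset: ten 2x2 "images" spread over three classes.
samples = [([[i, i + 1], [i + 2, i + 3]], i % 3) for i in range(10)]
train, test = split_train_test(samples)

rng = random.Random(42)
epoch_1 = training_epoch(train, rng)   # some images flipped
epoch_2 = training_epoch(train, rng)   # a different random subset flipped
# `test` is never passed through random_augment.
```

Because the augmentation is drawn again every epoch, there is no need to store a fixed number of augmented copies per class on disk.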
There is no need to treat the classes separately to deal with class imbalance. Class imbalance is usually handled through the loss function, for example a class-weighted cross-entropy or the focal loss used in RetinaNet.
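As a rough sketch of what handling imbalance in the loss can look like for the 500 / 500 / 1000 split from the question, the snippet below computes inverse-frequency class weights and applies them in a per-sample cross-entropy. The weighting formula (total / (number of classes * class count)) is one common convention that I am assuming here, and the focal loss shown is a simplified form without the optional alpha balancing term. The function names are hypothetical.

```python
import math

def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count), so rare classes count more."""
    total = sum(class_counts)
    k = len(class_counts)
    return [total / (k * n) for n in class_counts]

def weighted_cross_entropy(probs, label, weights):
    """Per-sample cross-entropy scaled by the weight of the true class."""
    return -weights[label] * math.log(probs[label])

def focal_loss(probs, label, gamma=2.0):
    """Simplified focal loss: down-weights samples the model already gets right."""
    p = probs[label]
    return -((1.0 - p) ** gamma) * math.log(p)

# Class counts from the question: two classes of 500 images, one of 1000.
weights = inverse_frequency_weights([500, 500, 1000])

# Predicted probabilities for one sample whose true class is 0.
probs = [0.5, 0.25, 0.25]
loss_weighted = weighted_cross_entropy(probs, 0, weights)
loss_focal = focal_loss(probs, 0)
```

With these counts the two smaller classes get twice the weight of the larger one, which plays the same role as oversampling them without creating extra images.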