I trained two different ResNet models from torchvision.models: ResNet50 and ResNet101, both with the DEFAULT weights.
But the training results are really weird: the train and test accuracy of the ResNet50 model are 89 and 85 respectively, while for ResNet101 they are 34 and 28!
What is wrong?
I froze both models entirely and trained only the FC layer, which I modified to have 4 outputs (equal to the number of classes).
I used 5 epochs for both.
Why is ResNet101 worse than ResNet50?
Shouldn't it be better, since it has more layers (depth)?
"Shouldn't it be better, since it has more layers (depth)?"
More layers means the model is more complex and has more capacity, but that doesn't mean it will perform better. Deeper models can struggle to converge, resulting in both a lower train and validation score compared to a simpler one. I think this is what's happening in your case.
Try training the larger model for more epochs and with a different learning rate. More epochs give the model more time to adapt. I'd start with a smaller learning rate, and see how the model responds to larger rates. Change one thing at a time and observe its effect.
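To make the "change one thing at a time" idea concrete, here is a rough, framework-agnostic sketch of a learning-rate sweep with the epoch count held fixed. All the names in it are my own, and `train_and_eval` is a hypothetical function you would write yourself: it should build a fresh model (frozen backbone plus new FC layer), train it with the given learning rate and epoch count, and return the validation accuracy.

```python
import math
from typing import Callable, Dict, List


def log_spaced_lrs(low: float, high: float, count: int) -> List[float]:
    """Return `count` learning rates evenly spaced on a log scale, smallest first."""
    if count < 2:
        return [low]
    log_low, log_high = math.log10(low), math.log10(high)
    step = (log_high - log_low) / (count - 1)
    return [10 ** (log_low + i * step) for i in range(count)]


def sweep_learning_rates(
    train_and_eval: Callable[[float, int], float],
    lrs: List[float],
    epochs: int,
) -> Dict[float, float]:
    """Run one training job per learning rate, keeping the epoch count fixed.

    `train_and_eval(lr, epochs)` is a hypothetical, user-supplied callable
    that trains a fresh model and returns its validation accuracy.
    """
    results: Dict[float, float] = {}
    for lr in lrs:  # smallest rate first, then progressively larger ones
        results[lr] = train_and_eval(lr, epochs)
    return results


def best_lr(results: Dict[float, float]) -> float:
    """Return the learning rate with the highest validation accuracy."""
    return max(results, key=results.get)
```

You would call it along the lines of `sweep_learning_rates(train_and_eval, log_spaced_lrs(1e-4, 1e-2, 3), epochs=20)`, where the range and the epoch count are only illustrative starting points, and then repeat the sweep for a different epoch count if needed.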
If it converges fine, it can still go too far and overfit, which means it'll score highly on the training set but less well on the validation set compared to a simpler model. This will be exacerbated if the dataset is relatively small and non-diverse.
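One simple way to watch for that is to record train and validation accuracy after every epoch and stop once the validation score stops improving. The sketch below is a minimal, framework-agnostic version of that check. The function names, the `patience` default and the `min_delta` parameter are my own choices for illustration, not something taken from torchvision.

```python
from typing import List, Optional


def early_stopping_epoch(
    val_scores: List[float],
    patience: int = 3,
    min_delta: float = 0.0,
) -> Optional[int]:
    """Return the 0-based epoch at which training would stop, or None.

    Training stops once the validation score has failed to improve by more
    than `min_delta` for `patience` consecutive epochs.
    """
    best = float("-inf")
    stale_epochs = 0
    for epoch, score in enumerate(val_scores):
        if score > best + min_delta:
            best = score
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                return epoch
    return None


def generalization_gap(
    train_scores: List[float],
    val_scores: List[float],
) -> List[float]:
    """Per-epoch difference between train and validation accuracy.

    A gap that keeps widening while validation accuracy stalls or drops
    is the usual sign of overfitting.
    """
    return [t - v for t, v in zip(train_scores, val_scores)]
```

In a real training loop you would append one score per epoch and call `early_stopping_epoch` on the history so far; a small `patience` is only a reasonable default to start from, not a tuned value.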