For my thesis, I'm running a 4 layered deep network for sequence to sequence translation use-case 150 x Conv(64,5) x GRU (100) x softmax activation on last stage with loss='categorical_crossentropy'.
Training loss and accuracy converge optimally pretty quickly where as validation loss and accuracy seem to be stuck in val_acc 97 to 98.2 range, unable to go past beyond that.
Is my model overfitting?
Have tried dropout of 0.2 between layers.
Output after drop-out
Epoch 85/250
[==============================] - 3s - loss: 0.0057 - acc: 0.9996 - val_loss: 0.2249 - val_acc: 0.9774
Epoch 86/250
[==============================] - 3s - loss: 0.0043 - acc: 0.9987 - val_loss: 0.2063 - val_acc: 0.9774
Epoch 87/250
[==============================] - 3s - loss: 0.0039 - acc: 0.9987 - val_loss: 0.2180 - val_acc: 0.9809
Epoch 88/250
[==============================] - 3s - loss: 0.0075 - acc: 0.9978 - val_loss: 0.2272 - val_acc: 0.9774
Epoch 89/250
[==============================] - 3s - loss: 0.0078 - acc: 0.9974 - val_loss: 0.2265 - val_acc: 0.9774
Epoch 90/250
[==============================] - 3s - loss: 0.0027 - acc: 0.9996 - val_loss: 0.2212 - val_acc: 0.9809
Epoch 91/250
[==============================] - 3s - loss: 3.2185e-04 - acc: 1.0000 - val_loss: 0.2190 - val_acc: 0.9809
Epoch 92/250
[==============================] - 3s - loss: 0.0020 - acc: 0.9991 - val_loss: 0.2239 - val_acc: 0.9792
Epoch 93/250
[==============================] - 3s - loss: 0.0047 - acc: 0.9987 - val_loss: 0.2163 - val_acc: 0.9809
Epoch 94/250
[==============================] - 3s - loss: 2.1863e-04 - acc: 1.0000 - val_loss: 0.2190 - val_acc: 0.9809
Epoch 95/250
[==============================] - 3s - loss: 0.0011 - acc: 0.9996 - val_loss: 0.2190 - val_acc: 0.9809
Epoch 96/250
[==============================] - 3s - loss: 0.0040 - acc: 0.9987 - val_loss: 0.2289 - val_acc: 0.9792
Epoch 97/250
[==============================] - 3s - loss: 2.9621e-04 - acc: 1.0000 - val_loss: 0.2360 - val_acc: 0.9792
Epoch 98/250
[==============================] - 3s - loss: 4.3776e-04 - acc: 1.0000 - val_loss: 0.2437 - val_acc: 0.9774
The case you presented is a really complexed one. In order to answer your question if overfitting is actually happening in your case you need to answer two questions:
0.5
treshold).So - after checking these two concerns you may get an answer if your model overfit. The behaviour you presented is really nice - and what could be the actual reason behind is that there are few patterns in a validation set which are not properly covered in a training set. But this is something you should always take into account when you are designing a Machine Learning solution.