I am trying to implement AlexNet for classification on the Intel image dataset. Training accuracy is high (around 0.84), but validation accuracy stays very low (0.16) and never changes. I have tried different optimizers and learning rates, but it didn't help.
Thank you for your help.
The data consists of about 14k training images and 3k test images across 6 classes. Here are the shapes of the datasets:
Train X shape: (14034, 150, 150, 3)
Test X shape: (3000, 150, 150, 3)
Train Y shape: (14034, 6)
Test Y shape: (3000, 6)
Here is the code:
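# Imports needed by the snippet below
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import SGD
# Creating AlexNet network
model = keras.Sequential()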
# Layer 1
model.add(Conv2D(96, (11, 11), strides=4, activation='relu', input_shape=(150, 150, 3)))
model.add(MaxPool2D(pool_size=(3, 3), strides=2))
# Layer 2
model.add(Conv2D(256, (5, 5), strides=1, padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(3, 3), strides=2))
# Layer 3, 4, and 5
model.add(Conv2D(384, (3, 3), strides=1, padding='same', activation='relu'))
model.add(Conv2D(384, (3, 3), strides=1, padding='same', activation='relu'))
model.add(Conv2D(256, (3, 3), strides=1, padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(3, 3), strides=2))
# Layer 6
model.add(Flatten())
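model.add(Dense(4096, activation='relu'))  # ReLU on the fully connected layers, as in the original AlexNet
# Layer 7
model.add(Dropout(0.5))
model.add(Dense(4096, activation='relu'))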
# Layer 8
model.add(Dropout(0.5))
model.add(Dense(6, activation='softmax'))
opt = SGD(learning_rate=0.01)
model.compile(optimizer=opt,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
train_dataset = tf.data.Dataset.from_tensor_slices((train_X, train_Y)).batch(128)
test_dataset = tf.data.Dataset.from_tensor_slices((test_X, test_Y)).batch(128)
# Note: fit() ignores shuffle=True when the input is a tf.data.Dataset
history = model.fit(train_dataset, epochs=70, validation_data=test_dataset, shuffle=True)
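To make the architecture above easier to check, here is a rough sketch that traces the spatial size of the feature maps through the network for a 150x150 input. The helper `conv_out` is a hypothetical name introduced only for this illustration; it assumes the Keras defaults (padding='valid' unless 'same' is given).

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a conv/pool layer along one dimension."""
    if padding == 'same':
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1  # 'valid' padding

size = 150  # input height/width
shapes = []
for name, kernel, stride, padding in [
    ('conv1 11x11/4', 11, 4, 'valid'),
    ('pool1 3x3/2', 3, 2, 'valid'),
    ('conv2 5x5/1', 5, 1, 'same'),
    ('pool2 3x3/2', 3, 2, 'valid'),
    ('conv3-5 3x3/1', 3, 1, 'same'),
    ('pool3 3x3/2', 3, 2, 'valid'),
]:
    size = conv_out(size, kernel, stride, padding)
    shapes.append((name, size))
    print(f'{name}: {size}x{size}')

# 256 channels come out of the last conv layer before Flatten
flat_features = size * size * 256
print('Flatten output:', flat_features)
```

With these layer settings the feature map shrinks to 3x3 before Flatten, so the first Dense layer sees 2304 inputs.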
Here is the output:
Epoch 1/70
110/110 [==============================] - 413s 4s/step - loss: 0.5867 - accuracy: 0.8835 - val_loss: 6.4170 - val_accuracy: 0.1670
Epoch 2/70
110/110 [==============================] - 421s 4s/step - loss: 0.8547 - accuracy: 0.7973 - val_loss: 5.3743 - val_accuracy: 0.1670
Epoch 3/70
67/110 [=================>............] - ETA: 2:43 - loss: 0.8841 - accuracy: 0.7100 - val_loss: 4.8517 - val_accuracy: 0.1670
val_accuracy does not change at all.
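As a quick sanity check on the numbers in the log, the arithmetic below uses only figures from this post (14034 training images, 3000 test images, 6 classes, batch size 128). It suggests the stuck val_accuracy sits at roughly chance level, which would be consistent with the model predicting the same class for every validation image, assuming the classes are roughly balanced.

```python
import math

num_train, num_val, num_classes, batch_size = 14034, 3000, 6, 128

# Batches per epoch; should match the "110/110" shown in the progress bar
steps_per_epoch = math.ceil(num_train / batch_size)

# Accuracy of a uniform guess over 6 classes
chance_accuracy = 1 / num_classes

# Number of validation images counted as correct at val_accuracy = 0.1670
same_class_hits = round(0.1670 * num_val)

print(steps_per_epoch, round(chance_accuracy, 4), same_class_hits)
```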
Apparently the model was not learning anything that generalized to the validation set. I solved this by switching from SGD to the Adam optimizer and lowering the learning rate from 0.01 to 0.001.
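For reference, a minimal sketch of that change might look like the following. The tiny network and the random `dummy_X` / `dummy_Y` arrays are placeholders invented for this example so that it runs on its own; the AlexNet layers and the real `train_X` / `train_Y` would take their place. Shuffling inside the tf.data pipeline is an extra assumption on my part, not something I verified as necessary; it is included only because fit() ignores shuffle=True for Dataset inputs.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Tiny stand-in network (hypothetical); the AlexNet layers would go here instead
model = keras.Sequential([
    keras.Input(shape=(150, 150, 3)),
    keras.layers.Conv2D(8, (3, 3), strides=4, activation='relu'),
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(6, activation='softmax'),
])

# The change described above: Adam with learning rate 0.001 instead of SGD with 0.01
opt = keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Random placeholder data with the same per-sample shapes as train_X / train_Y
dummy_X = np.random.rand(32, 150, 150, 3).astype('float32')
dummy_Y = keras.utils.to_categorical(np.random.randint(0, 6, size=32), num_classes=6)

# Assumption: shuffle in the pipeline, before batching
train_dataset = (tf.data.Dataset.from_tensor_slices((dummy_X, dummy_Y))
                 .shuffle(buffer_size=len(dummy_X))
                 .batch(16))

history = model.fit(train_dataset, epochs=1, verbose=0)
```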