tensorflow keras deep-learning

Expected validation accuracy for Keras MobileNet V1 on CIFAR-10 (training from scratch)


Has anybody trained MobileNet V1 from scratch on CIFAR-10? What was the maximum accuracy you got? My validation accuracy is stuck at 70% after 110 epochs, even though my training accuracy is above 99%. Here is how I am creating the model.

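import tensorflow as tf
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model

# Create the MobileNet base without its classification head, with randomly initialized weights
MobileNet_model = tf.keras.applications.MobileNet(include_top=False, weights=None)

# Must define the input shape in the first layer of the neural network
x = Input(shape=(32, 32, 3), name='input')

# Create the custom model: MobileNet base followed by a dense classifier
model = MobileNet_model(x)
model = Flatten(name='flatten')(model)
model = Dense(1024, activation='relu', name='dense_1')(model)
output = Dense(10, activation=tf.nn.softmax, name='output')(model)

model_regular = Model(x, output, name='model_regular')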

I used the Adam optimizer with a learning rate of 0.001, amsgrad=True, and a batch size of 64. I also normalized the pixel data by dividing by 255.0. I am not using any data augmentation.

# `learning_rate` replaces the deprecated `lr` argument
optimizer1 = tf.keras.optimizers.Adam(learning_rate=0.001, amsgrad=True)

model_regular.compile(optimizer=optimizer1, loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model, using the test split as validation data
history = model_regular.fit(x_train, y_train_one_hot, validation_data=(x_test, y_test_one_hot), batch_size=64, epochs=100)
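The preprocessing that produces `x_train` and `y_train_one_hot` is not shown above. As a minimal sketch of what is described (scaling by 255.0 and one-hot labels), assuming the raw data are `uint8` NumPy arrays with integer labels of shape `(N, 1)`, as returned by `tf.keras.datasets.cifar10.load_data()`; the helper name `preprocess` is hypothetical:

```python
import numpy as np

def preprocess(images, labels, num_classes=10):
    """Scale uint8 pixels to [0, 1] and one-hot encode integer labels.

    Hypothetical helper for illustration; not part of the original post.
    """
    # Dividing by 255.0 maps pixel values from [0, 255] to [0.0, 1.0]
    x = images.astype("float32") / 255.0
    # Row i of the identity matrix is the one-hot vector for class i
    y = np.eye(num_classes, dtype="float32")[labels.reshape(-1)]
    return x, y

# Assumed usage with the Keras CIFAR-10 loader (downloads the dataset):
# (x_train_raw, y_train_raw), (x_test_raw, y_test_raw) = tf.keras.datasets.cifar10.load_data()
# x_train, y_train_one_hot = preprocess(x_train_raw, y_train_raw)
# x_test, y_test_one_hot = preprocess(x_test_raw, y_test_raw)
```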

I think I am supposed to get at least 75% according to https://arxiv.org/abs/1712.04698. Am I doing anything wrong, or is this the expected accuracy after 100 epochs? Here is a plot of my validation accuracy.

[Image: plot of validation accuracy]


Solution

  • MobileNet was designed for ImageNet, which is a much larger dataset, so training it on CIFAR-10 will inevitably result in overfitting. I would suggest plotting the loss (not the accuracy) for both training and validation/evaluation. Train hard until you reach 99% training accuracy, then look at the validation loss. If the model is overfitting, you will see the validation loss increase again after reaching its minimum.

    A few things to try to reduce overfitting:

    Then there are some usual training tricks:

    With some hyperparameter search, I got an evaluation loss of 0.85. I didn't use Keras; I wrote the MobileNet myself using TensorFlow.
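The loss check suggested in this answer can be sketched roughly as follows. This assumes the dictionary stored in `history.history` by Keras `fit()`, with `loss` and `val_loss` lists holding one value per epoch. The helper names are hypothetical and the numbers in the usage note are made up for illustration, not taken from a real run.

```python
def find_overfitting_point(history_dict):
    """Return (best_epoch, best_val_loss, rise_since_best).

    best_epoch is the 0-based epoch with the lowest validation loss.
    rise_since_best is how far the final validation loss sits above that
    minimum; a clearly positive value while the training loss keeps
    falling is the overfitting signature described above.
    """
    val_loss = history_dict["val_loss"]
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    best_val_loss = val_loss[best_epoch]
    return best_epoch, best_val_loss, val_loss[-1] - best_val_loss

def plot_losses(history_dict):
    """Plot training and validation loss per epoch (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so the helper above has no dependencies
    epochs = range(1, len(history_dict["loss"]) + 1)
    plt.plot(epochs, history_dict["loss"], label="training loss")
    plt.plot(epochs, history_dict["val_loss"], label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.show()
```

With the model from the question this would be called as `find_overfitting_point(history.history)`. For an illustrative `val_loss` of `[2.1, 1.2, 1.0, 1.5]`, the minimum is at epoch index 2 and the final loss is 0.5 above it, which suggests that training past that epoch only fits the training set more closely.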