tensorflow, keras, deep-learning, classification, conv-neural-network

Gender detection and age classification using deep learning


I am working on a project whose goal is to detect gender and classify age from face images. I did some research and found the paper "Age and Gender Classification using Convolutional Neural Networks" by Gil Levi and Tal Hassner. I tried to replicate their deep network, originally written in Caffe, in Keras. The problem is that the model is stuck at 50% accuracy (basically a random coin toss). What am I doing wrong? Any help is much appreciated. BTW, I am using the Adience dataset, as in the original paper.

PS: I have completely removed the LRN layers, since they are not available in Keras. (I think their absence should not hurt the accuracy of the model.)
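If the LRN layers do turn out to matter, I could wrap TensorFlow's tf.nn.local_response_normalization in a Keras Lambda layer. This is a minimal sketch assuming AlexNet-style hyperparameters, which I have not verified against the original Caffe prototxt:

from keras.layers import Lambda
import tensorflow as tf

def lrn(x):
    # depth_radius=2 corresponds to Caffe's local_size=5 (window = 2*radius + 1)
    return tf.nn.local_response_normalization(x, depth_radius=2, bias=1.0,
                                              alpha=1e-4, beta=0.75)

# e.g. gender_model.add(Lambda(lrn)) after each MaxPooling2D, mirroring the paper

Here is the code: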

# imports (only the ones actually used below)
import numpy as np
from keras.models import Sequential
from keras.callbacks import ModelCheckpoint
from keras.layers import Dense, Conv2D, Flatten, MaxPooling2D, Dropout
from keras import initializers
from keras import optimizers

# creating the model object
gender_model = Sequential()

# adding layers to the model
# first convolutional layer
gender_model.add(Conv2D(96, kernel_size=(7, 7), activation='relu', strides=4,
                        input_shape=(227, 227, 3),
                        kernel_initializer=initializers.RandomNormal(stddev=0.01),
                        use_bias=True, bias_initializer='zeros',
                        data_format='channels_last'))

gender_model.add( MaxPooling2D(pool_size=3 , strides=2) )

# second convolutional layer (input_shape is only needed on the first layer)
gender_model.add(Conv2D(256, kernel_size=(5, 5), activation='relu', strides=1,
                        padding='same',
                        kernel_initializer=initializers.RandomNormal(stddev=0.01),
                        use_bias=True, bias_initializer='ones',
                        data_format='channels_last'))
gender_model.add( MaxPooling2D(pool_size=3 , strides=2) )
# third convolutional layer
gender_model.add(Conv2D(384, kernel_size=(3, 3), activation='relu', strides=1,
                        padding='same',
                        kernel_initializer=initializers.RandomNormal(stddev=0.01),
                        use_bias=True, bias_initializer='zeros',
                        data_format='channels_last'))
gender_model.add( MaxPooling2D(pool_size=3 , strides=2) )

# Now we flatten the output of the last convolutional layer
gender_model.add( Flatten() )

# first fully connected layer
gender_model.add(Dense(512, activation='relu', use_bias=True,
                       kernel_initializer=initializers.RandomNormal(stddev=0.005),
                       bias_initializer='ones'))
gender_model.add(Dropout(0.5))

# second fully connected layer
gender_model.add(Dense(512, activation='relu', use_bias=True,
                       kernel_initializer=initializers.RandomNormal(stddev=0.005),
                       bias_initializer='ones'))
gender_model.add(Dropout(0.5))

# final softmax layer over the two gender classes
gender_model.add(Dense(2, activation='softmax', use_bias=True,
                       kernel_initializer=initializers.RandomNormal(stddev=0.01),
                       bias_initializer='zeros'))

# compiling the model
sgd_optimizer = optimizers.SGD(lr=0.0001, decay=1e-7, momentum=0.0, nesterov=False)
gender_model.compile(optimizer=sgd_optimizer, loss='categorical_crossentropy',
                     metrics=['accuracy'])
gender_model.summary()


# partitioning the loaded data into training and validation sets
X = np.load('/content/drive/My Drive/X.npy')
y = np.load('/content/drive/My Drive/y_m.npy')

X_train = X[:15000]
y_train = y[:15000]

X_val = X[15000:] 
y_val = y[15000:]

# creating the checkpoint path
chkpt_path = 'weights-improvement-{epoch:02d}--{val_acc:.2f}.hdf5'

checkpoint = ModelCheckpoint(chkpt_path, monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')
callback_list = [checkpoint]

# finally, training the model
gender_model.fit(X_train, y_train,
                 batch_size=50,
                 epochs=100,
                 validation_data=(X_val, y_val),
                 shuffle=True,
                 callbacks=callback_list)
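(For reference: categorical_crossentropy with the 2-way softmax requires one-hot labels of shape (n_samples, 2), so a quick shape check on the loaded y is worth running before training. Variable names here are the ones from the code above:)

# sanity check: categorical_crossentropy expects one-hot labels of shape (n, 2)
assert y.ndim == 2 and y.shape[1] == 2, 'labels must be one-hot encoded'
# if y holds integer class ids instead, convert it first:
# from keras.utils import to_categorical
# y = to_categorical(y, num_classes=2)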

Solution

  • There was a problem in my approach: I was not cropping the faces, so the model could not make sense of the random background in every image. A cropping sketch is below.
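A minimal face-cropping sketch using OpenCV's bundled Haar cascade; the detector choice and its parameters are illustrative assumptions, not what the paper or the Adience pipeline uses:

# crop the face region before feeding images to the network
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

def crop_face(img_bgr, size=227):
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # no face detected; skip this image
    x, y, w, h = faces[0]  # take the first detection
    return cv2.resize(img_bgr[y:y+h, x:x+w], (size, size))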