I'm trying to build a simple neural network that classifies product images by product type, i.e., given a new product image, it should tell which product category (books, toys, electronics, etc.) the product belongs to.
I have a couple of product images under each product number, and each product number has a label (its product type) in an Excel sheet.
Below is my code:
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Activation
from keras.optimizers import SGD
from keras.layers import Dense
from keras.utils import np_utils
from imutils import paths
import numpy as np
import argparse
import cv2
import os
import xlwt
import xlrd
import glob2
import pickle
def image_to_feature_vector(image, size=(32,32)):
    # resize to a fixed size, then flatten into a 1-D vector of raw pixel values
    return cv2.resize(image, size).flatten()
def read_data(xls = "/Desktop/num_to_product_type.xlsx"):
    # build a dict mapping product number -> product type from the first sheet
    book = xlrd.open_workbook(xls)
    sheet = book.sheet_by_index(0)
    d = {}
    for row_index in xrange(1, sheet.nrows): # skip heading row
        prod_type, prod_num = sheet.row_values(row_index, end_colx=2)
        prod_type = unicode(prod_type).encode('UTF8')
        prod_num = unicode(prod_num).encode('UTF8')
        d[prod_num] = prod_type
    return d
def main():
    try:
        imagePaths=[]
        print("[INFO] describing images...")
        for path, subdirs, files in os.walk(r'/Desktop/data'):
            for filename in files:
                imagePaths.append(os.path.join(path, filename))
        files = glob2.glob('/Desktop/data/**/.DS_Store')
        for i in files:
            imagePaths.remove(i)
    except:
        pass
    dd = read_data()
    # initialize the data matrix and labels list
    data = []
    labels1 = []
    for (i, imagePath) in enumerate(imagePaths):
        image = cv2.imread(imagePath)
        #print(image.shape)
        subdir = imagePath.split('/')[-2]
        for k, v in dd.items():
            if k == subdir:
                label = v
                break
        features = image_to_feature_vector(image)
        data.append(features)
        labels1.append(label)
        # show an update every 1,000 images
        if i > 0 and i % 1000 == 0:
            print("[INFO] processed {}/{}".format(i, len(imagePaths)))
    print("String Labels")
    print(labels1)
    # encode the labels, converting them from strings to integers
    le = LabelEncoder()
    labels = le.fit_transform(labels1)
    print(labels)
    d={}
    d[labels[0]] = labels1[0]
    for i in range(1,len(labels)-1):
        if labels[i-1] != labels[i] and labels[i] == labels[i+1]:
            d[labels[i]] = labels1[i]
    data = np.array(data) / 255.0
    labels = np_utils.to_categorical(labels, 51)
    print("To_Categorical")
    print(labels)
    print("[INFO] constructing training/testing split...")
    (trainData, testData, trainLabels, testLabels) = train_test_split(
        data, labels, test_size=0.25, random_state=42)
    model = Sequential()
    model.add(Dense(768, input_dim=3072, init="uniform",
        activation="relu"))
    model.add(Dense(384, init="uniform", activation="relu"))
    model.add(Dense(51))
    model.add(Activation("softmax"))
    print("[INFO] compiling model...")
    sgd = SGD(lr=0.125)
    model.compile(loss="categorical_crossentropy", optimizer=sgd,
        metrics=["accuracy"])
    model.fit(trainData, trainLabels, nb_epoch=50, batch_size=750)
    # Test the model
    # show the accuracy on the testing set
    print("[INFO] evaluating on testing set...")
    (loss, accuracy) = model.evaluate(testData, testLabels,
        batch_size=128, verbose=1)
    print("[INFO] loss={:.4f}, accuracy: {:.4f}%".format(loss,
        accuracy * 100))

if __name__ == '__main__':
    main()
The neural network is a 3072-768-384-51 feedforward neural network. Layer 0 contains 3072 inputs (the flattened 32x32x3 image). Layers 1 & 2 are hidden layers containing 768 & 384 nodes respectively. Layer 3 is the output layer, which has 51 nodes (one per product category type). However, with this I'm getting very low accuracy, only about 45-50%.
Is there something wrong with what I'm doing? How do you increase the accuracy of the neural network? I read somewhere that it can be done by "cross-validation and hyperparameter tuning", but how is that done? Sorry, I'm very new to neural networks, just trying something new. Thanks.
For creating an image classifier in Keras I would suggest trying a convolutional neural network, as they tend to work much better for images. Also, normalizing between layers can help training, which should in turn yield better validation/test accuracy. (It is the same idea as normalizing the data before training.)
To add a convolutional layer in Keras, simply call model.add(Conv2D(params)),
and to normalize between layers you can call model.add(BatchNormalization()).
Convolutional neural networks are more advanced but better suited to images. The difference is that a convolutional layer is, at a high level, just a "mini" neural network scanning over patches of the image. This is important because, for example, you can have the EXACT same object in two images, but if it sits in a different place in each image, a normal (fully connected) neural network would see two different objects rather than the same object in different places.
So this "mini" neural network that scans the image in patches (the patch size is often referred to as the kernel size) is more inclined to pick up on similar features of objects. The object features are then trained into the network, so even if the object appears in different areas of your images it can be more accurately recognized as the same thing. This is the key to why a convolutional neural network is better for working with images.
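To make the "same object, different place" point concrete, here is a minimal NumPy sketch (not Keras, and not how Keras implements it internally). The helper `slide_kernel`, the toy 2x2 pattern and the 8x8 images are all made up for illustration; the kernel is set by hand here, whereas a real network would learn it.

```python
import numpy as np

def slide_kernel(image, kernel):
    """Apply `kernel` to every patch of `image` (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# A toy 2x2 "object" and a kernel that matches it (hypothetical values).
pattern = np.array([[1.0, 0.0],
                    [0.0, 1.0]])
kernel = pattern.copy()

# The same object placed at two different positions in an 8x8 image.
image_a = np.zeros((8, 8))
image_a[1:3, 1:3] = pattern
image_b = np.zeros((8, 8))
image_b[5:7, 4:6] = pattern

resp_a = slide_kernel(image_a, kernel)
resp_b = slide_kernel(image_b, kernel)

# The flattened pixel vectors that a dense layer would see are different ...
same_flat_input = np.array_equal(image_a.flatten(), image_b.flatten())
# ... but the strongest kernel response is identical, only its position moves.
peak_a = np.unravel_index(resp_a.argmax(), resp_a.shape)
peak_b = np.unravel_index(resp_b.argmax(), resp_b.shape)
print(same_flat_input, resp_a.max(), resp_b.max(), peak_a, peak_b)
```

The kernel gives the same peak response for both images, so later layers can learn "this feature is present" without having to relearn it for every position.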
Here is a basic example in Keras 2 with normalization, based on an NVIDIA model architecture:
from keras.models import Sequential
from keras.layers import Cropping2D, Lambda, Conv2D, BatchNormalization, Dropout, Flatten, Dense

model = Sequential()
# crop the images to get rid of irrelevant features if needed...
# input_shape is (height, width, rgb_depth); (32, 32, 3) assumes the 32x32 color images from the question
model.add(Cropping2D(cropping=((0, 0), (0, 0)), input_shape=(32, 32, 3)))
model.add(Lambda(lambda x: (x - 128) / 128)) # scale raw 0-255 pixels to roughly [-1, 1], centered near 0
model.add(Conv2D(24, (2,2), strides=(2,2), padding='valid', activation='elu')) # 1st convolution
model.add(BatchNormalization()) # normalize between layers
model.add(Conv2D(36, (2,2), strides=(2,2), padding='valid', activation='elu')) # 2nd convolution
model.add(BatchNormalization())
model.add(Conv2D(48, (1,1), strides=(2,2), padding='valid', activation='elu')) # 3rd convolution
model.add(BatchNormalization())
# model.add(Conv2D(64, (3,3), strides=(1,1), padding='valid', activation='elu')) # 4th convolution
# model.add(BatchNormalization())
# model.add(Conv2D(64, (3,3), strides=(1,1), padding='valid', activation='elu')) # 5th convolution
# model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Flatten()) # flatten the dimensions
model.add(Dense(100, activation='elu')) # 1st fully connected layer
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(51, activation='softmax')) # label output as probabilities
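One practical detail: convolutional layers expect each image as a (height, width, channels) array, while the question's image_to_feature_vector flattens every image to 3072 values. A rough sketch of undoing that, assuming 32x32 images with 3 color channels as in the question (the random `flat` array below is only a stand-in for the real data):

```python
import numpy as np

# Stand-in for the question's `data`: 4 flattened 32x32x3 images
# with raw 0-255 pixel values (random, for illustration only).
rng = np.random.RandomState(0)
flat = rng.randint(0, 256, size=(4, 3072)).astype("float32")

# Undo the flatten: each row becomes one (height, width, channels) image.
images = flat.reshape(-1, 32, 32, 3)
print(images.shape)
```

Alternatively, skip the .flatten() call when loading the images. Since the Lambda layer above already rescales raw 0-255 pixels, the separate division by 255.0 from the question would not be needed with this model.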
Lastly, hyperparameter tuning just means adjusting batch size, number of epochs, learning rate, etc. to achieve the best result. All you can do there is experiment and see what works best.
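If you want to make that experimenting a bit more systematic, one common approach is a grid search scored with k-fold cross-validation: try every combination of a few candidate values, and score each one by averaging over several train/validation splits. Below is a minimal sketch of the idea. The names `k_fold_indices`, `grid_search` and `fake_train_and_score` are made up for this example, and the fake scoring function is only a placeholder; in practice it would build the Keras model, fit it on the training indices and return the validation accuracy.

```python
import itertools
import numpy as np

def k_fold_indices(n_samples, k, seed=42):
    """Yield (train_idx, val_idx) index arrays for k-fold cross-validation."""
    idx = np.random.RandomState(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

def grid_search(train_and_score, n_samples, grid, k=3):
    """Score every parameter combination by its mean validation score."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [train_and_score(train_idx, val_idx, **params)
                  for train_idx, val_idx in k_fold_indices(n_samples, k)]
        results.append((params, float(np.mean(scores))))
    best_params, best_score = max(results, key=lambda item: item[1])
    return best_params, best_score, results

def fake_train_and_score(train_idx, val_idx, lr, batch_size):
    # Placeholder: a real version would train on train_idx and
    # return the accuracy measured on val_idx.
    return 1.0 - abs(lr - 0.01) - abs(batch_size - 64) / 1000.0

grid = {"lr": [0.1, 0.01, 0.001], "batch_size": [32, 64, 128]}
best_params, best_score, results = grid_search(
    fake_train_and_score, n_samples=100, grid=grid)
print(best_params, best_score)
```

Keep the test set out of this loop entirely and only evaluate on it once, with the best combination, so the final accuracy is not biased by the tuning.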