I'm trying to get a convolutional neural network up and running to play the old NES Ice Climbers. Right now I use OpenCV to capture the screen as input, and the output is the action for the Ice Climber, such as walking left or right, or jumping. The problem I'm running into is that the trained model either doesn't actually learn or overfits, judging by the validation results.
I've tried reducing the number of outputs by excluding the jump command. I've tried different batch sizes, epochs, and test data. I've also tried changing the optimizer and the dimensions, but nothing had a significant impact.
Here is the code for capturing the screen and for training my model on that data. My training data is 900 sequential screen captures paired with the respective inputs I pressed while playing; I have around 10k such frame/input sequences saved from playing for the training data.
import time
import numpy as np
import cv2
from PIL import ImageGrab

def screen_record():
    global last_time
    # grab the game region of the screen (ImageGrab returns an RGB array)
    printscreen = np.array(ImageGrab.grab(bbox=(0, 130, 800, 640)))
    last_time = time.time()
    processed = greycode(printscreen)
    # dsize=(80, 60) means width 80 and height 60, so the array comes back as (60, 80)
    processed = cv2.resize(processed, (80, 60))
    cv2.imshow('AIBOX', processed)
    cv2.moveWindow('AIBOX', 500, 150)
    cv2.waitKey(1)  # give the OpenCV window a chance to refresh
    # training.append([processed, check_input()])   # enabled while recording training data
    processed = processed.reshape(-1, 60, 80, 1)    # keep the (height, width) layout the network expects
    result = AI.predict(processed, batch_size=1)
    print(result)
    AI_Control_Access(result)
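AI_Control_Access isn't shown above; a minimal sketch of such a helper, assuming pyautogui for the key events and an index-to-button mapping of 0 = left, 1 = right, could be:

import pyautogui

def AI_Control_Access(result):
    # result is the softmax output with shape (1, 2); pick the most likely action
    action = np.argmax(result[0])
    keys = ['left', 'right']          # assumed mapping: 0 -> walk left, 1 -> walk right
    for k in keys:
        pyautogui.keyUp(k)            # release whatever was held before
    pyautogui.keyDown(keys[action])   # hold down the predicted direction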
def greycode(screen):
    # ImageGrab returns RGB, so convert from RGB (not BGR) before edge detection
    greymap = cv2.cvtColor(screen, cv2.COLOR_RGB2GRAY)
    greymap = cv2.Canny(greymap, threshold1=200, threshold2=300)
    return greymap
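For reference, the training data in ICE_Train5.npy was recorded with essentially the same capture loop, with the commented-out training.append line enabled. A minimal sketch of that recording step, assuming a check_input() helper that returns the pressed buttons as a one-hot list, would be:

training = []

def record_frame():
    # same preprocessing as screen_record, but store the frame together with the pressed input
    frame = np.array(ImageGrab.grab(bbox=(0, 130, 800, 640)))
    frame = cv2.resize(greycode(frame), (80, 60))
    training.append([frame, check_input()])   # e.g. check_input() -> [1, 0] for "walk left"

def save_training():
    # each entry is a [frame, one_hot_input] pair, which is what network_train() indexes into
    np.save('ICE_Train5.npy', np.array(training, dtype=object))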
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import SGD

def network_train():
    train_data = np.load('ICE_Train5.npy')
    train = train_data[::7]     # every 7th frame/input pair for training
    test = train_data[-3:]      # the last 3 pairs held out for testing
    # each saved entry is a [processed_frame, one_hot_input] pair; frames are (60, 80)
    x_train = np.array([i[0] for i in train]).reshape(-1, 60, 80, 1)
    x_test = np.array([i[0] for i in test]).reshape(-1, 60, 80, 1)
    y_train = np.asarray([i[1] for i in train])
    y_test = np.asarray([i[1] for i in test])

    model = Sequential()
    model.add(Convolution2D(32, (3, 3), activation='relu', input_shape=(60, 80, 1)))
    model.add(Convolution2D(16, (5, 5), activation='relu', strides=4))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2, activation='softmax'))

    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    # pass the SGD instance itself; the string 'sgd' would ignore the settings above
    model.compile(loss='categorical_crossentropy', optimizer=sgd)
    model.fit(x_train, y_train, batch_size=450, epochs=50, verbose=1, shuffle=False)
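The validation step itself isn't in the snippet above; roughly, I compare the model's predictions on the held-out frames with the recorded inputs, along these lines (x_test and y_test as defined in network_train):

preds = model.predict(x_test, batch_size=1)
accuracy = np.mean(np.argmax(preds, axis=1) == np.argmax(y_test, axis=1))
print('test accuracy: {:.0%}'.format(accuracy))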
When I run it against the test data for validation, the highest accuracy I could get was around 16%, and even when I use that model to actually play the game it always predicts the same button press. So I think it's either overfitting or not learning at all, but since this is my first time using a convolutional network, I'm unsure how to tweak it to be more responsive to training.
The general setup sounds more like a good fit for reinforcement learning.
If you want to stick with the supervised learning setup, you should first check whether your classes have roughly the same number of training examples. If they do, you can experiment with the learning rate, more regularization (dropout), the network architecture, and so on.
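For example, a quick way to check the label distribution, and to build a class_weight dict that could be passed to model.fit, would be something along these lines (y_train being the one-hot label array from network_train):

labels = np.argmax(y_train, axis=1)                  # back from one-hot to class indices
classes, counts = np.unique(labels, return_counts=True)
print(dict(zip(classes, counts)))                    # a very skewed count here would explain constant predictions

# weight each class inversely to its frequency; pass this as class_weight=... to model.fit
class_weight = {int(c): len(labels) / (len(classes) * n) for c, n in zip(classes, counts)}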