I wrote a simple neural net/MLP and I'm getting some strange accuracy values, so I wanted to double-check things.
This is my intended setup: a feature matrix with 913 samples and 192 features, i.e. shape (913, 192). I'm classifying 2 outcomes, so my labels are binary with shape (913, 1). There is 1 hidden layer with 100 units (for now). All activations are tanh, all weights get l2 regularization, and everything is optimized with SGD.
The code is below. It was written in Python with the Keras framework (http://keras.io/), but my question isn't specific to Keras.
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
from keras.regularizers import l2
from keras.callbacks import History

input_size = 192      # number of features
hidden_size = 100     # units in the single hidden layer
output_size = 1       # single binary label
lambda_reg = 0.01     # l2 weight penalty
learning_rate = 0.01
num_epochs = 100
batch_size = 10

model = Sequential()
model.add(Dense(input_size, hidden_size, W_regularizer=l2(lambda_reg), init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(hidden_size, output_size, W_regularizer=l2(lambda_reg), init='uniform'))
model.add(Activation('tanh'))

sgd = SGD(lr=learning_rate, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd, class_mode="binary")

history = History()
model.fit(features_all, labels_all, batch_size=batch_size, nb_epoch=num_epochs,
          show_accuracy=True, verbose=2, validation_split=0.2, callbacks=[history])
score = model.evaluate(features_all, labels_all, show_accuracy=True, verbose=1)
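For reference, I print the result at the end; as far as I can tell, in this Keras version evaluate() with show_accuracy=True returns a (loss, accuracy) pair:

print('loss, accuracy:', score)   # should be (loss, accuracy) when show_accuracy=True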
I want to double-check that the code I wrote actually does what I intend, in terms of my choice of parameters and their values.
Using the code above, I get training and test set accuracy hovering around 50-60%. Maybe I'm just using bad features, but I wanted to see what might be going wrong, so I manually set all the labels and features to something that should be trivially predictable:
labels_all[:500] = 1                      # first 500 samples get label 1
labels_all[500:] = 0                      # remaining 413 samples get label 0
features_all[:500] = np.ones(192) * 500   # every feature of the first 500 samples = 500
features_all[500:] = np.ones(192)         # every feature of the remaining samples = 1
So the first 500 samples are labelled 1 and everything else is labelled 0. Every feature of each of the first 500 samples is set to 500, and every feature of the remaining samples is set to 1.
When I run this, I get training accuracy of around 65% and validation accuracy of around 0%. I was expecting both accuracies to be extremely high/almost perfect - is that expectation wrong? My thinking was that the samples with extremely high feature values all share the label 1, while the samples with low feature values all have the label 0.
Mostly I'm just wondering if my code/model is incorrect or whether my logic is wrong.
I don't know that library, so I can't tell you if this is correctly implemented, but it looks legit.
I think your problem lies with the activation function - tanh(500) ≈ 1 and tanh(1) ≈ 0.76, so your two very different inputs collapse to almost the same activation; that difference seems too small to me. Try using -1 instead of 500 for testing purposes, and normalize your real data to roughly [-2, 2]. If you need the full range of real numbers, try a linear activation function. If you only care about the positive half of the real numbers, I'd propose softplus or ReLU. I've checked, and all of those functions are provided with Keras.
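For example, something like this (a rough sketch - standardizing each feature is just one common way to land in that range, not the only option):

import numpy as np

# Standardize each feature column so the inputs sit in a small range
# instead of saturating tanh (roughly [-2, 2] for most samples).
means = features_all.mean(axis=0)
stds = features_all.std(axis=0) + 1e-8    # guard against zero-variance columns
features_norm = (features_all - means) / stds

# If you want a different activation, only the Activation line changes, e.g.:
# model.add(Activation('relu'))           # or 'softplus', or 'linear'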
You can try thresholding your output too - an output of 0.75 when 1 is expected, or 0.25 when 0 is expected, is a valid prediction but may hurt your accuracy.
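Something along these lines (a sketch - the 0.5 cutoff is just one plausible choice for outputs that should be 0 or 1):

import numpy as np

# Threshold the raw network outputs before comparing them to the labels.
# (assumes the model and data from the question above)
raw = model.predict(features_all, batch_size=10)
preds = (raw > 0.5).astype(int)
print('thresholded accuracy:', np.mean(preds.flatten() == labels_all.flatten()))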
Also, try tweaking your parameters. Based on my own experience, I'd say that learning rate, number of epochs, momentum and lambda are the most important factors here, in order from most to least important.
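If it helps, this is roughly how I'd sweep them with the API from your question (the candidate values below are placeholders, not a recommendation):

# Rebuild and retrain the model for a few candidate learning rates.
# (assumes the imports and variables from the question above; the
#  value list is purely illustrative)
def build_model(lr, momentum, lambda_reg):
    m = Sequential()
    m.add(Dense(input_size, hidden_size, W_regularizer=l2(lambda_reg), init='uniform'))
    m.add(Activation('tanh'))
    m.add(Dropout(0.5))
    m.add(Dense(hidden_size, output_size, W_regularizer=l2(lambda_reg), init='uniform'))
    m.add(Activation('tanh'))
    m.compile(loss='mean_squared_error',
              optimizer=SGD(lr=lr, decay=1e-6, momentum=momentum, nesterov=True),
              class_mode="binary")
    return m

for lr in [0.1, 0.01, 0.001]:
    m = build_model(lr, momentum=0.9, lambda_reg=0.01)
    m.fit(features_all, labels_all, batch_size=batch_size, nb_epoch=num_epochs,
          show_accuracy=True, verbose=0, validation_split=0.2)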
PS. I've just spotted that you're initializing your weights uniformly. I can't tell you exactly why, but my intuition tells me that this is a bad idea - I'd go with small random (e.g. Gaussian) initial weights.
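In your code that's just the init argument; for example (a sketch - 'normal' draws small random Gaussian weights, and your Keras version may also offer 'glorot_uniform'):

# Same layers as before, only the weight initialization changes.
model = Sequential()
model.add(Dense(input_size, hidden_size, W_regularizer=l2(lambda_reg), init='normal'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(hidden_size, output_size, W_regularizer=l2(lambda_reg), init='normal'))
model.add(Activation('tanh'))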