python-3.x, machine-learning, classification, gradient-descent, shogun

When using Stochastic Gradient Descent with Shogun NeuralNetwork, everything is classified the same


I am attempting to classify a number of samples as 1 or 0, but when I use Stochastic Gradient Descent as the optimization algorithm, every sample ends up assigned to the same class (all 1s or all 0s).

When using the default optimizer (L-BFGS), it works as expected and classifies samples as both 1 and 0. I have tried adjusting the momentum, learning rate, batch size, decay, and error coefficient, but the result is the same every time. Any help would be greatly appreciated!

# Assumes X_train / X_test are RealFeatures and y_train / y_test are
# MulticlassLabels with classes 0 and 1, prepared earlier. Recent builds
# import from `shogun`; older ones expose the same classes via `modshogun`.
from shogun import (DynamicObjectArray, MulticlassAccuracy, NeuralInputLayer,
                    NeuralLogisticLayer, NeuralNetwork, NeuralSoftmaxLayer)

# Layer stack: input -> two logistic hidden layers -> 2-way softmax output
num_feats = X_train.get_num_features()
layers = DynamicObjectArray()
layers.append_element(NeuralInputLayer(num_feats))
layers.append_element(NeuralLogisticLayer(16))
layers.append_element(NeuralLogisticLayer(8))
layers.append_element(NeuralSoftmaxLayer(2))

MLP = NeuralNetwork(layers)
MLP.set_gd_momentum(0.9)
MLP.set_gd_learning_rate(0.001)
MLP.set_gd_mini_batch_size(200)
MLP.set_optimization_method(0)  # 0 = gradient descent; the default is L-BFGS

MLP.set_l2_coefficient(1e-4)
MLP.set_epsilon(1e-8)
MLP.set_max_num_epochs(200)

MLP.quick_connect()
MLP.initialize_neural_network()
MLP.set_labels(y_train)

MLP.train(X_train)                         # was `MLP.train` (never called)
y_pred_MLP = MLP.apply_multiclass(X_test)

acc = MulticlassAccuracy()
conf_mat_MLP = acc.get_confusion_matrix(y_pred_MLP, y_test)
print(conf_mat_MLP)

Prints:

[[2400    0]
 [ 314    0]]

The line that selects SGD instead of the default L-BFGS:

MLP.set_optimization_method(0)
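
A more readable way to write the same thing, assuming the Python bindings expose the ENNOptimizationMethod enum (where NNOM_GRADIENT_DESCENT = 0 and NNOM_LBFGS = 1, the latter being the default), is:

from shogun import NNOM_GRADIENT_DESCENT

MLP.set_optimization_method(NNOM_GRADIENT_DESCENT)  # identical to passing 0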

Note: I have used Stochastic Gradient Descent in the same way, on the exact same train/test split, in both scikit-learn and Weka, and neither produces this behaviour. I therefore suspect the problem is in how I am configuring the algorithm here, but I have no idea where!
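
For reference, a minimal scikit-learn counterpart of the setup above could look like the sketch below. MLPClassifier and the NumPy array names (X_train_np, etc.) are assumptions about how that comparison might have been set up, not details from the question:

from sklearn.neural_network import MLPClassifier

# Mirrors the Shogun configuration: two logistic hidden layers (16, 8),
# SGD with momentum 0.9, learning rate 0.001, batch size 200, L2 alpha 1e-4.
clf = MLPClassifier(hidden_layer_sizes=(16, 8), activation='logistic',
                    solver='sgd', learning_rate_init=0.001, momentum=0.9,
                    batch_size=200, alpha=1e-4, max_iter=200)
clf.fit(X_train_np, y_train_np)  # plain NumPy arrays, not Shogun features
print(clf.predict(X_test_np))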

Potentially useful links:

Docs: http://www.shogun-toolbox.org/api/latest/classshogun_1_1CNeuralNetwork.html

Source: http://www.shogun-toolbox.org/api/latest/NeuralNetwork_8h_source.html


Solution

  • You should lower your mini-batch size significantly; try 20 or so. With a mini-batch of 200 and a learning rate of 0.001, each epoch makes only a few very small weight updates, so after 200 epochs the network can easily still be stuck predicting the majority class; smaller batches give many more updates per epoch. (See the sketch below.)
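
Applied to the code above, this is a one-line change; everything else stays as posted:

MLP.set_gd_mini_batch_size(20)  # was 200; smaller batches mean many more updates per epoch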