tensorflow, neural-network, mnist

neural network - Predict MNIST digits only with one neuron in the output layer


Is it possible to create a neural network with a single neuron in the output layer that directly predicts the digit from the MNIST dataset after training? For example, given an image of the digit 3 as input, the output neuron should produce a value close to 3.

Note: The network may have any number of hidden layers, each with any number of neurons.

This is what I have tried using TensorFlow:

import tensorflow as tf
from mnist import MNIST
import numpy as np


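# One sample per step: a flattened 28x28 image (784 pixels) and its digit label
inputs = tf.placeholder(tf.float32, shape=(1, 784))
labels = tf.placeholder(tf.float32, shape=(1, 1))

# A single dense layer with one output unit, i.e. the one output neuron
logits = tf.layers.dense(inputs, 1)
loss = 9.0 * tf.sigmoid(logits) - labels  # Scale the sigmoid so the predicted value lies in [0, 9]

train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)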

init = tf.global_variables_initializer()

with tf.Session() as sess:
  sess.run(init)
  mnist = MNIST()
  for i in range(100001):
    data, label = mnist.get_train_data()
    lab = np.zeros((1, 1), np.float32)
    lab[0][0] = label
    _, _loss, _logits = sess.run([train_op, loss, logits], feed_dict={inputs: np.reshape(data, (1, 784)), labels: lab})
    if i%5000 == 0:
      print("Step: %d Loss: %6f <== logits %s, Actual: %6f" % (i, _loss, str(_logits),  label))

Output:

Step: 0 Loss: -0.436195 <== logits [[ 0.02835961]], Actual: 5.000000
Step: 5000 Loss: -6.999933 <== logits [[-11.80182171]], Actual: 7.000000
Step: 10000 Loss: -2.999990 <== logits [[-13.7065649]], Actual: 3.000000
Step: 15000 Loss: -4.999864 <== logits [[-11.09644413]], Actual: 5.000000
Step: 20000 Loss: -5.000000 <== logits [[-17.01583481]], Actual: 5.000000
Step: 25000 Loss: -2.999971 <== logits [[-12.66251564]], Actual: 3.000000
Step: 30000 Loss: -2.999927 <== logits [[-11.72266102]], Actual: 3.000000
Step: 35000 Loss: -0.999898 <== logits [[-11.38729763]], Actual: 1.000000
Step: 40000 Loss: -7.000000 <== logits [[-17.59585381]], Actual: 7.000000
Step: 45000 Loss: -3.000000 <== logits [[-17.72655296]], Actual: 3.000000
Step: 50000 Loss: -5.000000 <== logits [[-16.65830421]], Actual: 5.000000
Step: 55000 Loss: -6.999999 <== logits [[-15.97771645]], Actual: 7.000000
Step: 60000 Loss: -3.000000 <== logits [[-17.10641289]], Actual: 3.000000
Step: 65000 Loss: -4.999984 <== logits [[-13.26896667]], Actual: 5.000000
Step: 70000 Loss: -5.000000 <== logits [[-19.57778549]], Actual: 5.000000
Step: 75000 Loss: -2.999995 <== logits [[-14.30502892]], Actual: 3.000000
Step: 80000 Loss: -2.999982 <== logits [[-13.13857365]], Actual: 3.000000
Step: 85000 Loss: -0.999971 <== logits [[-12.63682747]], Actual: 1.000000
Step: 90000 Loss: -7.000000 <== logits [[-19.08620071]], Actual: 7.000000
Step: 95000 Loss: -3.000000 <== logits [[-19.23719406]], Actual: 3.000000
Step: 100000 Loss: -5.000000 <== logits [[-17.85402298]], Actual: 5.000000
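
One thing the log above suggests: the reported loss settles at roughly the negative of the label while the logits keep drifting more negative. That is consistent with `loss` being a signed difference rather than an error measure, so gradient descent can keep lowering it by pushing `sigmoid(logits)` toward 0, whatever the label is. The sketch below is not the original training code. It is a plain-Python, single-parameter illustration (the names `run_descent`, `signed_loss_grad`, and `squared_loss_grad` are made up for this example) that compares the signed difference with a squared error on one fixed label.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def signed_loss_grad(z, y):
    # Gradient of (9*sigmoid(z) - y) w.r.t. z: always positive and independent of y,
    # so descent keeps decreasing z no matter what the label is.
    s = sigmoid(z)
    return 9.0 * s * (1.0 - s)

def squared_loss_grad(z, y):
    # Gradient of (9*sigmoid(z) - y)**2 w.r.t. z: zero where the prediction equals y.
    s = sigmoid(z)
    return 2.0 * (9.0 * s - y) * 9.0 * s * (1.0 - s)

def run_descent(grad_fn, y, steps=5000, lr=0.01, z=0.0):
    # Plain gradient descent on a single scalar "logit" z (illustrative only).
    for _ in range(steps):
        z -= lr * grad_fn(z, y)
    return z

label = 3.0
z_signed = run_descent(signed_loss_grad, label)
z_squared = run_descent(squared_loss_grad, label)

pred_signed = 9.0 * sigmoid(z_signed)    # collapses toward 0
pred_squared = 9.0 * sigmoid(z_squared)  # moves toward the label

print("signed difference: logit=%.3f prediction=%.3f loss=%.3f"
      % (z_signed, pred_signed, pred_signed - label))
print("squared error:     logit=%.3f prediction=%.3f loss=%.6f"
      % (z_squared, pred_squared, (pred_squared - label) ** 2))
```

In the TensorFlow graph above, the analogous change would presumably be to square the difference, for example `tf.square(9.0 * tf.sigmoid(logits) - labels)`. That is a suggestion based on this toy comparison, not something verified against the original data.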

Solution

  • Yes, it is possible, but it is not a good idea. Digit recognition is a classification problem; by using a single output neuron you are treating it as a regression problem. That implicitly assumes that digits which are numerically close also look alike, which is not the case. For instance, 3 and 5 look more alike than 3 and 4, because their bottom halves have the same shape.
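
To make the regression-versus-classification point concrete, here is a small plain-Python sketch. It assumes a hypothetical image that a model considers equally likely to be a 3 or a 5; the probabilities are invented for illustration, not measured on MNIST, and `expected_squared_error` is a helper written for this example. Under squared error, the best single-number prediction is the mean of the plausible labels, which lands on 4, a digit the image does not resemble. A ten-way output keeps the two candidates separate.

```python
# Hypothetical belief over digits 0-9 for one ambiguous image (made-up numbers).
probs = [0.0, 0.0, 0.0, 0.5, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]

def expected_squared_error(pred, probs):
    # Average squared error of a scalar prediction under the belief above.
    return sum(p * (pred - digit) ** 2 for digit, p in enumerate(probs))

# Regression view: scan scalar outputs in [0, 9] for the lowest expected squared error.
candidates = [i / 100.0 for i in range(0, 901)]
best_scalar = min(candidates, key=lambda c: expected_squared_error(c, probs))

# Classification view: the classes that carry the highest probability.
top_p = max(probs)
top_digits = [d for d, p in enumerate(probs) if p == top_p]

print("best single-output prediction:", best_scalar, "-> rounds to", round(best_scalar))
print("most probable classes:", top_digits)
```

With one output neuron, the network is rewarded for answering "4" on such an image; with ten output neurons (one per digit), it can express that the image is a 3 or a 5 and nothing in between.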