I want to make a single-neuron function like w1*x1 + w2*x2 + w3*x3 + b1. My training input is:
[1, 0, 0],
[0, 1, 0],
[0, 0, 1],
[1, 1, 0],
[0, 1, 1],
[1, 1, 1],
[2, 0, 0]
And the training output is:
[1,2,0,1,0,2,3]
I tried to write code with one-hot encoding but I failed. I am new to AI coding and I don't want to use any AI libraries such as PyTorch, TensorFlow, or scikit-learn. This problem has been troubling me for 2 weeks and I have tried different versions of the code, none of which worked. Here is an example which I know is wrong, but it may give you some insight.
import numpy as np
import pandas as pd

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

training_inputs = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
    [2, 0, 0]
])
training_outputs = np.array([[1, 2, 0, 1, 0, 2, 3]]).T

np.random.seed(1)
synaptic_weights = 2 * np.random.random((3, 1)) - 1
print('Random starting synaptic weights: ')
print(synaptic_weights)

for iteration in range(2000):
    input_layer = training_inputs
    outputs = sigmoid(np.dot(input_layer, synaptic_weights))
    error = training_outputs - outputs
    adjustments = error * sigmoid_derivative(outputs)
    synaptic_weights += np.dot(input_layer.T, adjustments)

print('Synaptic Weights After Training: ')
print(synaptic_weights)
print('Outputs after training: ')
print(outputs)
I want it to output results like [0, 1, 0, 0], meaning 1 as one-hot encoded, but I don't know how to do that.
We can assume this is a classification problem, seeing that you're using sigmoid.
However, you've got 4 classes, so we'll need to use its multiclass generalization, the softmax function.
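(If you'd rather stay NumPy-only, since you mentioned avoiding extra libraries, softmax is easy to write by hand; this is a minimal sketch equivalent to the scipy.special.softmax call used further down.)

import numpy as np

def softmax(z, axis=1):
    # Subtract the row-wise max before exponentiating, for numerical stability
    z = z - np.max(z, axis=axis, keepdims=True)
    exp_z = np.exp(z)
    # Normalize so each row sums to 1, i.e. a probability distribution over the classes
    return exp_z / np.sum(exp_z, axis=axis, keepdims=True)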
As you properly noticed, we'll need to use one-hot encoding for the target: your predictions should be the same shape as training_outputs. It can be done trivially using np.eye().
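For example, indexing a 4x4 identity matrix with the label array produces the one-hot rows:

import numpy as np

labels = np.array([1, 2, 0, 1, 0, 2, 3])
one_hot = np.eye(4)[labels]  # shape (7, 4); row i has a single 1 in column labels[i]
print(one_hot[0])            # [0. 1. 0. 0.] -> class 1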
Next, you mention b in the linear equation, so you'll have to add a bias column (all ones) to the training_inputs.
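Concretely, appending a column of ones lets the last row of the weight matrix act as the bias term b; a quick sketch of the same np.hstack call used in the full code below:

import numpy as np

X = np.array([[1, 0, 0],
              [0, 1, 0]])
X_b = np.hstack([X, np.ones((X.shape[0], 1))])  # shape (2, 4): three features plus a bias column of ones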
As a result, the weights will become a 4x4 matrix (4 features including bias and 4 classes).
Finally, the learning rate may need to be reduced a bit.
No additional derivatives need to be calculated: the derivative of the log loss w.r.t. the weights is still proportional to Xᵀ · error (which is what np.dot(input_layer.T, error) computes).
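For reference, with $P = \mathrm{softmax}(XW)$ and one-hot targets $Y$, the gradient of the cross-entropy loss satisfies $\nabla_W \mathcal{L} \propto X^\top (P - Y)$, so the update step is $W \leftarrow W + \eta\, X^\top (Y - P)$, which is exactly the line in the loop below with error = Y − P and learning rate η.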
This could probably be improved further if we scaled the training data.
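For instance, z-score standardization of the three feature columns before appending the bias column is one possible option; a sketch, assuming that choice of scaling:

import numpy as np

X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0],
              [0, 1, 1], [1, 1, 1], [2, 0, 0]], dtype=float)
# Standardize each feature column to zero mean and unit variance
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
# Then append the bias column of ones exactly as before
X_scaled = np.hstack([X_scaled, np.ones((X_scaled.shape[0], 1))])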
It's worth noting that multiclass classification is not quite one neuron, but rather one neuron per class (one column of the weight matrix per class).
import numpy as np
import pandas as pd
from scipy.special import softmax

training_inputs = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
    [2, 0, 0]
])
# Adding bias
training_inputs = np.hstack([training_inputs, np.ones((training_inputs.shape[0], 1))])

training_outputs = np.array([1, 2, 0, 1, 0, 2, 3])
# One-hot encoding
training_outputs = np.eye(4)[training_outputs]

np.random.seed(1)
synaptic_weights = 2 * np.random.random((4, 4)) - 1
print('Random starting synaptic weights: ')
print(synaptic_weights)

for iteration in range(2000):
    input_layer = training_inputs
    # Softmax here
    outputs = softmax(np.dot(input_layer, synaptic_weights), axis=1)
    error = training_outputs - outputs
    synaptic_weights += np.dot(input_layer.T, error) * 0.1  # learning rate

print('Synaptic Weights After Training: ')
print(synaptic_weights)
print('Outputs after training (one-hot): ')
print(np.round(outputs))
print('Outputs after training: ')
print(np.argmax(outputs, axis=1))
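To classify a new sample with the trained weights, apply the same preprocessing (append the bias one) and take the argmax of the softmax output. A minimal sketch, assuming it runs right after the training loop above (the sample itself is made up):

# Hypothetical new sample with the same three raw features as the training rows
new_sample = np.array([[1, 0, 1]])
# Append the bias column exactly as was done for training_inputs
new_sample = np.hstack([new_sample, np.ones((new_sample.shape[0], 1))])
probs = softmax(np.dot(new_sample, synaptic_weights), axis=1)
print('Predicted class:', np.argmax(probs, axis=1))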