In an attempt to understand neural networks and pybrain, I tried to predict a sinusoidal function in noise using only the time index as input. The premise therefore is that a simple NN structure can mimic y(t) = sin(t).
The design is one linear input layer, one tanh hidden layer, and one linear output layer, with 1, 10, and 1 nodes respectively.
The input (the time variable t) is scaled so that its range is [0, 1]. The target is scaled to either the range [0, 1] or [-1, 1], with different results for each (shown below).
Here is my Python 2.7 code:
#!/usr/bin/python
from __future__ import division
import numpy as np
import pylab as pl
from pybrain.structure import TanhLayer, LinearLayer #SoftmaxLayer, SigmoidLayer
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.structure import FeedForwardNetwork
from pybrain.structure import FullConnection
np.random.seed(0)
pl.close('all')
#create NN structure:
net = FeedForwardNetwork()
inLayer = LinearLayer(1)
hiddenLayer = TanhLayer(10)
outLayer = LinearLayer(1)
#add classes of layers to network, specify IO:
net.addInputModule(inLayer)
net.addModule(hiddenLayer)
net.addOutputModule(outLayer)
#specify how neurons are to be connected:
in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)
#add connections to network:
net.addConnection(in_to_hidden)
net.addConnection(hidden_to_out)
#perform internal initialisation:
net.sortModules()
#construct target signal:
T = 1
Ts = T/10
f = 1/T
fs = 1/Ts
#NN input signal:
t0 = np.arange(0,10*T,Ts)
L = len(t0)
#NN target signal:
x0 = 10*np.cos(2*np.pi*f*t0) + 10 + np.random.randn(L)
#normalise input signal:
t = t0/np.max(t0)
#normalise target signal to fit in range [0,1] (min) or [-1,1] (mean):
dcx = np.min(x0) #np.min(x0) for range [0,1], np.mean(x0) for range [-1,1]
x = x0-dcx
sclf = np.max(np.abs(x))
x /= sclf
#add samples and train NN:
ds = SupervisedDataSet(1, 1)
for c in range(L):
    ds.addSample(t[c], x[c])
trainer = BackpropTrainer(net, ds, learningrate=0.01, momentum=0.1)
for c in range(20):
    e1 = trainer.train()
    print 'Epoch %d Error: %f' % (c, e1)
y=np.zeros(L)
for c in range(L):
    #y[c] = net.activate([x[c]])
    y[c] = net.activate([t[c]])
yout = y*sclf
yout = yout + dcx
fig1 = pl.figure(1)
pl.ion()
fsize=8
pl.subplot(211)
pl.plot(t0,x0,'r.-',label='input')
pl.plot(t0,yout,'bx-',label='predicted')
pl.xlabel('Time',fontsize=fsize)
pl.ylabel('Amplitude',fontsize=fsize)
pl.grid()
pl.legend(loc='lower right',ncol=2,fontsize=fsize)
pl.title('Target range = [0,1]',fontsize=fsize)
fig1name = './sin_min.png'
print 'Saving Fig. 1 to:', fig1name
fig1.savefig(fig1name, bbox_inches='tight')
The output figures are given below.
Although the first figure shows better results, both outputs are unsatisfactory. Am I missing some fundamental neural network principle or is my code defective? I know there are easier statistical methods of estimating the target signal in this case, but the aim is to use a simple NN structure here.
The issue is the range of the input and output values. Standardizing these signals (subtracting the mean and dividing by the standard deviation, rather than min/max scaling) solves the problem.
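As a sketch of what I mean (reusing t0, x0 and the network output y from the question's code; the helper names here are only for illustration), standardize both signals before training and invert the transform on the prediction:

#standardise input and target: zero mean, unit standard deviation
t_mean, t_std = np.mean(t0), np.std(t0)
x_mean, x_std = np.mean(x0), np.std(x0)
t = (t0 - t_mean) / t_std
x = (x0 - x_mean) / x_std
#...build the dataset and train exactly as in the question...
#then map the network output back to the original scale:
yout = y * x_std + x_mean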
This can be explained by considering the sigmoid and tanh activation functions, displayed below.
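For reference, here is a short snippet (plain numpy/pylab, no pybrain needed) that reproduces those two curves; both functions saturate once the magnitude of the input grows beyond a few units, which is why unscaled inputs and targets give poor fits:

import numpy as np
import pylab as pl
z = np.linspace(-5, 5, 500)
pl.plot(z, np.tanh(z), label='tanh')              #output range (-1, 1)
pl.plot(z, 1.0/(1.0 + np.exp(-z)), label='sigmoid') #output range (0, 1)
pl.grid()
pl.legend(loc='lower right')
pl.xlabel('input')
pl.ylabel('output')
pl.show()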
The best scaling will depend on the specific application. Changing the activation function (see this answer) will probably also affect the optimal scaling of the input and output signals.
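For example, a different hidden-layer activation is a one-line change in the network construction from the question (SigmoidLayer is already imported there, commented out); I have not checked which input/output scaling works best with it:

from pybrain.structure import SigmoidLayer
hiddenLayer = SigmoidLayer(10)  #outputs in (0, 1) rather than tanh's (-1, 1)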