I have been trying to implement a simple linear regression model using a neural network in Keras, in hopes of understanding how to work with the Keras library. Unfortunately, I am ending up with a very bad model. Here is the implementation:
from pylab import *
from keras.models import Sequential
from keras.layers import Dense
#Generate dummy data
data = linspace(1,2,100).reshape(-1,1)
y = data*5
#Define the model
def baseline_model():
    model = Sequential()
    model.add(Dense(1, activation = 'linear', input_dim = 1))
    model.compile(optimizer = 'rmsprop', loss = 'mean_squared_error', metrics = ['accuracy'])
    return model
#Use the model
regr = baseline_model()
regr.fit(data, y, epochs = 200, batch_size = 32)
plot(data, regr.predict(data), 'b', data,y, 'k.')
The generated plot is as follows:
Can somebody point out the flaw in the above definition of the model (which could ensure a better fit)?
You should increase the learning rate of the optimizer. The default learning rate of the RMSprop optimizer is 0.001, so the model takes a few hundred epochs to converge to a final solution (you have probably noticed this yourself: the loss value decreases slowly in the training log). To set the learning rate, import the optimizers module:
from keras import optimizers
# ...
model.compile(optimizer=optimizers.RMSprop(lr=0.1), loss='mean_squared_error', metrics=['mae'])
Either 0.01 or 0.1 should work fine. After this modification you may not need to train the model for 200 epochs; even 5, 10 or 20 epochs may be enough.
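To see why the learning rate matters so much here, the same effect can be reproduced without Keras. Below is a minimal NumPy sketch (my own illustration, not Keras internals; `fit_slope` is a hypothetical helper) that fits y = 5x by full-batch gradient descent on the mean squared error, where the step size plays the same role as RMSprop's learning rate:

```python
import numpy as np

def fit_slope(x, y, lr, epochs):
    """Fit y = w*x (no bias, for simplicity) by full-batch
    gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(epochs):
        grad = 2.0 * np.mean((w * x - y) * x)  # dMSE/dw
        w -= lr * grad
    return w

x = np.linspace(1, 2, 100)
y = 5 * x  # true slope is 5

# Small step size: the slope has barely moved after 20 epochs
w_slow = fit_slope(x, y, lr=0.001, epochs=20)
# Larger step size: essentially converged in the same budget
w_fast = fit_slope(x, y, lr=0.1, epochs=20)
```

With `lr=0.001` each epoch shrinks the error in `w` by less than 0.5%, so hundreds of epochs are needed; with `lr=0.1` the error shrinks by roughly half per epoch and 20 epochs suffice.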
Also note that you are performing a regression task (i.e. predicting real numbers), whereas 'accuracy' is used as a metric when you are performing a classification task (i.e. predicting discrete labels, like the category of an image). Therefore, as you can see above, I have replaced it with mae (i.e. mean absolute error), which is also much more interpretable than the value of the loss (mean squared error) used here.
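A quick NumPy sketch (my own illustration, with made-up prediction values) of why MAE is easier to read than MSE: if every prediction is off by 0.3, MAE reports exactly 0.3 in the same units as the target, while MSE reports the squared quantity 0.09.

```python
import numpy as np

y_true = 5 * np.linspace(1, 2, 100)   # targets in [5, 10]
y_pred = y_true + 0.3                 # every prediction off by 0.3

mse = np.mean((y_pred - y_true) ** 2)   # 0.09 -- squared units
mae = np.mean(np.abs(y_pred - y_true))  # 0.3  -- same units as y
```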