r · deep-learning · neural-network · nnet

Tuned nnet RMSE is 10x bigger than the linear model's RMSE on the training data


I am working with the Boston Housing data set. The linear model is straightforward:

library(e1071) # for tune.nnet
library(MASS) # for the Boston housing data set
library(Metrics) # to calculate RMSE
library(nnet)

df <- MASS::Boston
train <- df[1:100, ]
test <- df[101:505, ]
Boston_lm <- lm(medv ~ ., data = train)
lm_rmse <- Metrics::rmse(actual = train$medv, predicted = Boston_lm$fitted.values)
# training RMSE = 2.037201

However, the tuned nnet model returns a training RMSE (computed from the same fitted values) that is more than 10x higher than the linear model's:

Boston_tune_nnet <- e1071::tune.nnet(x = train[, 1:ncol(train)-1], y = train$medv, size = 1)
nnet_tune_rmse <- Metrics::rmse(actual = train$medv, predicted = Boston_tune_nnet$best.model$fitted.values)
# training RMSE = 22.11024
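
Both RMSEs above are computed from the training fitted values. For reference, a test-set check on the linear model would look like this (lm_test_pred and lm_test_rmse are illustrative names, not part of the original code):

# Illustrative sketch: test-set RMSE for the linear model
lm_test_pred <- predict(Boston_lm, newdata = test)
lm_test_rmse <- Metrics::rmse(actual = test$medv, predicted = lm_test_pred)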

What's the correct way to build a tuned neuralnet model in this situation?


Solution

  • You are performing a linear regression, yet nnet by default uses a logistic output activation (linout = FALSE) and, with size = 1, adds a hidden layer with a single unit. A logistic output confines the fitted values to (0, 1), while medv ranges from roughly 5 to 50, which is why the RMSE blows up to ~22.

    Using linout = TRUE (i.e., a linear output) together with size = 0 and skip = TRUE (skip-layer connections from the inputs straight to the output, with no hidden layer) reproduces the linear model, so we should get the same results:

    library(nnet)
    Boston_tune_nnet <- e1071::tune.nnet(x = train[, -ncol(train)],
                                         y = train$medv, size = 0, linout = TRUE,
                                         skip = TRUE)
    (nnet_tune_rmse <- Metrics::rmse(actual = train$medv,
                                     predicted = Boston_tune_nnet$best.model$fitted.values))
    #[1] 2.037201
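
    You can confirm the squeeze described above by refitting with the question's defaults and checking the range of the fitted values (Boston_default is an illustrative name):

    # Illustrative: refit with the default logistic output and size = 1
    Boston_default <- e1071::tune.nnet(x = train[, -ncol(train)],
                                       y = train$medv, size = 1)
    range(Boston_default$best.model$fitted.values)
    # both endpoints fall inside (0, 1), far below medv's 5-50 range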
    

    This matches the linear model's training RMSE, which is what you are looking for.

    Also note that train[, 1:ncol(train) - 1] is incorrect: : binds more tightly than -, so it selects columns 0:13 rather than 1:13 (the zero index is silently dropped, so here it happens to pick the right columns only by accident). You should write train[, 1:(ncol(train) - 1)] or, equivalently, train[, -ncol(train)], as shown in the check below.
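
    A quick console check makes the precedence problem visible:

    ncol(train)           # 14
    1:ncol(train) - 1     # 0 1 2 ... 13  (: is evaluated first, then - 1)
    1:(ncol(train) - 1)   # 1 2 ... 13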