I'm working with the Boston Housing data set. Fitting a linear model is straightforward:
library(e1071) # for tune.nnet
library(MASS) # for the Boston housing data set
library(Metrics) # to calculate RMSE
library(nnet)
df <- MASS::Boston
train <- df[1:100, ]
test <- df[101:nrow(df), ] # all remaining rows (Boston has 506 rows, not 505)
Boston_lm <- lm(medv ~ ., data = train)
lm_rmse <- Metrics::rmse(actual = train$medv, predicted = Boston_lm$fitted.values)
#RMSE = 2.037201
However, the tuned neural network returns an RMSE that is more than 10x higher than the linear model's (both are computed on the training set):
Boston_tune_nnet <- e1071::tune.nnet(x = train[, 1:ncol(train)-1], y = train$medv, size = 1)
nnet_tune_rmse <- Metrics::rmse(actual = train$medv, predicted = Boston_tune_nnet$best.model$fitted.values)
#RMSE = 22.11024
What's the correct way to build a tuned neuralnet model in this situation?
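With lm you are performing a linear regression, but nnet by default uses a logistic output unit (linout = FALSE), and you have added a hidden layer with one unit (size = 1). A logistic output is confined to the interval (0, 1), so the fitted values can never get close to medv, which is why the RMSE is so large.
Setting linout = TRUE gives a linear output unit, i.e. linear regression.
If we also drop the hidden layer (size = 0) and connect the inputs directly to the output (skip = TRUE), we should get the same result as lm: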
library(nnet)
Boston_tune_nnet <- e1071::tune.nnet(x = train[, -ncol(train)],
                                     y = train$medv, size = 0, linout = TRUE,
                                     skip = TRUE)
(nnet_tune_rmse <- Metrics::rmse(actual = train$medv,
                                 predicted = Boston_tune_nnet$best.model$fitted.values))
#[1] 2.037201
This matches the RMSE of the linear model, which is what you are looking for.
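To illustrate why the default settings gave such a large RMSE, here is a minimal sketch that uses only nnet and MASS. The rmse helper is a hypothetical stand-in for Metrics::rmse, and the seed is an arbitrary choice, since nnet starts from random weights. It is meant only to show the effect of linout, not to be a tuned model.

```r
library(MASS)  # Boston housing data
library(nnet)

train <- MASS::Boston[1:100, ]

# Hypothetical helper, same intent as Metrics::rmse
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

set.seed(1)  # arbitrary seed; nnet initialises its weights randomly

# Default linout = FALSE: logistic output unit, fitted values stay within [0, 1]
fit_logistic <- nnet(medv ~ ., data = train, size = 1, trace = FALSE)
range(fit_logistic$fitted.values)

# linout = TRUE: linear output unit, fitted values can reach the scale of medv
fit_linear <- nnet(medv ~ ., data = train, size = 1, linout = TRUE, trace = FALSE)
range(fit_linear$fitted.values)

rmse_logistic <- rmse(train$medv, fit_logistic$fitted.values)
rmse_linear <- rmse(train$medv, fit_linear$fitted.values)
```

Because every medv value is well above 1, the logistic fit cannot do better than predicting roughly 1 for every row, which is where an RMSE of about 22 comes from.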
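Also note that train[, 1:ncol(train) - 1] does not do what it appears to. The : operator binds more tightly than -, so the index is (1:ncol(train)) - 1, i.e. 0 up to ncol(train) - 1. It only happens to work because R silently ignores an index of 0.
You should write train[, 1:(ncol(train) - 1)], or more simply train[, -ncol(train)].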
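The precedence issue can be checked in base R. In this small sketch, n stands in for ncol(train), which is 14 for the Boston data, and x is just an illustrative vector:

```r
n <- 14  # ncol(train) for Boston: 13 predictors plus medv

1:n - 1    # parsed as (1:n) - 1, giving 0, 1, ..., 13
1:(n - 1)  # the intended 1, 2, ..., 13

x <- letters[1:n]  # illustrative vector with n elements

# Index 0 is silently dropped, so both forms select the same elements here
same_by_accident <- identical(x[1:n - 1], x[1:(n - 1)])

# A negative index states the intent directly: everything except element n
same_negative <- identical(x[-n], x[1:(n - 1)])
```

Both comparisons come out TRUE, so the original code selected the right columns, but only by accident.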