I'm trying to replicate the results from An Introduction to Statistical Learning with Applications in R, specifically the lab in Section 6.5.3. I have followed the code in the lab exactly:
library("ISLR")
library("leaps")
set.seed(1)
train = sample(c(TRUE, FALSE), nrow(Hitters), rep = TRUE)
test = (!train)
regfit.best = regsubsets(Salary ~., data = Hitters[train,], nvmax= 19)
test.mat = model.matrix(Salary~., data = Hitters[test,])
val.errors = rep(NA, 19)
for (i in 1:19) {
  coefi = coef(regfit.best, id = i)
  pred = test.mat[, names(coefi)] %*% coefi
  val.errors[i] = mean((Hitters$Salary[test] - pred)^2)
}
When I run this, I get the following warning and error:
Warning message:
In Hitters$Salary[test] - pred :
longer object length is not a multiple of shorter object length
Error in mean((Hitters$Salary[test] - pred)^2) :
error in evaluating the argument 'x' in selecting a method for function 'mean': Error: dims [product 121] do not match the length of object [148]
And val.errors is a vector of 19 NAs.
I'm still relatively new to R and to the validation approach, so I'm not sure exactly why the dimensions of these are different.
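Update: the problem turned out to be that I had not carried over a step from the previous subsection of the lab, which removes incomplete entries (rows with missing values) from Hitters. Without that step, Hitters$Salary[test] still includes the entries with a missing Salary (148 values), whereas model.matrix() drops those rows, so pred has only 121 rows. That is the mismatch the error reports.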
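For anyone hitting the same error, here is a minimal sketch of the mechanism using a small made-up data frame (df, y and x are hypothetical names, not from the lab). It assumes R's default na.action setting (na.omit), under which model.matrix() silently drops rows with a missing response, while indexing the original column keeps them:

```r
# Toy data (hypothetical): two missing responses, like the NAs in Hitters$Salary
df <- data.frame(y = c(1, NA, 3, NA, 5), x = c(10, 20, 30, 40, 50))

# model.matrix() builds a model frame first; with the default na.action
# (na.omit), rows containing a missing value are dropped
mm <- model.matrix(y ~ ., data = df)
n_design <- nrow(mm)        # rows that survive in the design matrix
n_response <- length(df$y)  # entries in the untouched response vector

# Removing incomplete rows up front keeps the two objects aligned
df_clean <- na.omit(df)
mm_clean <- model.matrix(y ~ ., data = df_clean)
n_design_clean <- nrow(mm_clean)
n_response_clean <- length(df_clean$y)
```

In the lab code, the equivalent fix would presumably be to run Hitters = na.omit(Hitters) before creating train and test, so that Hitters$Salary[test] and test.mat describe the same rows.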