I have two vectors of simulated data as follows:
x = rnorm(1000, mean = 0, sd = 1)
eps = rnorm(1000, mean = 0, sd = sqrt(0.25))
I am trying to use glm and the boot library's cv.glm function to fit a simple linear regression model and a multiple linear regression model, with either leave-one-out or k-fold cross-validation. The code I am using and the error I get are as follows:
> glm.fit=glm(y~x)
> cv.err=cv.glm(x, glm.fit)
Error in if ((K > n) || (K <= 1)) stop("'K' outside allowable range") :
missing value where TRUE/FALSE needed
I checked with is.na(x) and confirmed that there are no missing values. Could anyone suggest a solution or point out what I am doing wrong?
Thanks in advance.
glm() can find x and y in the calling environment, but cv.glm() does not work from those objects directly: it refits the model on subsets of its first argument, so that argument has to be a data frame containing x and y, not the bare vector x. Maybe check this post or this book chapter.
If I run your code I get the same error:
library(boot)
set.seed(111)
x = rnorm(1000, mean = 0, sd = 1)
y = rnorm(1000, mean = 0, sd = sqrt(0.25))
glm.fit=glm(y~x)
cv.err=cv.glm(x, glm.fit)
Error in if ((K > n) || (K <= 1)) stop("'K' outside allowable range") :
missing value where TRUE/FALSE needed
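A quick check that may help explain the message: as far as I can tell, cv.glm() takes the number of observations from nrow() of its first argument and uses it as the default K, and nrow() of a plain vector is NULL, so the comparison on K has nothing to evaluate. A minimal sketch in base R (the cv.glm() internals are my reading, not something stated above):

```r
x = rnorm(1000, mean = 0, sd = 1)

# A plain numeric vector has no dim attribute, so nrow() returns NULL
n_vec = nrow(x)
is.null(n_vec)   # TRUE

# Wrapped in a data frame, the same values have a row count
n_df = nrow(data.frame(x = x))
n_df             # 1000
```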
If I put them into a data.frame it will work:
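da = data.frame(x = x, y = y)
# passing data = da keeps the fit and the cross-validation on the same data frame
glm.fit = glm(y ~ x, data = da)
cv.err = cv.glm(da, glm.fit, K = 5)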
cv.err$delta
[1] 0.2428287 0.2426424
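The question also asks about leave-one-out cross-validation and multiple regression. Here is a minimal sketch of both, assuming a second predictor x2 that I made up for illustration (it is not in the original data). Leaving K at its default gives leave-one-out; delta holds the raw cross-validation estimate of prediction error followed by a bias-adjusted one.

```r
library(boot)
set.seed(111)

# x2 is a hypothetical second predictor, added only for this example
da = data.frame(x = rnorm(1000), x2 = rnorm(1000))
da$y = rnorm(1000, mean = 0, sd = sqrt(0.25))

# Simple and multiple linear regression, both fit from the data frame
fit1 = glm(y ~ x, data = da)
fit2 = glm(y ~ x + x2, data = da)

# Leave-one-out: K is left at its default, the number of rows
loocv = cv.glm(da, fit1)

# 10-fold cross-validation for the multiple regression
kfold = cv.glm(da, fit2, K = 10)

loocv$delta   # raw and adjusted CV error estimates
kfold$delta
```

Leave-one-out refits the model once per row, so on larger data sets the k-fold version is usually much quicker.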