rlinear-regressionbootstrapping

Error in eval(predvars, data, env) when doing Bootstrapping


I am trying to fit a linear regression mode to examine the impact of COVID on traffic volume. Here is my dataset that have 31 observations which I have normalised using min-max scaling.

structure(list(Date = c("Jul-19", "Aug-19", "Sep-19", "Oct-19", 
"Nov-19", "Dec-19", "Jan-20", "Feb-20", "Mar-20", "Apr-20", "May-20", 
"Jun-20", "Jul-20", "Aug-20", "Sep-20", "Oct-20", "Nov-20", "Dec-20", 
"Jan-21", "Feb-21", "Mar-21", "Apr-21", "May-21", "Jun-21", "Jul-21", 
"Aug-21", "Sep-21", "Oct-21", "Nov-21", "Dec-21", "Jan-22"), 
    Average.Weekday.Daily.Traffic..Veh.day. = c(0.888327194565519, 
    0.957155270369628, 0.895880165623604, 0.942066917846049, 
    1, 0.877361842545003, 0.703731456172012, 0.898082382615854, 
    0.607783746512863, 0, 0.317277553217312, 0.560830088053501, 
    0.706464878860476, 0.747561202987416, 0.737478848514911, 
    0.800118905645907, 0.857496156674019, 0.809655982748128, 
    0.708357519023144, 0.926517717996616, 0.895669090512526, 
    0.856018630896471, 0.925733222167108, 0.825782121234508, 
    0.798338838875814, 0.533903939716948, 0.819263418220707, 
    0.84537340946109, 0.792752384269276, 0.64987564158039, 0.365423786054267
    ), Case = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0.00543532109442455, 
    0.00036093929142663, 5.66179280669224e-05, 0, 0.000134467579158941, 
    0.000339707568401534, 0.000127390338150575, 0.000113235856133845, 
    0, 0.000169853784200767, 0.000106158615125479, 0.00106158615125479, 
    0.0015286840578069, 0, 0, 0.00110404959730499, 0.00312814052569746, 
    0, 0.000268935158317881, 0, 0.00153576129881527, 1), Regulation = c(0, 
    0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0.666666666666667, 0.333333333333333, 
    0.333333333333333, 0.166666666666667, 0.166666666666667, 
    0.166666666666667, 0.166666666666667, 0.333333333333333, 
    0.166666666666667, 0.333333333333333, 0.166666666666667, 
    0.166666666666667, 0.333333333333333, 0.166666666666667, 
    0.166666666666667, 0.166666666666667, 0.166666666666667, 
    0, 0, 0)), row.names = c(NA, -31L), class = "data.frame")

My model and output:

m1 <- lm(Average.Weekday.Daily.Traffic..Veh.day. ~ Case + Regulation, data = data)
summary(m1)

Call:
lm(formula = Average.Weekday.Daily.Traffic..Veh.day. ~ Case + 
    Regulation, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.35325 -0.00991  0.00712  0.05537  0.25166 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.89096    0.03061  29.104  < 2e-16 ***
Case        -0.52838    0.13368  -3.952 0.000477 ***
Regulation  -0.53484    0.08077  -6.622 3.49e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1301 on 28 degrees of freedom
Multiple R-squared:  0.6548,    Adjusted R-squared:  0.6301 
F-statistic: 26.55 on 2 and 28 DF,  p-value: 3.413e-07

Although the independent variables are significant but I found that errors are not normally distributed using Jarque-Bera test. I tried to fix it by doing Bootstrapping with the following codes:

set.seed(4321)
m2 <- Boot(m1, R = 100, method = "residual")
summary(m2)

But I received an error message even though I have included Case variable in my linear model: Error in eval(predvars, data, env) : object 'Case' not found

Could anyone please help explain for me why is it giving this error and how to fix it? Thank you so much for your help.


Solution

  • Your problem is due to a name clash between the name of your data frame and the function data, due to the way Boot and boot use non-standard evaluation. This is why it is best to avoid such names for your own objects. All you need to do is change data to mydata and the problem disappears.

    library(car)
    #> Loading required package: carData
    
    mydata <- data
    m1 <- lm(Average.Weekday.Daily.Traffic..Veh.day. ~ Case + Regulation, data = mydata)
    
    set.seed(4321)
    m2 <- Boot(m1, R = 100, method = "residual")
    #> Loading required namespace: boot
    summary(m2)
    #> 
    #> Number of bootstrap replications R = 100 
    #>             original    bootBias   bootSE  bootMed
    #> (Intercept)  0.89096  0.00248002 0.038248  0.89308
    #> Case        -0.52838  0.00555811 0.187270 -0.51458
    #> Regulation  -0.53484 -0.00088829 0.099065 -0.53900
    

    Created on 2022-04-03 by the reprex package (v2.0.1)