Background and What I Have Tried
Suppose I fit a linear model for a response using 2 numeric covariates and 1 categorical covariate with 4 factor levels:
set.seed(1)
n <- 100
Y <- sample(n)
X1 <- sample(n)
X2 <- sample(n)
X3 <- as.factor(rep(c("A", "B", "C", "D"), n/4))
model <- lm(Y ~ X1 + X2 + X3)
If I use the anova function, I get the following result:
> anova(model)
Analysis of Variance Table
Response: Y
          Df Sum Sq Mean Sq F value Pr(>F)
X1         1   1029 1028.85  1.1896 0.2782
X2         1    645  645.41  0.7462 0.3899
X3         3    351  116.87  0.1351 0.9389
Residuals 94  81300  864.89
I can also fit the model with the aov function and pass it to anova to obtain the same result:
model_aov <- aov(Y ~ X1 + X2 + X3)
> anova(model_aov)
Analysis of Variance Table
Response: Y
          Df Sum Sq Mean Sq F value Pr(>F)
X1         1   1029 1028.85  1.1896 0.2782
X2         1    645  645.41  0.7462 0.3899
X3         3    351  116.87  0.1351 0.9389
Residuals 94  81300  864.89
Desired Result
I would like to aggregate all covariates into a single line giving the regression sum of squares (SSR).
Is it possible to obtain an ANOVA table that looks like the following:
Response: Y
             Df Sum Sq Mean Sq F value Pr(>F)
X1 + X2 + X3  5   2025     405       f      p
Residuals    94  81300  864.89
where f and p are calculated within R?
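This can be done by regressing the response on the design matrix of the linear model, so that all covariates enter as a single term:
model <- lm(Y ~ X1 + X2 + X3)
# model.matrix() returns the design matrix of the fitted model (intercept column included)
design_matrix <- model.matrix(model)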
The anova command will now give the following output:
> anova(lm(Y ~ design_matrix))
Analysis of Variance Table
Response: Y
              Df Sum Sq Mean Sq F value Pr(>F)
design_matrix  5   2025  404.98  0.4682  0.799
Residuals     94  81300  864.89
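The single aggregated line is the overall F test of the model, so it can be cross-checked in other ways. The following is a minimal sketch, not the only way to do it: it compares the fitted model against an intercept-only model with anova, and separately sums the covariate rows of the sequential table by hand. The names null_model, overall, tab and aggregated are illustrative and do not appear in the original post, and the exact numbers depend on the R version's sampling algorithm, so none are asserted here.

```r
# Recreate the data and model from the question.
set.seed(1)
n  <- 100
Y  <- sample(n)
X1 <- sample(n)
X2 <- sample(n)
X3 <- as.factor(rep(c("A", "B", "C", "D"), n / 4))
model <- lm(Y ~ X1 + X2 + X3)

# Option 1: compare against an intercept-only model (illustrative name).
# The second row of the result carries the pooled Df, Sum of Sq, F and p-value.
null_model <- lm(Y ~ 1)
overall <- anova(null_model, model)
print(overall)

# Option 2: aggregate the covariate rows of the sequential table by hand.
tab      <- anova(model)
is_resid <- rownames(tab) == "Residuals"
ssr_df   <- sum(tab[["Df"]][!is_resid])        # pooled regression degrees of freedom
ssr      <- sum(tab[["Sum Sq"]][!is_resid])    # pooled regression sum of squares
res_df   <- tab[["Df"]][is_resid]
mse      <- tab[["Mean Sq"]][is_resid]         # residual mean square
f_stat   <- (ssr / ssr_df) / mse
p_val    <- pf(f_stat, ssr_df, res_df, lower.tail = FALSE)

# Assemble a two-row table in the layout asked for in the question.
aggregated <- data.frame(
  "Df"      = c(ssr_df, res_df),
  "Sum Sq"  = c(ssr, tab[["Sum Sq"]][is_resid]),
  "Mean Sq" = c(ssr / ssr_df, mse),
  "F value" = c(f_stat, NA),
  "Pr(>F)"  = c(p_val, NA),
  row.names = c("X1 + X2 + X3", "Residuals"),
  check.names = FALSE
)
print(aggregated)

# The same F statistic is also stored in the model summary.
print(summary(model)$fstatistic)
```

Both options should agree with the design-matrix approach, since all three describe the same test of the full model against the intercept-only model.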