I am well aware of how to plot the standard error of a regression using ggplot
. As an example with the iris
dataset, this can be easily done with this code:
library(tidyverse)
iris %>%
ggplot(aes(x=Sepal.Width,
y=Sepal.Length))+
geom_point()+
geom_smooth(method = "lm",
se=T)
I also know that a regression using base R scatterplots can be achieved with this code:
#### Scatterplot ####
plot(iris$Sepal.Width,
iris$Sepal.Length)
#### Fit Regression ####
fit <- lm(iris$Sepal.Length ~ iris$Sepal.Width)
#### Fit Line to Plot ####
abline(fit, col="red")
However, I've tried looking up how to plot standard error in base R scatterplots, but all I have found both on SO and Google is how to do this with error bars. However, I would like to shade the error in a similar way as ggplot
does above. How can one accomplish this?
To manually obtain the standard error of this regression, I believe you would calculate it like so:
#### Derive Standard Error ####
fit <- lm(Sepal.Length ~ Sepal.Width,
iris)
n <- length(iris)
df <- n-2 # degrees of freedom
y.hat <- fitted(fit)
res <- resid(fit)
sq.res <- res^2
ss.res <- sum(sq.res)
se <- sqrt(ss.res/df)
So if this may allow one to fit it into a base R plot, I'm all ears.
Here's a slightly fiddly approach using broom::augment
to generate a dataset with predictions and standard errors. You could also do it in base R with predict
if you don't want to use broom
but that's a couple of extra lines.
Note: I was puzzled as to why the interval in my graph are narrower than your ggplot
interval in the question. But a look at the geom_smooth
documentation suggests that the se=TRUE
option adds a 95% confidence interval rather than +-1 se as you might expect. So its probably better to generate your own intervals rather than letting the graphics package do it!
#### Fit Regression (note use of `data` argument) ####
fit <- lm(data=iris, Sepal.Length ~ Sepal.Width)
#### Generate predictions and se ####
dat <- broom::augment(fit, se_fit = TRUE)
#### Alternative using `predict` instead of broom ####
dat <- cbind(iris,
.fitted=predict(fit, newdata = iris),
.se.fit=predict(fit, newdata = iris, se.fit = TRUE)$se.fit)
#### Now sort the dataset in the x-axis order
dat <- dat[order(dat$`Sepal.Width`),]
#### Plot with predictions and standard errors
with(dat, {
plot(Sepal.Width,Sepal.Length)
polygon(c(Sepal.Width, rev(Sepal.Width)), c(.fitted+.se.fit, rev(.fitted-.se.fit)), border = NA, col = hsv(1,1,1,0.2))
lines(Sepal.Width,.fitted, lwd=2)
lines(Sepal.Width,.fitted+.se.fit, col="red")
lines(Sepal.Width,.fitted-.se.fit, col="red")
})