I have the following data table, which I have visualized using ggplot2. At y = 1, what are the x-axis values at which the regression line and the lower boundary of the standard error (se) band cross that horizontal line?
library(tidyverse)
set.seed(114)
tableInput <- tibble(
  x = sample(60:270, 100, replace = TRUE) %>% sort(decreasing = FALSE),
  y = rnorm(100, 1, 1) %>% abs()
)
ggplot(tableInput, aes(x, y)) +
  geom_smooth(method = lm) +
  geom_hline(yintercept = 1) +
  scale_x_continuous(n.breaks = 10) +
  coord_cartesian(ylim = c(.5, 2), xlim = c(50, 280)) +
  theme_classic()
A solution in base R is to use optimize to find the value of x that gives the desired value of y.
We start by writing a function that will be at a minimum when the regression line is at y = 1.
f1 <- function(val) {
  mod <- lm(y ~ x, tableInput)
  pred <- predict(mod, newdata = data.frame(x = val), se.fit = TRUE)
  # squared distance between the fitted value and the target y = 1
  (1 - pred$fit)^2
}
We can do the same for the lower border of the SE ribbon, taken here as the fitted value minus about 1.96 standard errors (a normal approximation to the 95% band):
f2 <- function(val) {
  mod <- lm(y ~ x, tableInput)
  pred <- predict(mod, newdata = data.frame(x = val), se.fit = TRUE)
  # qnorm(0.025) is about -1.96, so this is the lower 95% limit under a normal approximation
  (1 - (pred$fit + qnorm(0.025) * pred$se.fit))^2
}
Now we can use optimize to automatically find the correct x values:
x_regress <- optimize(f1, c(50, 275))$minimum
x_lower <- optimize(f2, c(50, 275))$minimum
x_regress
#> [1] 208.0677
x_lower
#> [1] 157.8854
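As a cross-check on x_regress, the regression line is straight, so its crossing with y = 1 can also be solved directly from the coefficients rather than searched for. The sketch below is only an illustration: the name x_closed is made up here, and the data are rebuilt in base R with the same seed and draw order so the block runs on its own. The result should agree with the optimize value to within the optimizer's tolerance.

```r
# Rebuild the data in base R (same seed and same draw order as the tibble above)
set.seed(114)
x <- sort(sample(60:270, 100, replace = TRUE))
y <- abs(rnorm(100, 1, 1))
tableInput <- data.frame(x = x, y = y)

mod <- lm(y ~ x, data = tableInput)
b <- coef(mod)  # b[1] is the intercept, b[2] is the slope

# Solve 1 = b[1] + b[2] * x for x (x_closed is a hypothetical name)
x_closed <- unname((1 - b[1]) / b[2])
x_closed
```

This only works for the line itself; the ribbon boundary is curved, so a numerical search is still the simpler route there.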
And we can confirm these are correct by adding them as plot annotations:
ggplot(tableInput, aes(x, y)) +
  geom_smooth(method = lm) +
  geom_hline(yintercept = 1) +
  scale_x_continuous(n.breaks = 10) +
  annotate('point', x = c(x_regress, x_lower), y = 1, col = 'red') +
  coord_cartesian(ylim = c(.5, 2), xlim = c(50, 280)) +
  theme_classic()
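One caveat on the lower bound: as far as I know, geom_smooth(method = lm) draws its ribbon from predict() with interval = "confidence", which uses a t quantile rather than qnorm. With 98 residual degrees of freedom the t quantile is about 1.98 instead of 1.96, so the drawn boundary sits very slightly below the qnorm-based one and the crossing shifts a little. If an exact match to the ribbon matters, a variant along these lines may help. It is a sketch under that assumption: lower_minus_1 and x_lower_t are names invented here, and uniroot is swapped in for optimize because the problem is a root-finding one.

```r
# Rebuild the data in base R (same seed and same draw order as the tibble above)
set.seed(114)
x <- sort(sample(60:270, 100, replace = TRUE))
y <- abs(rnorm(100, 1, 1))
tableInput <- data.frame(x = x, y = y)

# Fit once, outside the objective, instead of refitting on every call
mod <- lm(y ~ x, data = tableInput)

# Lower limit of the t-based 95% confidence band at x = val, minus the target y = 1
# (lower_minus_1 is a hypothetical helper name)
lower_minus_1 <- function(val) {
  ci <- predict(mod, newdata = data.frame(x = val),
                interval = "confidence", level = 0.95)
  unname(ci[, "lwr"]) - 1
}

# uniroot needs the function to change sign between the two interval ends
x_lower_t <- uniroot(lower_minus_1, c(50, 275), tol = 1e-8)$root
x_lower_t
```

Unlike optimize on a squared distance, uniroot stops with an error if the band never crosses y = 1 inside the interval, which makes a missing crossing easier to spot.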