r ggplot2 lm

Linear regression with geom_smooth() from ggplot2 with a categorical variable on the x-axis


I am trying to add a linear regression line to a scatterplot that has an ordinal categorical variable on the x-axis. However, my code draws no line, and it produces no warning or error either. Is it possible to plot a regression line over a categorical x-axis? Perhaps I have to specify the order of the levels of group? Setting the levels explicitly did not help.

# libraries
library(ggplot2)

# dummy data
dat <- data.frame(group = as.factor(c(rep("A", 10), rep("B", 10))),
                  variable = c(rnorm(10, mean = 3), rnorm(10, mean = 12)))

# specify order of levels explicitly
dat$group <- factor(dat$group, levels = c("A", "B"))

# plot
ggplot(dat, aes(x = group, y = variable)) +
  geom_point() +
  geom_smooth(method = "lm")



Solution

  • geom_smooth() fits a smooth over a continuous variable. By default, ggplot2 groups the data by every discrete variable in the plot, so with a factor on the x-axis it tries to fit a separate smooth for each level. Each level then contains only a single x value, which is not enough to fit a line, so nothing is drawn. You can override this default grouping by specifying geom_smooth(aes(group = 1)), which puts all points into one group and fits a single regression across the levels.

    ggplot(dat, aes(x = group, y = variable)) +
      geom_point() +
      geom_smooth(aes(group = 1), method = "lm")
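
As a rough check of what the single-group fit amounts to, here is a minimal sketch in base R. It assumes that ggplot2 places the factor levels at integer positions 1, 2, ... on a discrete x scale, so the line corresponds to an ordinary lm() over those positions. The seed and the group_pos column are illustrative additions, not part of the original code.

```r
# Reproducible version of the dummy data (the seed is an added assumption)
set.seed(1)
dat <- data.frame(
  group = factor(rep(c("A", "B"), each = 10), levels = c("A", "B")),
  variable = c(rnorm(10, mean = 3), rnorm(10, mean = 12))
)

# Integer position of each level: A -> 1, B -> 2 (hypothetical helper column)
dat$group_pos <- as.integer(dat$group)

# Straight-line fit over those integer positions
fit <- lm(variable ~ group_pos, data = dat)
coef(fit)

# With only two levels, the slope equals the difference between group means
group_means <- tapply(dat$variable, dat$group, mean)
group_means[["B"]] - group_means[["A"]]
```

With two levels the line simply connects the two group means. With more levels, the fit treats them as equally spaced, which is only a reasonable reading if the ordinal variable can be interpreted that way.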
    