rggplot2statisticsmixed-models

How to fit "Random Intercept-Fixed Slope" in ggplot2?


I have the following data:

# Create an empty data frame to store the simulated data
data <- data.frame(
  Lot = rep(1:num_lots, each = 9),
  Time = rep(3 * 0:8, times = num_lots),
  Purity = numeric(num_lots * 9)
)

# Simulate purity data for each lot and time point
for (lot in 1:num_lots) {
  # Generate random intercept and slope for each lot
  intercept <- rnorm(1, mean = 95, sd = 2)
  slope <- runif(1, min = -.7, max = 0)
  
  for (month in 0:8) {
    # Simulate purity data with noise
    data[data$Lot == lot &
           data$Time == month * 3, "Purity"] <-
      intercept + slope * month * 3 + rnorm(1, mean = 0, sd = .35)
  }
}

And I want to fit a mixed-effect model, with Random Intercept-Fixed Slope. I am using the following code which is not giving me what I need to get.

ggplot(data, aes(x = Time, y = Purity)) +
  geom_point(aes(color = as.factor(Lot), shape = as.factor(Lot))) +
  geom_smooth(
    method = "lm",
    formula = y ~ x  + (1 | Lot), data = data,
    se = FALSE,
    size = 0.5,
    show.legend = FALSE
  ) +
  labs(
    title = "Random Intercept and Fixed Slope ",
    x = "month",
    y = "Purity",
    color = "Lot",
    shape = "Lot"
  ) +
  theme_minimal() +
  scale_x_continuous(breaks = c(0, 3, 6, 9, 12, 15, 18, 21, 24))

An example of what I looking for looks like this

enter image description here

Question: What am I doing wrong?


Solution

  • There's a couple of things:

    1. You need to define the group in the aes() of the ggplot
    2. the lmer package is the tool for what you're after. It's actually used in the textbook you linked, below the charts.
        library(ggplot2)
        library(lme4)
        num_lots <- 3
        
        # Create an empty data frame to store the simulated data
        data <- data.frame(
          Lot = rep(1:num_lots, each = 9),
          Time = rep(3 * 0:8, times = num_lots),
          Purity = numeric(num_lots * 9)
        )
        
        # Simulate purity data for each lot and time point
        for (lot in 1:num_lots) {
          # Generate random intercept and slope for each lot
          intercept <- rnorm(1, mean = 95, sd = 2)
          slope <- runif(1, min = -.7, max = 0)
          
          for (month in 0:8) {
            # Simulate purity data with noise
            data[data$Lot == lot &
                   data$Time == month * 3, "Purity"] <-
              intercept + slope * month * 3 + rnorm(1, mean = 0, sd = .35)
          }
        }
        
        # Fit mixed-effects model
        model <- lmer(Purity ~ Time + (1 | Lot), data = data)
        
        # Predict values and add them to the dataframe
        data$Predicted <- predict(model)
        
        # Plot
        ggplot(data, aes(x = Time, y = Purity, color = as.factor(Lot), shape = as.factor(Lot))) +
          geom_point() +
          geom_line(aes(y = Predicted, group = Lot), size = 0.5) +
          labs(
            title = "Random Intercept and Fixed Slope",
            x = "Month",
            y = "Purity",
            color = "Lot",
            shape = "Lot"
          ) +
          theme_minimal() +
          scale_x_continuous(breaks = c(0, 3, 6, 9, 12, 15, 18, 21, 24))