ricc

Illustrating different ICC values


I'd like to illustrate ICC. I'd love to create a plot where one can see, that with a higher ICC more of the variability in a dependent variable can be attributed to the groups.

I've stumble across this depiction (https://bookdown.org/marklhc/notes_bookdown/hierarchical-multilevel-models.html#multilevel-modeling-mlm):

ICC

And I think it does a great job explaining this fact.

Now I've tried to recreate the spirit of this plot with the following code:

ni <- 30
#numbers of student per class

nj <- 15
#numbers of clusters/classes

ICC <- seq(.1,.9,.1) 
#starting with 0, then going up .1 stepwise till .9, thus generating 10 values

# what is the random intercept standard deviation given icc
RI_sd <- sqrt(1 / (1-ICC) - 1) 
# RI_sd

# -------------------------------------------------------------------------

  dgp_grid <- expand.grid(
    ni = 1:ni,
    nj = 1:nj,
    x = NA,
    y = NA, # with two predictor variables
    RI_sd = 0
  )
  
  dgp_grid$U0j <- rep(rnorm(nj, mean = 0, sd = RI_sd), each = ni)
  #create level 2 residual (ICC)
  
  dgp_grid$x <- rnorm(ni*nj,mean = 0, sd = 1)
  # create level 1 x explanatory/predictor variable (draw from standard normal)
  
  dgp_grid$y <-
    dgp_grid$x +
    dgp_grid$U0j 

# -------------------------------------------------------------------------
  library(ggplot2)

  # Define the desired range for the y-axis
  x_axis_limit <- c(-6, 6)  
  
  # Plot the data with fixed y-axis range
  ggplot(dgp_grid, aes(x = y, y = nj)) +
    geom_jitter(aes(color = factor(nj)), size = 3, width = 0.2) +
    geom_boxplot(aes(group = nj), alpha = 0.5, color = "black", width = 0.2) +
    theme_minimal() +
    coord_flip() +
    xlim(x_axis_limit)

I basically create some clusters nj (one could think of them as classes in, e.g., a school) and then try to weave in a residual term for the ICC. No If I want to have no ICC I would set RI_sd = 0, and if I want a very big ICC I would set RI_sd = 3.

This is what the plot looks like with RI_sd = 0: enter image description here

And this is waht it looks like with RI_sd = 3: enter image description here

I have two questions.

  1. Is my implementation of ICC correct? and
  2. can I make the effect of the ICC more profound so that is more like to the first picture?

Solution

  • I think it would be best to write a function that takes ICC and outputs a plot similar in style to the linked example. We can even get it to take a vector of ICC values and output a panel for each value.

    show_icc <- function(icc = 0.5) {
      
      # These could be function parameters rather than constants if preferred.
      group_mean    <- 5  # This is arbitrary as there are no numbers on y axis
      group_sd      <- 1  # This is also arbitrary
      n_groups      <- 10 # Number of groups along x axis
      obs_per_group <- 10 # Number of points per group
      
      # This gives us a vector of means, one for each group
      mu <- rnorm(n_groups, group_mean, group_sd)
      
      dat <- lapply(icc, function(icc) {
        # Calculate sd within group from icc then generate group samples
        sigma <- group_sd^2/icc - group_sd^2
        replicate(obs_per_group, rnorm(n_groups, mu, sigma)) |>
        t() |>
        as.data.frame() |>
        stack() |>
        setNames(c('value', 'group')) |>
        within(icc <- paste('ICC = ', icc))
      })
      
      do.call('rbind', dat) |>
        ggplot(aes(group, value)) +
        geom_point(color = 'gray') +
        stat_summary(fun = mean, geom = 'point', 
                     shape = 24, fill = 'red', size = 4) +
        facet_wrap(.~icc, scales = 'free_y') +
        theme_classic() +
        theme(strip.background = element_blank(),
              strip.text = element_text(size = 14, face = 2),
              axis.text.y = element_blank())
    }
    

    This allows

    show_icc(c(0.1, 0.7, 0.9))
    

    enter image description here