
Get ggpairs to show plots of all predictors against a single response


I am trying to figure out a way to use ggpairs to hack my way to simpler exploratory data analysis. My real data set has ~50 predictors and a single response, so I can't just do a standard scatterplot matrix with ggpairs. Thanks to this question, I have figured out how to plot only the top row of my ggpairs plot (minus the diagonal density plot of the response):

library(ggplot2)
library(GGally)
data(mtcars)
mtcars$vs <- as.factor(mtcars$vs)
mtcars$am <- as.factor(mtcars$am)
mtcars$gear <- as.factor(mtcars$gear)
mtcars$carb <- as.factor(mtcars$carb)
mtcars$cyl <- as.factor(mtcars$cyl)

primary_var <- "mpg"
pairs <- ggpairs(mtcars, columns = c(1:11), 
                 upper = list(continuous = "points", combo = "box_no_facet", discrete = "count", na = "na"),
                 lower = list(continuous = "cor", combo = "facethist", discrete = "facetbar", na = "na"))
pvar_pos <- match(primary_var, pairs$yAxisLabels)
plots <- lapply(2:pairs$ncol, function(j) getPlot(pairs, i = pvar_pos, j = j))
ggmatrix(
  plots,
  nrow = 1,
  ncol = pairs$ncol-1,
  xAxisLabels = pairs$xAxisLabels[-1],
  yAxisLabels = primary_var
)

[Figure: ten-panel plot (1 row, 10 columns) of scatterplots and boxplots showing mpg on the y-axis and the mtcars predictors on the x-axis]

This is quite good, but I'd love to turn it into a 2x5 matrix of plots rather than a 1x10 matrix (scale this up to my ~50 predictors to see why). That seems possible, since ggmatrix has nrow and ncol arguments (with no defaults), but wrapping the panels like this seems incompatible with labeling them with strip titles:

ggmatrix(
  plots,
  nrow = 2,
  ncol = 5,
  xAxisLabels = pairs$xAxisLabels[-1],
  yAxisLabels = primary_var
)
Error in pmg$grobs[[grob_pos]] <- axis_panel : 
  attempt to select more than one element in integerOneIndex

If we remove the xAxisLabels and yAxisLabels arguments, it kind of does what I want, but without labels, and it ignores the proper x-axis scales for the top row:

ggmatrix(
  plots,
  nrow = 2,
  ncol = 5
)

[Figure: ten-panel plot (2 rows, 5 columns) of scatterplots and boxplots with no axis labels]

Is there a way to both wrap the set of figures and retain the labeling of the variables and axes?


Solution

  • One potential solution is to split your list of plots into n rows, build a one-row ggmatrix for each, then 'paste' the rows together using the patchwork package, e.g.

    library(tidyverse)
    library(GGally)
    library(patchwork)
    
    data(mtcars)
    mtcars$vs <- as.factor(mtcars$vs)
    mtcars$am <- as.factor(mtcars$am)
    mtcars$gear <- as.factor(mtcars$gear)
    mtcars$carb <- as.factor(mtcars$carb)
    mtcars$cyl <- as.factor(mtcars$cyl)
    
    primary_var <- "mpg"
    pairs <- ggpairs(mtcars, columns = c(1:11), 
                     upper = list(continuous = "points", combo = "box_no_facet", discrete = "count", na = "na"),
                     lower = list(continuous = "cor", combo = "facethist", discrete = "facetbar", na = "na"))
    pvar_pos <- match(primary_var, pairs$yAxisLabels)
    plots <- lapply(2:pairs$ncol, function(j) getPlot(pairs, i = pvar_pos, j = j))
    
    
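    # Build one single-row ggmatrix per row, then stack the rows with patchwork.
    # Note: this relies on `primary_var` being defined in the calling environment.
    plot_ggpairs_primary_vars <- function(plots, n_rows = NULL, n_cols = NULL) {
      if (is.null(n_rows) || is.null(n_cols)) {
        stop("n_rows and n_cols must be specified")
      }
      # split() recycles 1:n_rows, so plots are dealt out round-robin
      # (with 2 rows: plots 1, 3, 5, ... in row 1 and 2, 4, 6, ... in row 2)
      lst <- split(plots, 1:n_rows)
      output <- list()
      for (i in seq_along(lst)) {
        # Convert each row to a gtable so patchwork can treat it as one element
        output[[i]] <- wrap_elements(ggmatrix_gtable(ggmatrix(
          lst[[i]],
          nrow = 1,
          ncol = n_cols,
          # Recover each panel's x-axis label from the stored plot labels
          xAxisLabels = unlist(map(1:n_cols, ~pluck(lst[[i]], .x, "labels", "x"))),
          yAxisLabels = primary_var
        )))
      }
      wrap_plots(output, nrow = n_rows)
    }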
    
    plot_ggpairs_primary_vars(plots, n_rows = 2, n_cols = 5)
    

    [Figure: 2 by 5 pairs plot]

    plot_ggpairs_primary_vars(plots, n_rows = 5, n_cols = 2)
    

    [Figure: 5 by 2 pairs plot]

    plot_ggpairs_primary_vars(plots, n_rows = 3, n_cols = 5)
    #> Warning in split.default(plots, 1:n_rows): data length is not a multiple of
    #> split variable
    

    [Figure: pairs plot with an uneven number of panels per row]

    Created on 2024-12-20 with reprex v2.1.0
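
One thing to be aware of: `split(plots, 1:n_rows)` recycles the row ids, so the panels are dealt out round-robin rather than filled left to right, and an uneven count produces the warning shown above. If you would rather keep the original column order, a row-major split is one option. The sketch below uses only base R and plain character stand-ins for the plots so that it runs without GGally; `chunk_row_major` and `panel_ids` are hypothetical names introduced here for illustration, not part of any package.

```r
# Hypothetical helper (not part of GGally or patchwork): split a list or
# vector into consecutive chunks of at most `n_cols` items, in row-major order.
chunk_row_major <- function(x, n_cols) {
  stopifnot(is.numeric(n_cols), length(n_cols) == 1, n_cols >= 1)
  # ceiling(1:10 / 5) gives 1 1 1 1 1 2 2 2 2 2, i.e. one group id per row
  row_id <- ceiling(seq_along(x) / n_cols)
  unname(split(x, row_id))
}

# Stand-ins for the ten extracted plots, so the example runs without GGally
panel_ids <- paste0("p", 1:10)

# Round-robin grouping, as produced by split(x, 1:n_rows) in the function above
round_robin <- unname(split(panel_ids, 1:2))

# Row-major grouping: the first five panels land in row 1, the next five in row 2
row_major <- chunk_row_major(panel_ids, n_cols = 5)

print(round_robin[[1]])  # "p1" "p3" "p5" "p7" "p9"
print(row_major[[1]])    # "p1" "p2" "p3" "p4" "p5"

# An uneven count leaves a shorter final row instead of triggering a warning
print(lengths(chunk_row_major(panel_ids, n_cols = 4)))  # 4 4 2
```

In principle the same helper could replace `split(plots, 1:n_rows)` inside the function, and could also chunk `pairs$xAxisLabels[-1]` so that labels stay aligned with their panels. That part is an assumption on my side: a shorter last row would presumably need `ncol = length(lst[[i]])` in its `ggmatrix()` call, and I have not run that variant against GGally.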