rggplot2nestedfacet-wrapggh4x

facet_nest_wrap display only on top level row and split in two column


I'm working on a very basic facet plot where I'm using ggh4x to attain a nested faceting scheme as per the figure below: test

I have two main issues with this figure, I wish to organize it in two columns showing hap1 and hap2 as well as having the top-level facet shown only once instead of having it repeated for every row all that without having to change the structure/shape of the nested barplots.

This is the code I'm using

library(readr)
library(ggh4x)
library(ggplot2)
library(RColorBrewer)

df_joint <- readr::read_tsv("path/to/samples_model3p_nest.tsv")
df_joint$strandness <- factor(df_joint$strandness, levels=c('+', '-'))

### Personalized stripes
color_strips <- strip_nested(
  background_x = elem_list_rect(fill=c(brewer.pal(12, "Set3")[c(7, 7, 7, 7, 6, 6, 6, 6, 9, 9, 9, 9, 
                                                               9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 
                                                               9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 
                                                               9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 
                                                               9, 9, 9, 9, 9, 9)])),
  text_x = elem_list_text(face=c("bold", "bold", "bold", "bold", "bold", "bold", "bold", "bold", 
                                 "plain", "plain", "plain", "plain", "plain", "plain", "plain", 
                                 "plain", "plain", "plain", "plain", "plain", "plain", "plain", 
                                 "plain", "plain", "plain", "plain", "plain", "plain", "plain", 
                                 "plain", "plain", "plain", "plain", "plain", "plain", "plain", 
                                 "plain", "plain", "plain", "plain", "plain", "plain", "plain", 
                                 "plain", "plain", "plain", "plain", "plain", "plain", "plain",
                                 "plain", "plain", "plain", "plain")),
  by_layer_x = FALSE
)
###NESTED FACET
both <- ggplot(df_joint, aes(as.factor(kpattern), fill=strandness)) +
  geom_bar() + theme_bw() + theme(axis.text.x=element_blank(), 
                                  axis.ticks.x=element_blank(),
                                  panel.grid.major=element_blank(),
                                  plot.title=element_text(face='bold', hjust=.5),
                                  legend.title=element_text(face='italic'), legend.position.inside=c(0.8,0.05)) +
  guides(fill=guide_legend(ncol=2, keywidth=1, position="inside")) +
  labs(x=expression(italic("k")*"-"*patterns~distribution)) +
  scale_fill_manual(values=c("+"="red", "-"="blue")) +
  coord_cartesian(ylim=c(0,125), expand=FALSE)
both + facet_nested_wrap(~haplotype + id, strip=color_strips)

dput can be challenging because the dataset is quite big, and every sample has many repeats. However, I'll add what I can if strictly necessary, thanks in advance!


Solution

  • Instead of ggh4x (which is an excellent package), we can split the data by hap#, use ggtext to fashion a title similar to a strip, and patchwork to combine them.

    Edited: (1) adjust strandedness to be A/B, now y is the column height; (2) bring the legend into the inside with some elbow-grease.

    Sample data:

    set.seed(42)
    dat <- expand.grid(type = c("hap1", "hap2"), subtype = c("20080", 200081, 200082, 200084, 200085, 200086, 200087, 200100, 200101, 200102, 200104, 200106, "NA12877", "NA12878", "NA12879", "NA12881", "NA12881", "NA12885", "NA12886", "NA12889", "NA12890", "NA12891", "NA12892"))
    dat <- transform(dat, kpattern = runif(nrow(dat)), y = 125*runif(nrow(dat)), strandedness = sample(c("A", "B"), nrow(dat), replace = TRUE))
    head(dat)
    #   type subtype  kpattern         y strandedness
    # 1 hap1   20080 0.9148060 110.96936            A
    # 2 hap2   20080 0.9370754  79.99735            B
    # 3 hap1  200081 0.2861395 121.37083            B
    # 4 hap2  200081 0.8304476  77.35478            B
    # 5 hap1  200082 0.6417455  41.67840            A
    # 6 hap2  200082 0.5190959  43.34353            B
    

    Plot code:

    library(patchwork)
    library(ggtext)
    strips <- c(hap1 = "#33aa33", hap2 = "#FFA500")
    design <- c(
      area(l=1, r=10, t=1, b=10),
      area(l=11, r=20, t=1, b=10),
      area(l=20, r=20, t=10, b=10)
    )
    split(dat, ~ type) |>
      lapply(function(X) {
        ggplot(X, aes(kpattern, y)) +
          facet_wrap(~ subtype) +
          geom_bar(aes(fill = strandedness), stat = "identity") +
          theme_bw() +
          theme(plot.title = element_textbox(color = "white", fill = strips[X$type[1]],
                                             halign = 0.5, width = unit(1, "npc"),
                                             padding = margin(2, 0, 1, 0))) +
          labs(
            title = X$type[1],
            x = expression(italic("k") * "-" * patterns ~ distribution)
          ) +
          scale_x_continuous(guide = "none")
      }) |>
      c(list(guide_area())) |>
      wrap_plots() +
      plot_layout(guides = "collect", axes = "collect", design = design)
    

    two-facet plot mimicking how facet-wrap handles two-level wrapping

    The use of guide_area() was inspired by Manually position legend in Patchwork, and pushing it inside the plots is handled with design= and the three area(..) (the third is for guide_area()). I'm not entirely sure offhand how to shift it around (perhaps a little left, for instance).