rflextable

In R flextable how can I add a break between column headings


I have a contingency table that has nested column headings and I need help adding space between the column headings. There are three histologies and inside each of them are three stages. I would like to have breaks between each of the three groups of stage. I got this far:

Cancer <- read.table("http://users.stat.ufl.edu/~aa/cat/data/Cancer.dat",
                     header = TRUE, stringsAsFactors = TRUE)
library(dplyr)
library(tidyr)
cancerCountWide <- Cancer %>% 
  select(-risktime) %>% 
  pivot_wider(id = time, names_from = c(histology, stage), 
              values_from=count) %>% 
  mutate(blank = " ") %>%
  mutate(blank2 = " ") %>%
  select(time, blank, `1_1`, `2_1`, `3_1`, blank2, everything())


my_header <- data.frame(
  col_keys = c("time", "blank", "1_1", "2_1", "3_1", "blank", "1_2", "2_2", "3_2", "1_3", "2_3","3_3"),
  line2 = c("Follow-up", "Histology", rep("I", 3), "blank", rep("II", 3), rep("III", 3)),
  line3 = c("Follow-up", "Disease Stage", c(1,2,3), "blank", rep(c(1,2,3),2))
)

library(flextable)
flextable(cancerCountWide) %>% 
  set_header_df(
    mapping = my_header,
    key = "col_keys"
  ) %>% 
  theme_booktabs() %>% 
  merge_v(part = "header") %>% 
  merge_h(part = "header") %>% 
  align(align = "center", part = "all") %>% 
  autofit() %>% 
  empty_blanks()

Which gets me close:

enter image description here

Adding in the blank column(s) is not very elegant and the thin line below the stages is not good. I am thinking there has to be a better way to do this....

Can someone please suggest how I can add a bit of padding between the stage groups? I am brand new to flextable. So suggestions for tutorials for learning would also be appreciated.


Solution

  • I would do:

    Cancer <- read.table("http://users.stat.ufl.edu/~aa/cat/data/Cancer.dat",
                         header = TRUE, stringsAsFactors = TRUE)
    library(dplyr)
    library(tidyr)
    library(scales)
    library(flextable)
    
    cancerCountWide <- Cancer %>% 
      select(-risktime) %>% 
      pivot_wider(id_cols = time, names_from = c(histology, stage), 
                  values_from=count) %>%
      mutate(`histo` = " ") %>% 
      select(time, histo, `1_1`, `2_1`, `3_1`, everything())
    
    
    my_header <- data.frame(
      col_keys = c("time", "histo", "blank1", "1_1", "2_1", "3_1", "blank2", "1_2", "2_2", "3_2", "blank3", "1_3", "2_3","3_3"),
      line2 = c("Follow-up", "Histology", "", rep("I", 3), "", rep("II", 3), "", rep("III", 3)),
      line3 = c("Follow-up", "Disease Stage", rep(c("", "1", "2", "3"), 3))
    )
    
    flextable(cancerCountWide, col_keys = my_header$col_keys) %>% 
      set_header_df(
        mapping = my_header,
        key = "col_keys"
      ) %>% 
      theme_booktabs() %>% 
      merge_v(part = "header") %>% 
      merge_h(part = "header") %>% 
      align(align = "center", part = "all") %>% 
      autofit() %>% 
      empty_blanks() %>% 
      fix_border_issues()
    

    enter image description here

    Make sure my_header contains all col_keys descriptions and use col_keys argument to get blank columns (if one of col_keys does not exist in the original dataset, it's a fake/blank column).

    > my_header
       col_keys     line2         line3
    1      time Follow-up     Follow-up
    2     histo Histology Disease Stage
    3    blank1                        
    4       1_1         I             1
    5       2_1         I             2
    6       3_1         I             3
    7    blank2                        
    8       1_2        II             1
    9       2_2        II             2
    10      3_2        II             3
    11   blank3                        
    12      1_3       III             1
    13      2_3       III             2
    14      3_3       III             3
    

    You can find the documentation here: https://davidgohel.github.io/flextable/