r html-table

Using t1flex with duplicate column names generated by table1


I am creating tables with table1 in an R Markdown (.Rmd) document. To adjust the table width and the output format, I am using the flextable package. My tables have multiple stratification levels, as in table1(~x | l1/l2, data = df). However, in some cases the generated column headers are not unique: this happens when a level of l2 has the same number of observations within two or more levels of l1. Here's a minimal reproducible example:

library(table1)
library(flextable)
library(dplyr)
df <- data.frame(
  l1 = rep(c("A", "B"), each = 200),
  l2 = rep(rep(c("x", "y"), each = 100), 2),
  x = rnorm(400)
)
table1(~x | l1/l2, data = df) %>% t1flex() %>% flextable::autofit()

This yields the error

Error in flextable(data) : duplicated col_keys: x(N=100), y(N=100)

which prevents me from using t1flex.

According to a GitHub issue, this should have been fixed in a package update; however, the problem persists for me (and for others, according to later comments on the issue).

Following a reply to this question, I tried converting the table1 object into a data.frame, making the duplicate column names unique by appending "\r", and converting it back to a table1 object. However, I can't recreate the multi-level column headers when generating a table1 from a data.frame directly.

Any of the following would solve my problem:

  1. Ways to ignore duplicate column names in t1flex()
  2. Ways to generate tables that look like multi-level table1() tables after converting to data.frame and manually renaming the column headers
  3. Alternative functions/packages whose output is visually consistent with flextable
  4. Ways to directly and reliably change the html code without messing up the output of the flextable()

At this point I am willing to change the column names in the generated tables manually, so the solution (especially 2.) doesn't have to be general or context-independent. As long as it gets me working tables, I will accept it.


Solution

  • If you look at the attributes and components of the object created by table1(.), there are two places where the repeated column names appear.

    tb <- table1(~x | l1/l2, data = df)
    str(tb)
    #  'table1' chr "<table class=\"Rtable1\">\n<thead>\n<tr>\n<th class=\"grouplabel\"></th>\n<th colspan=\"2\" class=\"grouplabel\"| __truncated__
    #  - attr(*, "html")= logi TRUE
    #  - attr(*, "obj")=List of 11
    #   ..$ contents    :List of 1
    #   .. ..$ : chr [1:3, 1:6] "" "0.159 (1.14)" "0.238 [-2.72, 2.56]" "" ...
    #   .. .. ..- attr(*, "dimnames")=List of 2
    #   .. .. .. ..$ : chr [1:3] "x" "Mean (SD)" "Median [Min, Max]"
    #   .. .. .. ..$ : chr [1:6] "x.A" "y.A" "x.B" "y.B" ...
    #   ..$ headings    : chr [1:2, 1:6] "x" "100" "y" "100" ...
    #   .. ..- attr(*, "dimnames")=List of 2
    #   .. .. ..$ : chr [1:2] "" "strat_n"
    #   .. .. ..$ : chr [1:6] "x.A" "y.A" "x.B" "y.B" ...
    #   ..$ labels      :List of 3
    #   .. ..$ strata   : Named chr [1:6] "x" "y" "x" "y" ...
    #   .. .. ..- attr(*, "names")= chr [1:6] "x.A" "y.A" "x.B" "y.B" ...
    #   .. ..$ variables:List of 1
    #   .. .. ..$ x: chr "x"
    #   .. .. .. ..- attr(*, "html")= chr "<span class='varlabel'>x</span>"
    #   .. ..$ groups   : chr [1:3] "A" "B" "Overall"
    #   ..$ topclass    : chr "Rtable1"
    #   ..$ ncolumns    : int 6
    #   ..$ groupspan   : Named int [1:3] 2 2 2
    #   .. ..- attr(*, "names")= chr [1:3] "A" "B" ""
    #   ..$ transpose   : logi FALSE
    #   ..$ rowlabelhead: chr ""
    #   ..$ caption     : NULL
    #   ..$ footnote    : NULL
    #   ..$ render.strat:function (label, n, transpose = F)  
    

    Within the "obj" attribute, the components headings and labels$strata both hold the duplicated display names, keyed by internal names that are not duplicated (e.g., "x.A" and "x.B"):

    obj <- attr(table1(~x | l1/l2, data = df), "obj")
    obj$headings
    #         x.A   y.A   x.B   y.B   x.overall y.overall
    #         "x"   "y"   "x"   "y"   "x"       "y"      
    # strat_n "100" "100" "100" "100" "200"     "200"    
    obj$labels
    # $strata
    #       x.A       y.A       x.B       y.B x.overall y.overall 
    #       "x"       "y"       "x"       "y"       "x"       "y" 
    # $variables
    # $variables$x
    # [1] "x"
    # attr(,"html")
    # [1] "<span class='varlabel'>x</span>"
    # $groups
    # [1] "A"       "B"       "Overall"
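
    To see which columns collide, one option is to rebuild the column keys from the headings matrix. The sketch below is base R only and uses a hand-typed matrix shaped like obj$headings above. The "label(N=n)" key format is an assumption inferred from the error message, not taken from the t1flex source.

```r
# Hand-typed stand-in for obj$headings (row 1: displayed label, row 2: strat_n)
headings <- matrix(
  c("x", "100", "y", "100", "x", "100", "y", "100", "x", "200", "y", "200"),
  nrow = 2,
  dimnames = list(
    c("", "strat_n"),
    c("x.A", "y.A", "x.B", "y.B", "x.overall", "y.overall")
  )
)

# Assumed key format, inferred from "duplicated col_keys: x(N=100), y(N=100)"
keys <- sprintf("%s(N=%s)", headings[1, ], headings[2, ])
names(keys) <- colnames(headings)

# Flag every member of a duplicated group, not just the later occurrences
is_dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)
colliding <- names(keys)[is_dup]
colliding
# the four colliding columns: x.A, y.A, x.B, y.B
```

    Under that assumption the "Overall" columns (N=200) do not collide here; padding them as well is harmless.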
    

    One technique I use periodically is to append trailing spaces to the duplicate names so that each string becomes unique. This works particularly well when the rendering medium trims or collapses that trailing whitespace automatically.

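    # append 0, 1, 2, ... trailing spaces to successive elements of x
    pad <- function(x) paste0(x, strrep(" ", seq_along(x) - 1))
    
    # row 1 of headings holds the displayed labels; pad within each
    # group of identical labels so that every label becomes unique
    split(obj$headings[1,], obj$headings[1,]) <-
      split(obj$headings[1,], obj$headings[1,]) |> lapply(pad)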
    obj$headings
    #         x.A   y.A   x.B   y.B   x.overall y.overall
    #         "x"   "y"   "x "  "y "  "x  "     "y  "    
    # strat_n "100" "100" "100" "100" "200"     "200"    
    
    split(obj$labels$strata, obj$labels$strata) <-
      split(obj$labels$strata, obj$labels$strata) |> lapply(pad)
    obj$labels$strata
    #       x.A       y.A       x.B       y.B x.overall y.overall 
    #       "x"       "y"      "x "      "y "     "x  "     "y  " 
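
    To see the padding idiom in isolation, here is a minimal base-R sketch on a toy vector shaped like obj$labels$strata. The vector is typed by hand for illustration, not produced by table1.

```r
# Same helper as above: element i of a group gets i - 1 trailing spaces
pad <- function(x) paste0(x, strrep(" ", seq_along(x) - 1))

# Toy stand-in for obj$labels$strata (names unique, values duplicated)
strata <- c(x.A = "x", y.A = "y", x.B = "x", y.B = "y",
            x.overall = "x", y.overall = "y")

# split() groups identical labels; `split<-` writes the padded values
# back into their original positions, and the names are left untouched
split(strata, strata) <- lapply(split(strata, strata), pad)

unname(strata)
# values are now: "x" "y" "x " "y " "x  " "y  "
anyDuplicated(strata)
# 0, i.e. no duplicates remain
```

    Because the padding is applied per group, labels that were already unique keep their original value.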
    

    Now it works:

    attr(tb, "obj") <- obj
    t1flex(tb) %>% flextable::autofit()
    

    (Image: table1 via flextable)
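
    If this has to be done for several tables, the steps above could be wrapped in a small helper. This is only a sketch: pad_dups() and dedupe_t1_headers() are hypothetical names, not part of table1 or flextable, and the code assumes the "obj" attribute has the structure shown by str(tb) above.

```r
# Hypothetical helper: pad duplicated values with trailing spaces
pad_dups <- function(x) {
  pad <- function(v) paste0(v, strrep(" ", seq_along(v) - 1))
  split(x, x) <- lapply(split(x, x), pad)
  x
}

# Hypothetical helper: apply the padding to both places where the
# duplicated labels live, then put the modified "obj" back on the table
dedupe_t1_headers <- function(tb) {
  obj <- attr(tb, "obj")
  obj$headings[1, ] <- pad_dups(obj$headings[1, ])   # row 1: displayed labels
  obj$labels$strata <- pad_dups(obj$labels$strata)
  attr(tb, "obj") <- obj
  tb
}

# Intended usage (requires table1 and flextable, so left commented out):
# tb <- table1(~x | l1/l2, data = df)
# t1flex(dedupe_t1_headers(tb)) %>% flextable::autofit()
```

    The helper only touches the "obj" attribute; the HTML string stored in the table1 object itself is left as it was.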