rlatexsummaryxtable

R to Latex summary table for categorical variables by year


year <- c(2000,2000,2000,2001,2001,2001)
gender <- c("F","M","M","F","F","M")
grade <- c("A","B","C","C","B","A")
df <- data.frame(year,gender,grade)

I would like to make a summary table but minimize manual codes and automates the process as much as possible. In my project, I have 170 variables to summarise. I tried tidyverse group by but didn't get the result that I want. I will use xtable to move to latex file. (I tried add.to.row but failed to add "Gender" on the first line.)

Here is the outcome that I want.

enter image description here

Please help me to draw this table. I need the variable names in the table.


Solution

  • You can use pivot_longer and summarise to generate the summary values.

    library(tidyverse)
    
    df %>% 
      pivot_longer(-year) %>% 
      group_by(year, name, value) %>% 
      summarise(n = n()) %>% 
      mutate(prop = round(n / sum(n), 3) * 100)
    
    # A tibble: 10 x 5
    # Groups:   year, name [4]
        year name   value     n  prop
       <dbl> <chr>  <chr> <int> <dbl>
     1  2000 gender F         1  33.3
     2  2000 gender M         2  66.7
     3  2000 grade  A         1  33.3
     4  2000 grade  B         1  33.3
     5  2000 grade  C         1  33.3
     6  2001 gender F         2  66.7
     7  2001 gender M         1  33.3
     8  2001 grade  A         1  33.3
     9  2001 grade  B         1  33.3
    10  2001 grade  C         1  33.3
    

    You can also get closer to your desired table by joining values in a formatted string, and then using pivot_wider:

    df %>% 
      pivot_longer(-year) %>% 
      group_by(year, name, value) %>% 
      summarise(n = n()) %>% 
      mutate(prop = round(n / sum(n), 3) * 100,
             summary_str = glue::glue("{n}({prop}%)")) %>% 
      pivot_wider(id_cols = c(name, value), names_from = "year", 
                  values_from = "summary_str") 
    
      name   value `2000`   `2001`  
      <chr>  <chr> <glue>   <glue>  
    1 gender F     1(33.3%) 2(66.7%)
    2 gender M     2(66.7%) 1(33.3%)
    3 grade  A     1(33.3%) 1(33.3%)
    4 grade  B     1(33.3%) 1(33.3%)
    5 grade  C     1(33.3%) 1(33.3%)