rtibble

Create a crosstable with integer and percentages together in the same cell


I filter humans in the starwars dataset. Using the ‘sex’ and ‘skin_color’ columns in the humans dataset, I want to create the crosstable as follows:

enter image description here

my trial:

# Load the starwars dataset 
library(janitor)
library(dplyr)
library(tibble)
starwars_data <- as_tibble(starwars)

# Subset the humans from the starwars dataset
humans_data <- starwars_data %>%
  filter(species == "Human")

# Create the crosstab
crosstab <- table(humans_data$sex, humans_data$skin_color)

# Add row and column totals
crosstab <- addmargins(crosstab)

# Calculate row percentages
row_percentages <- prop.table(crosstab, margin = 1) * 100

# Combine the crosstab and row percentages
crosstab_with_percentages <- cbind(crosstab, row_percentages)

# Print the result
print(crosstab_with_percentages)

dark fair light none pale tan white Sum     dark     fair     light     none     pale      tan    white
female    0    3     5    1    0   0     0   9 0.000000 16.66667 27.777778 5.555556 0.000000 0.000000 0.000000
male      4   13     5    0    1   2     1  26 7.692308 25.00000  9.615385 0.000000 1.923077 3.846154 1.923077
Sum       4   16    10    1    1   2     1  35 5.714286 22.85714 14.285714 1.428571 1.428571 2.857143 1.428571
       Sum
female  50
male    50
Sum     50

Solution

  • You could do this as a pedestrian data-wrangling task in tidyverse:

    library(tidyverse)
    
    starwars %>%
      filter(species == "Human") %>%
      with(table(sex, skin_color)) %>%
      as.data.frame() %>%
      bind_rows(
        bind_cols(tibble(sex = "Total"),
                  summarise(., Freq = sum(Freq), .by = skin_color))) %>%
      mutate(prop = Freq/sum(Freq), .by = sex) %>%
      mutate(prop = paste0(Freq, " (", scales::percent(prop, 1), ")")) %>%
      select(-Freq) %>%
      pivot_wider(names_from = skin_color, values_from = prop) %>%
      rowwise() %>%
      mutate(Sum = sum(sapply(strsplit(c_across(-1), " "),
                              \(x) as.numeric(x[1])))) %>%
      mutate(Sum = paste(Sum, "(100%)")) %>%
      rename(`sex / skin_color` = sex) %>%
      as.data.frame(check.names = FALSE)
    #>   sex / skin_color    dark     fair    light   pale    tan  white       Sum
    #> 1           female  0 (0%)  3 (33%)  6 (67%) 0 (0%) 0 (0%) 0 (0%)  9 (100%)
    #> 2             male 4 (15%) 13 (50%)  5 (19%) 1 (4%) 2 (8%) 1 (4%) 26 (100%)
    #> 3            Total 4 (11%) 16 (46%) 11 (31%) 1 (3%) 2 (6%) 1 (3%) 35 (100%)
    

    Created on 2024-02-01 with reprex v2.0.2