r label pie-chart

Formatting and organizing labels in R-generated 3-level pie-donut chart


I have the following data set, some R code I generated to plot a 3-level nested pie-donut chart, and the resulting plot.

Family          Genus           Species     Total
Asteraceae      Balsamorhiza    sagittata       1
Asteraceae      Balsamorhiza    sp.             2
Caprifoliaceae  Symphoricarpos  sp.             1
Cupressaceae    Juniperus       sp.             1
Cyperaceae      Carex           sp.             2
Fabaceae        Medicago        sp.             6
Poaceae         Gen.            sp.            13
Poaceae         Agropyron       sp.            22
Poaceae         Agropyron       cristatum       2
Poaceae         Agropyron       desertorum      2
Poaceae         Bromus          inermis         2
Poaceae         Elymus          sp.             2
Poaceae         Festuca         altaica         2
Poaceae         Festuca         sp.             1
Poaceae         Leymus          cinereus        2
Poaceae         Poa             pratensis       7
Rosaceae        Potentilla      sp.             2
Salicaceae      Salix           sp.             2

# Load necessary libraries
library(ggplot2)
library(dplyr)
library(viridis)

# Read the data
data <- read.csv("Book1.csv")

# Aggregate data by Family, Genus, and Species
family_count <- data %>%
  group_by(Family) %>%
  summarise(Total = sum(Total), .groups = 'drop') %>%
  mutate(Level = "Family")

genus_count <- data %>%
  group_by(Family, Genus) %>%
  summarise(Total = sum(Total), .groups = 'drop') %>%
  mutate(Level = "Genus")

species_count <- data %>%
  group_by(Family, Genus, Species) %>%
  summarise(Total = sum(Total), .groups = 'drop') %>%
  mutate(Level = "Species")

# Combine the data for all three levels
combined_data <- bind_rows(
  family_count %>% mutate(Group = Family),
  genus_count %>% mutate(Group = paste(Family, Genus, sep = " - ")),
  species_count %>% mutate(Group = paste(Family, Genus, Species, sep = " - "))
)

# Assign each level to a different radius
combined_data <- combined_data %>%
  mutate(radius = case_when(
    Level == "Family" ~ 3,
    Level == "Genus" ~ 2,
    Level == "Species" ~ 1
  ))

# Create the pie-donut chart
ggplot(combined_data, aes(x = factor(radius), y = Total, fill = Group)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar(theta = "y") +
  scale_fill_viridis(discrete = TRUE, option = "D") +
  theme_void() +
  theme(legend.position = "none") +
  geom_text(aes(label = Group), position = position_stack(vjust = 0.5), size = 2.5) +
  annotate("text", x = 0, y = 0, label = "Labops", size = 6, fontface = "bold", color = "black") +
  annotate("rect", xmin = -0.5, xmax = 0.5, ymin = -0.5, ymax = 0.5, fill = "white")

As the figure shows, the middle and inner layers are labeled with the full Group names, which I built to keep the nested levels aligned. I am struggling to find a way to remove, e.g., "Poaceae - " from the middle-level labels and "Poaceae - Agropyron - " from the inner-level labels without those layers becoming misaligned with the outer level. I would also like each label to include the Total for its slice. There are other things I'd like to fix, such as centering "Labops", narrowing the really wide white gap at the top, and moving labels that don't fit in their slices outside the chart with a line pointing to the slice, but my main concern (and the only thing I'm expecting help with) is the format of these labels.

As requested in a comment on the post, here is the dput output:

> dput(data)
structure(list(Family = c("Asteraceae", "Asteraceae", "Caprifoliaceae", 
"Cupressaceae", "Cyperaceae", "Fabaceae", "Poaceae", "Poaceae", 
"Poaceae", "Poaceae", "Poaceae", "Poaceae", "Poaceae", "Poaceae", 
"Poaceae", "Poaceae", "Rosaceae", "Salicaceae"), Genus = c("Balsamorhiza", 
"Balsamorhiza", "Symphoricarpos", "Juniperus", "Carex", "Medicago", 
"Gen.", "Agropyron", "Agropyron", "Agropyron", "Bromus", "Elymus", 
"Festuca", "Festuca", "Leymus", "Poa", "Potentilla", "Salix"), 
    Species = c("sagittata", "sp.", "sp.", "sp.", "sp.", "sp.", 
    "sp.", "sp.", "cristatum", "desertorum", "inermis", "sp.", 
    "altaica", "sp.", "cinereus", "pratensis", "sp.", "sp."), 
    Total = c(1L, 2L, 1L, 1L, 2L, 6L, 13L, 22L, 2L, 2L, 2L, 2L, 
    2L, 1L, 2L, 7L, 2L, 2L)), class = "data.frame", row.names = c(NA, 
-18L))

Solution

  • I think you just need to use sub here, replacing the regex pattern ".* - " with "". Because ".*" is greedy, this strips everything up to and including the last " - ", so only the final part of each Group name remains. You can use paste to add the totals. Only the label text changes; fill is still mapped to the full Group, so the layers should stay aligned.

    ggplot(combined_data, aes(x = factor(radius), y = Total, fill = Group)) +
      geom_bar(stat = "identity", width = 1, color = "white") +
      coord_polar(theta = "y") +
      scale_fill_viridis(discrete = TRUE, option = "D") +
      theme_void() +
      theme(legend.position = "none") +
    
      # Replace ".* - " with "" using `sub` and add the Total with `paste`
      geom_text(aes(label = paste(sub('.* - ', '', Group), Total, sep = ": ")),
                position = position_stack(vjust = 0.5), size = 2.5) # + ...
    

    You can remove the white rectangle at the top by removing the last annotate() call (the "rect" one).
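
To see what the label expression produces without drawing the plot, here is a minimal base-R sketch. The vectors `group` and `total` are hypothetical stand-ins for one row per level of `combined_data` (their values follow the posted data), and the pattern includes the trailing space so the shortened name has no leading blank.

```r
# Hypothetical stand-ins for combined_data$Group and combined_data$Total,
# one example row each for the Family, Genus, and Species levels
group <- c("Poaceae",
           "Poaceae - Agropyron",
           "Poaceae - Agropyron - cristatum")
total <- c(55L, 26L, 2L)

# Greedy ".* - " matches everything up to and including the last " - ";
# strings without " - " (the Family level) are returned unchanged
short <- sub(".* - ", "", group)

# Append the total to each shortened name
label <- paste(short, total, sep = ": ")
print(label)
```

The same expression can also be stored in a column of `combined_data` beforehand and mapped with `aes(label = ...)`, if you prefer to keep the ggplot call shorter.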