I'm working on some plots using ggplot2 to represent Likert scale data, and I need to repel some labels but not others. Following plenty of answers found here on StackOverflow, I came up with the following code, but the repelled labels end up in the wrong location on the plot.
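#### Minimum Working Dataset
### Packages
library(tidyverse) # tibble, dplyr, ggplot2, forcats
library(scales)    # label_percent(), percent_format()
library(ggrepel)   # geom_label_repel()
### Data Creation
## Initial Tibble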
survey <- tibble(
question_n = 1,
answer = c("Somewhat Agree", "Somewhat Disagree", "Strongly Agree", "Strongly Disagree"),
n = c(90, 12, 199, 4),
respondents = 305,
pct = n / respondents
)
## Factor levels for answers
survey$answer <- factor(survey$answer,
levels = c("Strongly Agree", "Somewhat Agree",
"Somewhat Disagree", "Strongly Disagree"))
From here, it's a simple enough ggplot to create:
### Plot
survey %>%
ggplot(aes(x = pct, y = 1, fill = fct_rev(answer))) +
geom_col(color = "black") +
theme_minimal() +
scale_x_continuous(labels = label_percent(),
# Expand so the labels aren't off-plot
expand = expansion(mult = c(0.025, 0.025))) +
scale_y_discrete(labels = NULL) +
geom_label(aes(label = percent_format(accuracy = 1)(pct),
color = fct_rev(answer)),
fill = "white",
size = 3.25,
fontface = "bold",
label.size = 1,
label.r = unit(2.5, "pt"),
show.legend = FALSE,
position = position_stack(vjust = 0.5, reverse = FALSE)) +
scale_fill_manual(values = c("tomato4", "tomato", "royalblue", "royalblue4")) +
scale_color_manual(values = c("tomato4", "tomato", "royalblue", "royalblue4"), guide = "none") +
guides(fill = guide_legend(position = "bottom", nrow = 2, reverse = TRUE)) +
labs(
title = NULL,
subtitle = NULL,
caption = paste("Respondents N =", survey[1,]$respondents),
fill = NULL,
color = NULL,
x = NULL,
y = NULL
)
Obviously, this needs a geom_label_repel()! The "Disagree" response labels overlap with one another. So, I changed geom_label() to geom_label_repel():
### Plot
survey %>%
ggplot(aes(x = pct, y = 1, fill = fct_rev(answer))) +
geom_col(color = "black") +
theme_minimal() +
scale_x_continuous(labels = label_percent(),
# Expand so the labels aren't off-plot
expand = expansion(mult = c(0.025, 0.025))) +
scale_y_discrete(labels = NULL) +
geom_label_repel(aes(label = percent_format(accuracy = 1)(pct),
color = fct_rev(answer)),
# Filter data to only less than 5.5% for repel; labels fit otherwise
# data = . %>% filter(pct < 0.055),
fill = "white",
size = 3.25,
fontface = "bold",
label.size = 1,
label.r = unit(2.5, "pt"),
show.legend = FALSE,
position = position_stack(vjust = 0.5, reverse = FALSE),
# Set direction so that repel is only "up" or "down" on plot
direction = "y",
# Set ylim to prevent labels going off the bar
ylim = c(.6, 1.3),
# Set seed so they always place in same position
seed = 12345
) +
scale_fill_manual(values = c("tomato4", "tomato", "royalblue", "royalblue4")) +
scale_color_manual(values = c("tomato4", "tomato", "royalblue", "royalblue4"), guide = "none") +
guides(fill = guide_legend(position = "bottom", nrow = 2, reverse = TRUE)) +
labs(
title = NULL,
subtitle = NULL,
caption = paste("Respondents N =", survey[1,]$respondents),
fill = NULL,
color = NULL,
x = NULL,
y = NULL
)
This technically works, but it looks really messy: the 65% and 30% labels have been repelled even though there was no need to move them. So, finally, I tried to include both geom_label() and geom_label_repel():
### Plot
survey %>%
ggplot(aes(x = pct, y = 1, fill = fct_rev(answer))) +
geom_col(color = "black") +
theme_minimal() +
scale_x_continuous(labels = label_percent(),
# Expand so the labels aren't off-plot
expand = expansion(mult = c(0.025, 0.025))) +
scale_y_discrete(labels = NULL) +
geom_label_repel(aes(label = percent_format(accuracy = 1)(pct),
color = fct_rev(answer)),
# Filter data to only less than 5.5% for repel; labels fit otherwise
data = . %>% filter(pct < 0.055),
fill = "white",
size = 3.25,
fontface = "bold",
label.size = 1,
label.r = unit(2.5, "pt"),
show.legend = FALSE,
position = position_stack(vjust = 0.5, reverse = FALSE),
# Set direction so that repel is only "up" or "down" on plot
direction = "y",
# Set ylim to prevent labels going off the bar
ylim = c(.6, 1.3),
# Set seed so they always place in same position
seed = 12345
) +
geom_label(aes(label = percent_format(accuracy = 1)(pct),
color = fct_rev(answer)),
# Filter data to everything greater than 5.5%; no need to repel these items
data = . %>% filter(pct >= 0.055),
fill = "white",
size = 3.25,
fontface = "bold",
label.size = 1,
label.r = unit(2.5, "pt"),
show.legend = FALSE,
position = position_stack(vjust = 0.5, reverse = FALSE)) +
scale_fill_manual(values = c("tomato4", "tomato", "royalblue", "royalblue4")) +
scale_color_manual(values = c("tomato4", "tomato", "royalblue", "royalblue4"), guide = "none") +
guides(fill = guide_legend(position = "bottom", nrow = 2, reverse = TRUE)) +
labs(
title = NULL,
subtitle = NULL,
caption = paste("Respondents N =", survey[1,]$respondents),
fill = NULL,
color = NULL,
x = NULL,
y = NULL
)
So, the 65% and 30% are in the right position, but the 4% and 1% are now in the wrong x position. I've tried a few things to adjust this, like adding an x = value to the aes(), specifying a nudge_x = position instead of position_stack(), and several others I can't recall right now. I've been tearing my hair out for the last few hours trying to solve this.
I need the 65% and 30% to stay where they are, and the other two values to sit at their correct x positions while remaining nudged along y as they are now. Any suggestions?
Thanks for the comments on the post, and the answer provided above. I combined the two to reach a solution that works great. In both geom_label_repel() and geom_label(), I used a label = ifelse() statement rather than filtering the data itself, so each layer still receives every row and position_stack() computes the same positions as it does for the bars. Thanks again to the commenter and the proposed answer for helping me solidify this result.
# Set repel threshold
threshold <- 0.055
# Plot
survey %>%
ggplot(aes(x = pct, y = 1, fill = fct_rev(answer))) +
geom_col(color = "black") +
theme_minimal() +
scale_x_continuous(labels = label_percent(),
# Expand so the labels aren't off-plot
expand = expansion(mult = c(0.025, 0.025))) +
scale_y_discrete(labels = NULL) +
geom_label_repel(aes(label = ifelse(pct < threshold, percent_format(accuracy = 1)(pct), NA),
color = fct_rev(answer)),
fill = "white",
size = 3.25,
fontface = "bold",
label.size = 1,
label.r = unit(2.5, "pt"),
show.legend = FALSE,
na.rm = TRUE,
position = position_stack(vjust = 0.5, reverse = FALSE),
# Set direction so that repel is only "up" or "down" on plot
direction = "y",
# Set ylim to prevent labels going off the bar
ylim = c(.6, 1.3),
# Set seed so they always place in same position
seed = 12345
) +
geom_label(aes(label = ifelse(pct >= threshold, percent_format(accuracy = 1)(pct), NA),
color = fct_rev(answer)),
fill = "white",
size = 3.25,
fontface = "bold",
label.size = 1,
label.r = unit(2.5, "pt"),
show.legend = FALSE,
na.rm = TRUE,
position = position_stack(vjust = 0.5, reverse = FALSE)) +
scale_fill_manual(values = c("tomato4", "tomato", "royalblue", "royalblue4")) +
scale_color_manual(values = c("tomato4", "tomato", "royalblue", "royalblue4"), guide = "none") +
guides(fill = guide_legend(position = "bottom", nrow = 2, reverse = TRUE)) +
labs(
title = NULL,
subtitle = NULL,
caption = paste("Respondents N =", survey[1,]$respondents),
fill = NULL,
color = NULL,
x = NULL,
y = NULL
)
It also ended up working for all 20-ish plots in my .Rmd file, with some modifications here and there to the seed = argument to get the labels to land where I wanted.
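For anyone wondering why filtering the data moved the small labels, here is a minimal sketch that reproduces the stacking arithmetic by hand, without drawing anything. The helper stack_midpoints() is hypothetical (it is not part of ggplot2), and it assumes the segments are stacked from 0 in factor-level order, which matches what the plot shows ("Strongly Agree" starting at the left edge). It only approximates what position_stack(vjust = 0.5) does for this dataset.

```r
library(dplyr)

# Same data as in the question, rebuilt so the sketch is self-contained
survey <- tibble(
  answer = factor(
    c("Somewhat Agree", "Somewhat Disagree", "Strongly Agree", "Strongly Disagree"),
    levels = c("Strongly Agree", "Somewhat Agree",
               "Somewhat Disagree", "Strongly Disagree")
  ),
  n = c(90, 12, 199, 4),
  pct = n / 305
)

# Hypothetical helper: midpoint of each segment when the rows it is given
# are stacked from 0 in factor-level order (assumed to match the plot)
stack_midpoints <- function(df) {
  df %>%
    arrange(answer) %>%
    mutate(mid = cumsum(pct) - pct / 2)
}

# All four rows: the small segments sit at the right-hand end of the bar
full <- stack_midpoints(survey)

# Only the rows below the threshold: stacking restarts from 0
filtered <- survey %>%
  filter(pct < 0.055) %>%
  stack_midpoints()

print(full)
print(filtered)
```

With all four rows, the two "Disagree" midpoints come out at roughly 0.97 and 0.99; with only the filtered rows they drop to roughly 0.02 and 0.05, which is where the repelled labels were landing. Keeping every row in the layer and blanking the unwanted labels with ifelse() avoids that restart.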