I have a scatterpie plot with pies plotted over the x and y axes and a "trend line" connecting them. In the spirit of this answer, I would like to add an annotation over each line segment showing the percent increase/decrease between the y-values of each pair of adjacent pies.
library(tidyverse)
library(scatterpie)
my_df <- structure(list(day_in_july = 13:20, yes_and_yes = c(0.611814345991561,
0.574750830564784, 0.593323216995448, 0.610539845758355, 0.650602409638554,
0.57429718875502, 0.575971731448763, 0.545454545454545), yes_but_no = c(0.388185654008439,
0.425249169435216, 0.406676783004552, 0.389460154241645, 0.349397590361446,
0.42570281124498, 0.424028268551237, 0.454545454545455), y = c(0.388185654008439,
0.425249169435216, 0.406676783004552, 0.389460154241645, 0.349397590361446,
0.42570281124498, 0.424028268551237, 0.454545454545455)), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
xvals <- my_df$day_in_july  # one x-axis break per plotted day
p <- ggplot(data = my_df) +
  geom_path(aes(x = day_in_july, y = y * 50)) +
  geom_scatterpie(aes(x = day_in_july, y = y * 50, r = 0.3),
                  data = my_df,
                  cols = colnames(my_df)[2:3],
                  color = "red") +
  geom_text(aes(y = y * 50, x = day_in_july,
                label = paste0(formatC(y * 100, digits = 3), "%")),
            nudge_y = 0.07, nudge_x = -0.25, size = 3) +
  geom_text(aes(y = y * 50, x = day_in_july,
                label = paste0(formatC((1 - y) * 100, digits = 3), "%")),
            nudge_y = -0.07, nudge_x = 0.25, size = 3) +
  scale_fill_manual(values = c("pink", "seagreen3")) +
  scale_x_continuous(labels = xvals, breaks = xvals) +
  scale_y_continuous(name = "yes but no",
                     labels = function(x) x / 50) +
  coord_fixed()
p
The y-value of the first pie (at day_in_july = 13) is 0.388. From this y-value to the next pie's y-value (0.425) there's a percent increase of 9.55%. Therefore, I want to label the line that connects the two pies with +9.55%.
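To make the arithmetic concrete, here is a minimal base-R sketch of the percent change between each pair of consecutive y-values. The vector y is copied (rounded) from my_df$y; pct_change and labels are names chosen only for this illustration.

```r
# y-values copied from my_df$y, rounded to 6 decimals for brevity
y <- c(0.388186, 0.425249, 0.406677, 0.389460,
       0.349398, 0.425703, 0.424028, 0.454545)

# Percent change from each value to the next:
# (next - current) / current * 100
pct_change <- diff(y) / head(y, -1) * 100

# Signed, two-decimal labels; there is one label per line segment,
# so 8 pies give 7 labels
labels <- sprintf("%+.2f%%", pct_change)
print(labels)
```

There is one fewer label than there are pies, because each label belongs to a segment between two pies.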
This answer already has the relevant mechanism to get what I'm looking for. The idea is to use ggplot_build() to access the data underlying the plot, calculate the percent change between consecutive values, and then rebuild the plot with the lines annotated accordingly. However, this solution isn't working for me with the scatterpie plot, because the data returned by ggplot_build() has a very different structure.
# layer 2 is geom_scatterpie (layer 1 is geom_path)
plot_data <- ggplot_build(p)$data[[2]] %>% as_tibble()
plot_data
## # A tibble: 2,904 x 13
## fill group index amount PANEL stringsAsFactors nControl x y colour size linetype alpha
## <chr> <chr> <dbl> <dbl> <fct> <lgl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <lgl>
## 1 pink 1 0 0.612 1 FALSE 221 13 19.7 red 0.5 1 NA
## 2 pink 1 0.00452 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 3 pink 1 0.00905 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 4 pink 1 0.0136 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 5 pink 1 0.0181 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 6 pink 1 0.0226 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 7 pink 1 0.0271 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 8 pink 1 0.0317 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 9 pink 1 0.0362 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 10 pink 1 0.0407 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## # ... with 2,894 more rows
Where are the actual y-values that I need for calculating the percent change between the pies' y-values? Obviously, I can get the y-values from the original data. But the data from ggplot_build() doesn't make sense to me as a basis for reconstructing the plot, and I don't know how to use the technique to add the percent change between pies to the plot line.
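For what it's worth, the per-pie coordinates are easier to read off a geom_path() layer than off the pie layer, since a path keeps one row per point while each pie is expanded into many arc points. A minimal sketch, assuming only ggplot2 and a toy data frame (toy, q, path_data and recovered_y are illustrative names, not part of the code above):

```r
library(ggplot2)

# Toy stand-in for my_df: only the columns the path layer needs
toy <- data.frame(day_in_july = 13:15, y = c(0.388, 0.425, 0.407))

q <- ggplot(toy) +
  geom_path(aes(x = day_in_july, y = y * 50))

# ggplot_build()$data holds one data frame per layer,
# in the order the layers were added to the plot
path_data <- ggplot_build(q)$data[[1]]

# One row per point; undo the * 50 scaling to recover the original y-values
recovered_y <- path_data$y / 50
print(data.frame(x = path_data$x, y = recovered_y))
```

In a plot like p, where geom_path() is the first layer, the same lookup would apply to element 1 of the built data.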
Here is my attempt with the ggrepel package. I created a new data frame, foo, containing the information that geom_label_repel() needs: a label position for each line segment and the percent change to display. I omit the details of how I built foo, but the code should be readable. I spent some time looking for good label positions, and this is the best I could do for now. If you are not happy with the positions, you will need to adjust the nudge and justification arguments yourself.
library(ggrepel)
foo <- tibble(day_in_july = my_df$day_in_july + 0.5,
              y = my_df$y * 50 + (lead(my_df$y * 50) - my_df$y * 50) / 2,
              gap = (lead(my_df$yes_but_no) / my_df$yes_but_no - 1) * 100) %>%
  filter(!is.na(gap)) %>%  # the last day has no following pie
  # pick the colour from the numeric gap before turning it into a label
  mutate(hue = ifelse(gap > 0, "green", "red"),
         gap = sprintf("%+.2f%%", gap))
p <- ggplot(data = my_df) +
  geom_path(aes(x = day_in_july, y = y * 50)) +
  geom_scatterpie(aes(x = day_in_july, y = y * 50, r = 0.3),
                  data = my_df,
                  cols = colnames(my_df)[2:3],
                  color = "red") +
  geom_text(aes(y = y * 50, x = day_in_july,
                label = paste0(formatC(y * 100, digits = 3), "%")),
            nudge_y = 0.07, nudge_x = -0.25, size = 3) +
  geom_text(aes(y = y * 50, x = day_in_july,
                label = paste0(formatC((1 - y) * 100, digits = 3), "%")),
            nudge_y = -0.07, nudge_x = 0.25, size = 3) +
  scale_fill_manual(values = c("pink", "seagreen3")) +
  geom_label_repel(data = foo,
                   aes(x = day_in_july, y = y,
                       color = hue, label = as.character(gap)),
                   show.legend = FALSE,
                   nudge_x = 0.3,
                   direction = "y",
                   vjust = -1.0) +
  # named values keep each hue tied to its own colour
  scale_color_manual(values = c(green = "green", red = "red"))
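The label positions in foo are simply the midpoints of the line segments. As a rough illustration of that idea, here is a base-R sketch that computes the same midpoints without lead(); the values are copied (rounded) from my_df, and mid_x, mid_y and midpoints are names made up for this example.

```r
# Values copied (rounded) from my_df; the plot draws y * 50
x <- 13:20
y <- c(0.388186, 0.425249, 0.406677, 0.389460,
       0.349398, 0.425703, 0.424028, 0.454545) * 50

n <- length(x)

# Midpoint of each segment: the average of its two endpoints.
# x[-n] drops the last point, x[-1] drops the first, so the
# two vectors line up as (start, end) pairs of each segment.
mid_x <- (x[-n] + x[-1]) / 2
mid_y <- (y[-n] + y[-1]) / 2

# Equivalent to day_in_july + 0.5 and y + (lead(y) - y) / 2,
# but with no trailing NA row, since only 7 segments exist
midpoints <- data.frame(x = mid_x, y = mid_y)
print(midpoints)
```

Because the days are evenly spaced by 1, the x midpoint is always the left-hand day plus 0.5, which is why foo can use day_in_july + 0.5 directly.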