I have a dataset that looks something like this:
results <- as.data.frame(cbind(c("Violence", "Violence", "Violence", "Violence", "Economic", "Economic","Economic","Economic","Institutional","Institutional","Institutional","Institutional"),
c("No", "No", "Yes", "Yes","No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes"),
c("Yes", "No", "Yes", "No","Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No"),
c(3,3,1,3,4,5,8,7,6,5,4,3)))
colnames(results) <- c("Type", "Test1", "Test2", "Freq")
Then I create an alluvial plot with ggalluvial:
library(ggplot2)
library(tidyverse)
library(ggalluvial)
ggplot(data = results,
aes(axis1 = Type, axis2 = Test1, axis3 = Test2,
y = Freq)) +
scale_x_discrete(limits = c("Article", "False 0s Removed", "New Flow Measure"), expand = c(.2, .05)) +
xlab("Results") +
geom_flow(aes(fill = Type)) +
geom_stratum() +
geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
theme_minimal() +
ggtitle("Replication Summary")
This looks fine, except that within each stratum I want the vertical order to be grouped by color (Type), so that each stratum works like a stacked bar chart showing what percentage of each Type is No and Yes for each test. How can I change the plot so that the vertical ordering is grouped by color (Type) at each stratum (Test1 and Test2)? Currently the second stratum (Test1) looks good, but the third (Test2) does not.
If I understood it correctly, the only thing you have to add is aes.bind = 'flow' in geom_flow().
results <- data.frame(Type = c("Violence", "Violence", "Violence", "Violence", "Economic", "Economic","Economic","Economic","Institutional","Institutional","Institutional","Institutional"),
Test1 = c("No", "No", "Yes", "Yes","No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes"),
Test2 = c("Yes", "No", "Yes", "No","Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No"),
Freq = c(3,3,1,3,4,5,8,7,6,5,4,3)
)
library(ggplot2)
library(tidyverse)
library(ggalluvial)
ggplot(data = results,
aes(axis1 = Type, axis2 = Test1, axis3 = Test2,
y = Freq)) +
scale_x_discrete(limits = c("Article", "False 0s Removed", "New Flow Measure"), expand = c(.2, .05)) +
xlab("Results") +
geom_flow(aes(fill = Type), aes.bind = 'flow') +
geom_stratum() +
geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
theme_minimal() +
ggtitle("Replication Summary")
Created on 2023-03-07 by the reprex package (v2.0.1)
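If it helps to check the plot against numbers, the shares each stratum should display can be tabulated directly. This is only a sketch in base R: it rebuilds the same example data with rep() for brevity, and the names test1_counts, test1_share, test2_counts and test2_share are my own, not part of ggalluvial.

```r
# Same example data as above, written more compactly with rep().
results <- data.frame(
  Type  = rep(c("Violence", "Economic", "Institutional"), each = 4),
  Test1 = rep(c("No", "No", "Yes", "Yes"), times = 3),
  Test2 = rep(c("Yes", "No"), times = 6),
  Freq  = c(3, 3, 1, 3, 4, 5, 8, 7, 6, 5, 4, 3)
)

# Sum Freq for every Type x answer combination, one table per test.
test1_counts <- xtabs(Freq ~ Type + Test1, data = results)
test2_counts <- xtabs(Freq ~ Type + Test2, data = results)

# margin = 1 turns each row (each Type) into shares that sum to 1.
test1_share <- prop.table(test1_counts, margin = 1)
test2_share <- prop.table(test2_counts, margin = 1)

# Percentages of No/Yes within each Type.
print(round(100 * test1_share, 1))
print(round(100 * test2_share, 1))
```

Within each Type, the relative heights of the No and Yes segments in the plot should match these row percentages.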
EDIT: I am not quite sure whether it is possible to get the colors in the strata with geom_flow(), but you can do it with geom_alluvium(). For this I changed the way the example data is generated, because in your example Freq was not numeric and geom_alluvium() threw an error. Now you can add the fill aesthetic to geom_stratum(). If a stratum cannot be filled by a single Type, its color will be NA. If you add scale_fill_discrete(na.value = NA), these strata become transparent and you can see the colors.
ggplot(data = results,
aes(axis1 = Type, axis2 = Test1, axis3 = Test2,
y = Freq)) +
scale_x_discrete(limits = c("Article", "False 0s Removed", "New Flow Measure"), expand = c(.2, .05)) +
xlab("Results") +
geom_alluvium(aes(fill = Type), aes.bind = "alluvia") +
geom_stratum(aes(fill = Type)) +
scale_fill_discrete(na.value = NA) +
geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
theme_minimal() +
ggtitle("Replication Summary")
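As for why Freq was not numeric in the original data: cbind() builds a matrix, a matrix can hold only one type, so the numbers are converted to strings. A small sketch with toy values (type, freq, via_cbind, via_df and fixed are names made up for this illustration):

```r
# Toy vectors standing in for the real columns.
type <- c("Violence", "Economic")
freq <- c(3, 4)

# cbind() coerces everything to character before as.data.frame() sees it.
via_cbind <- as.data.frame(cbind(Type = type, Freq = freq))
print(is.numeric(via_cbind$Freq))  # FALSE

# data.frame() keeps each column's own type.
via_df <- data.frame(Type = type, Freq = freq)
print(is.numeric(via_df$Freq))     # TRUE

# Repairing an existing data frame; as.character() first also covers
# older R versions, where the column would be a factor.
fixed <- via_cbind
fixed$Freq <- as.numeric(as.character(fixed$Freq))
```

So either build the data with data.frame() as above, or convert Freq after the fact.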