I have a question very similar to the one asked in "Half violin plot with different factors in R", which was perfectly answered by @AllanCameron. In addition to the data in that question, I also have different grades:
student_id group grade test_id Score
145 Treatment 2 pre 0.12
145 Treatment 3 post 0.78
109 Treatment 5 pre 0.45
109 Treatment 5 post 0.99
195 Treatment 4 pre 0.22
195 Treatment 4 post 0.75
119 Treatment 6 pre 0.15
119 Treatment 6 post 0.59
I would like to make a half-violin plot where one factor is the pre- and posttest and the two halves consist of the 3rd/4th grade for the posttest and the 5th/6th grade for the pretest.
I've played around with the code provided in the previous answer, but the only thing I came up with is plotting the two grades separately as factors and then cutting and pasting them together. Not very elegant! I hope someone has a better way of achieving this.
Here is a MWE:
set.seed(1)
data <- data.frame(
  group = rep(sample(c('Treatment', 'Control'), 50, TRUE),
              each = 2),
  test_id = rep(c('pre', 'post'), 50),
  grade = sample(3:6, 100, replace = TRUE),
  Score = runif(100)
)
library(ggplot2)
library(see)
ggplot(data, aes(test_id, Score, fill = grade)) +
  geom_boxplot(width = 0.1, position = position_dodge(0.2)) +
  geom_violinhalf(aes(group = interaction(test_id, grade)), fill = 'gray',
                  trim = FALSE, flip = c(1, 2)) +
  theme_classic(16)
This produces the undesired plot.
I'm not 100% sure whether you want separate boxplots for the groups or just one boxplot, but to fix the issue with the violin plots you can filter the data used for geom_violinhalf, i.e. include only grades 3 and 4 for the pre-test data and grades 5 and 6 for the post-test data. Additionally, as you now have four groups, you have to set flip = c(1, 3) to flip the left-hand violins.
Note: For the reprex I mapped grade on fill in geom_violinhalf to check and show that it displays the right grades.
library(ggplot2)
library(see)
ggplot(data, aes(test_id, Score, fill = group)) +
  geom_boxplot(width = 0.1, position = position_dodge(0.2)) +
  geom_violinhalf(
    data = ~ subset(
      .x,
      (test_id %in% "pre" & grade %in% c(3, 4)) |
        (test_id %in% "post" & grade %in% c(5, 6))
    ),
    aes(group = interaction(test_id, grade), fill = factor(grade)),
    #fill = "gray",
    trim = FALSE,
    flip = c(1, 3)
  ) +
  scale_x_discrete(limits = c("pre", "post")) +
  theme_classic()
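As a quick sanity check on the filter, here is a minimal base-R sketch that applies the same condition outside of ggplot and cross-tabulates the result. It rebuilds the example data from the question; the name violin_data is only an illustrative choice, not something the plotting code relies on.

```r
# Rebuild the example data from the question (same seed, same calls).
set.seed(1)
data <- data.frame(
  group = rep(sample(c("Treatment", "Control"), 50, TRUE), each = 2),
  test_id = rep(c("pre", "post"), 50),
  grade = sample(3:6, 100, replace = TRUE),
  Score = runif(100)
)

# Same condition as in the data argument of geom_violinhalf.
# violin_data is a hypothetical name used only for this check.
violin_data <- subset(
  data,
  (test_id %in% "pre" & grade %in% c(3, 4)) |
    (test_id %in% "post" & grade %in% c(5, 6))
)

# Cross-tabulate test_id against grade: only the pre/3, pre/4,
# post/5 and post/6 cells should have non-zero counts.
table(violin_data$test_id, violin_data$grade)
```

If any other cell is non-zero, the violins would show a grade that does not belong to that test.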