I'm trying to make a proportion bar plot in R. I'm almost done, except for one (extremely pesky) detail. The data I use were imported from a .dta file, and R seems slightly confused as to whether the variables are factors or continuous (e.g. a scale with numbers appears on the x-axis), like this:
proportion_bars <- ggplot(prep_data, aes(x = sex_work_r,
                                         fill = factor(abuse, labels = c("no", "yes")))) +
  geom_bar(position = "fill") +
  labs(x = "Sex work", y = "proportion", fill = "Were you ever abused?")
proportion_bars
The ggplot function also does not automatically use the labels attached to the factor levels, so I'm trying to set the labels through a character vector in the ggplot call, but for some reason this doesn't work.
What I would like to do is to add labels to the separate bars in the plot.
For a reproducible example:
I tried this (with the data file I'm using)
proportion_bars <- ggplot(titanic_train, aes(x = Sex,
                                             fill = factor(Survived,
                                                           labels = c("yes", "no")))) +
  scale_x_discrete(limit = c("F", "M")) +
  geom_bar(position = "fill") +
  labs(x = "Sex", y = "proportion survival", fill = "did they survive?")
I would just like the bars to stay symmetrical (as in the first picture, but without the numbers on the x-axis and with labels for each bar).
All the information I can find is about complicated things like printing values from a vector right above each bar. But I can't find anything on this particular issue.
Does anyone have any tips on how to solve this?
Simply convert the variable mapped on x to a factor too, as you already did for the variable you mapped on fill.
Using a minimal reproducible example based on some fake random data:
library(ggplot2)

set.seed(123)

prep_data <- expand.grid(
  sex_work_r = sample(0:1, 100, replace = TRUE),
  abuse = sample(0:1, 100, replace = TRUE)
)

ggplot(prep_data, aes(
  x = factor(sex_work_r, labels = c("no", "yes")),
  fill = factor(abuse, labels = c("no", "yes"))
)) +
  geom_bar(position = "fill") +
  labs(x = "Sex work", y = "proportion", fill = "Were you ever abused?")
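If you would rather keep the ggplot() call tidy, one option is to convert the columns once in the data frame instead of inside aes(). The following is only a sketch: it assumes both variables are coded 0/1 (as in the fake data above, which may not match your real .dta coding), and yes_no is a hypothetical helper written here for illustration, not a function from any package. Passing levels explicitly makes it clear which code gets which label, rather than relying on the sorted order of the values.

```r
library(ggplot2)

set.seed(123)

# Fake stand-in for the imported data: two columns coded 0/1 (an assumption)
prep_data <- data.frame(
  sex_work_r = sample(0:1, 100, replace = TRUE),
  abuse = sample(0:1, 100, replace = TRUE)
)

# Hypothetical helper: map code 0 to "no" and code 1 to "yes"
yes_no <- function(x) factor(x, levels = c(0, 1), labels = c("no", "yes"))

# Convert both columns once, so every later plot sees factors
prep_data$sex_work_r <- yes_no(prep_data$sex_work_r)
prep_data$abuse <- yes_no(prep_data$abuse)

# Both aesthetics are now discrete, so the x-axis shows "no"/"yes"
proportion_bars <- ggplot(prep_data, aes(x = sex_work_r, fill = abuse)) +
  geom_bar(position = "fill") +
  labs(x = "Sex work", y = "proportion", fill = "Were you ever abused?")
proportion_bars
```

Because the conversion happens in the data, the same labels carry over to any other plot or table built from prep_data.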