r ggplot2 bar-chart labeling

Setting string labels to bar lines in bar graph ggplot


I'm trying to make a proportion bar plot in R. I'm almost done, except for one (extremely pesky) detail. The data I use were imported from a .dta file, and R seems confused about whether the variables are factors or continuous (e.g. a numeric scale appears on the x-axis), like this:

proportion_bars <- ggplot(prep_data, aes(x = sex_work_r,
                                         fill = factor(abuse, labels = c("no", "yes")))) +
  geom_bar(position = "fill") +
  labs(x = "Sex work", y = "proportion", fill = "Were you ever abused?")

proportion_bars

[Image: the resulting bar plot, with a numeric scale on the x-axis]

ggplot also does not automatically pick up the value labels attached to the variable, so I'm trying to set the labels with a character vector inside ggplot, but for some reason this doesn't work.

What I would like to do is to add labels to the separate bars in the plot.

For a reproducible example:

I tried this (with the data file I'm using)

proportion_bars <- ggplot(titanic_train,aes(x = Sex,
                                    fill = factor(Survived, 
                                                  labels = c("yes", "no"))))+ 
  scale_x_discrete(limit =(c("F", "M"))) +
  geom_bar(position = "fill") +
  labs(x = "Sex", y = "proportion survival", fill = "did they survive?")

But this is what I get:

[Image: the resulting bar plot]

I would just like the bars to stay symmetrical (as in the first picture, but without the numbers on the x-axis and with labels for each bar).

All the information I can find is about complicated things like printing values from a vector right above each bar. But I can't find anything on this particular issue.

Does anyone have any tips on how to solve this?


Solution

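  • Simply convert the variable mapped to x to a factor as well, as you already did for the variable mapped to fill. A numeric variable is treated as continuous, which is why the x-axis shows a number scale instead of one label per bar.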

    Using a minimal reproducible example based on some fake random data:

    library(ggplot2)
    
    set.seed(123)
    # Fake data: one row per respondent, with 0/1 codes stored as numbers
    prep_data <- data.frame(
      sex_work_r = sample(0:1, 100, replace = TRUE),
      abuse = sample(0:1, 100, replace = TRUE)
    )
    
    ggplot(prep_data, aes(
      x = factor(sex_work_r, labels =  c("no", "yes")),
      fill = factor(abuse, labels = c("no", "yes"))
    )) +
      geom_bar(position = "fill") +
      labs(x = "Sex work", y = "proportion", fill = "Were you ever abused?")
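    As a variation on the above, here is a minimal sketch that does the conversion once in the data frame instead of inside aes(). It assumes the imported columns hold plain 0/1 codes where 0 means "no" and 1 means "yes"; the column names sex_work_f and abuse_f and the bar_shares table are made up for illustration. Passing levels explicitly ties each label to its code rather than relying on sort order. If the file was read with haven, haven::as_factor() may be another route, but that is not shown here.

```r
# Hypothetical stand-in for the imported data: 0/1 codes stored as numbers.
set.seed(123)
prep_data <- data.frame(
  sex_work_r = sample(0:1, 100, replace = TRUE),
  abuse      = sample(0:1, 100, replace = TRUE)
)

# Recode once, naming the levels explicitly so each label is tied to its code
# (assumption: 0 = "no", 1 = "yes").
prep_data$sex_work_f <- factor(prep_data$sex_work_r, levels = c(0, 1),
                               labels = c("no", "yes"))
prep_data$abuse_f <- factor(prep_data$abuse, levels = c(0, 1),
                            labels = c("no", "yes"))

# Within-bar shares, i.e. the proportions that position = "fill" stacks.
bar_shares <- prop.table(table(prep_data$sex_work_f, prep_data$abuse_f),
                         margin = 1)

# Draw the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(prep_data, aes(x = sex_work_f, fill = abuse_f)) +
    geom_bar(position = "fill") +
    labs(x = "Sex work", y = "proportion", fill = "Were you ever abused?")
  print(p)
}
```

    Because x is now a factor, ggplot2 uses a discrete axis with one labeled tick per bar, and the bars keep their full height. Printing bar_shares is a quick way to check the proportions against the plot.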