r ggplot2 bar-chart percentage stacked

Force bars/rows in a stacked bar chart (ggplot showing percent, zero decimal places) to be of the same length?


I am new to R and I need to create a stacked bar chart which shows percentages within the bars (stacks) and goes from 0 to 100 percent. I have over 7000 rows and 12 vectors that take values from 1 through 5 (the value 9999 is to be excluded from the chart). For each vector I have calculated the percentage for each of the 5 values, so each bar in the chart should sum to 100 %.

The problem I encounter is this: Two of the vectors sum to 99 % and one vector sums to 101 %, while the rest sum to 100 as expected. The chart therefore has two bars/rows that are a bit shorter and one bar/row that is a bit longer than the rest of the bars. This problem disappears if I round to 1 or 2 decimal places, rather than asking for 0 decimal places (which, sadly, is what I have to do).
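The effect does not depend on ggplot: exact percentages always sum to 100, but each value is rounded on its own, so the rounded values can land on 99 or 101. A minimal base R sketch with made-up counts (not taken from the real data):

```r
# Made-up counts for illustration only; not from the original data.
counts_a <- c(2, 2, 2)
counts_b <- c(5, 5, 2)

# Round each share to 0 decimal places, as in the chart.
pct_a <- round(100 * counts_a / sum(counts_a))  # 33 33 33
pct_b <- round(100 * counts_b / sum(counts_b))  # 42 42 17

sum(pct_a)  # 99
sum(pct_b)  # 101
```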

Is there any way to force ggplot to show bars/rows of identical length (100 %)?

Any help is greatly appreciated!


EDIT!

I removed the two images of my data and the ggplot. I have produced a working example which simulates my data and shows how the bars/rows differ in length when I ask for 0 decimal places.

I am well aware that this issue could be solved by asking for 1 decimal place. My employer wants 0, which is why I hope to find a solution for this.

library(reprex)
library(ggplot2)
library(tidyverse)


# Creating some data

item1 <- floor(runif(7200, min=1, max=6))
item2 <- floor(runif(7200, min=1, max=6))
item3 <- floor(runif(7200, min=1, max=6))
item4 <- floor(runif(7200, min=1, max=6))
item5 <- floor(runif(7200, min=1, max=6))
item6 <- floor(runif(7200, min=1, max=6))
item7 <- floor(runif(7200, min=1, max=6))
item8 <- floor(runif(7200, min=1, max=6))
item9 <- floor(runif(7200, min=1, max=6))
item10 <- floor(runif(7200, min=1, max=6))
item11 <- floor(runif(7200, min=1, max=6))
item12 <- floor(runif(7200, min=1, max=6))


df <- data.frame(item1, item2, item3, item4, item5, item6, item7, item8, item9, item10, item11, item12)


# Drawing the ggplot with 0 (zero) decimal places

df %>%
    gather %>% 
    group_by(key, value) %>%
    tally %>%
    mutate(n = round(n/sum(n)*100,0)) %>%
    ggplot(aes(x=key, y=n, fill=as.factor(value))) + 
    geom_col(position = position_stack(reverse = TRUE)) +
    labs(title = "Some title", x = " ", y = "%", fill = " ") +
    geom_text(aes(label=n), position=position_stack(reverse = TRUE, vjust = 0.5), size = 3, colour = "white") +
    coord_flip() +
    theme_minimal() +
    # theme() must come after theme_minimal(), which would otherwise reset the legend position
    theme(legend.position="top")

Created on 2022-04-06 by the reprex package (v2.0.1)


Solution

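  • A very simple solution to your problem is to change from position = position_stack() to position = position_fill(). position_fill() rescales the segments of each bar so that they sum to 1, so every bar has the same length even when its rounded percentages sum to 99 or 101. The y axis then runs from 0 to 1, which is why it is labelled with percent from the scales package.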

    library(tidyverse)
    library(scales)
    
    # Creating some data
    df <- data.frame(matrix(floor(runif(12*7200, 1, 6)), ncol = 12, dimnames = list(NULL, paste0("item", 1:12))))
    
    # Drawing the ggplot with 0 (zero) decimal places
    df %>%
      gather %>% 
      group_by(key, value) %>%
      tally %>%
      mutate(n = round(n/sum(n) * 100, 0)) %>%
      ggplot(aes(x = key, y = n, fill = as.factor(value))) + 
      geom_col(position = position_fill(reverse = TRUE)) +
      labs(title = "Some title", x = " ", fill = " ") +
      scale_y_continuous(name = "%", labels = percent) +
      geom_text(aes(label = n), position = position_fill(reverse = TRUE, vjust = 0.5), size = 3, colour = "white") +
      coord_flip() +
      theme_minimal() +
      # theme() must come after theme_minimal(), which would otherwise reset the legend position
      theme(legend.position = "top")
    

    Created on 2022-04-06 by the reprex package (v2.0.1)
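With position_fill() the bars are equal, but the labels inside a bar can still add up to 99 or 101. If the printed numbers should also sum to exactly 100, one option is to adjust the rounding itself. The sketch below uses the largest remainder method, which is a different technique from the one in the answer above. The helper name round_to_100 and the counts are hypothetical, chosen only for illustration.

```r
# Hypothetical helper (not part of the original answer): round percentages
# to whole numbers that sum to exactly 100 (largest remainder method).
round_to_100 <- function(counts) {
  pct <- 100 * counts / sum(counts)    # exact percentages, sum to 100
  out <- floor(pct)                    # round every value down first
  leftover <- round(100 - sum(out))    # whole points still to distribute
  if (leftover > 0) {
    # Give one extra point to the values with the largest fractional parts.
    idx <- order(pct - out, decreasing = TRUE)[seq_len(leftover)]
    out[idx] <- out[idx] + 1
  }
  out
}

# Made-up counts for one item (7200 answers over 5 categories).
counts <- c(1480, 1490, 1500, 1465, 1265)

plain    <- round(100 * counts / sum(counts))  # 21 21 21 20 18, sums to 101
adjusted <- round_to_100(counts)               # 20 21 21 20 18, sums to 100
```

In the pipeline above, the rounding step could then presumably be written as mutate(n = round_to_100(n)). After tally the data is still grouped by key, so the helper would be applied per item, and position_stack() would give bars of exactly 100. The trade-off is that some labels differ by one point from their conventionally rounded value.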