I have a dataset with four variables measuring respondents' views on different topics. I want to plot them in one stacked bar chart so you can compare the values between the different topics.
These are the first rows of the dataset:
lebanon <- structure(list(climate_change = c(
"Not a very serious problem",
"Not a very serious problem", NA, NA, "A very serious problem",
"A somewhat serious problem"
), air_quality = c(
"A somewhat serious problem",
"Not a very serious problem", NA, NA, "A very serious problem",
"A very serious problem"
), water_polution = c(
"A somewhat serious problem",
"Not a very serious problem", NA, NA, "A very serious problem",
"Not at all a serious problem"
), trash = c(
"A very serious problem",
"Not a very serious problem", NA, NA, "A very serious problem",
"A somewhat serious problem"
)), row.names = c(NA, -6L), class = "data.frame")
I did try with the following code based on this site:
lebanon %>%
filter(!is.na(climate_change), !is.na(air_quality), !is.na(water_polution), !is.na(trash)) %>%
gather(variable, value, climate_change:trash) %>%
ggplot(aes(x = variable, y = value, fill = value)) +
geom_bar(stat = "identity") +
coord_flip()
Getting this graph:
There are three problems with this graph.
1.) The bar graphs are not the same length.
2.) I don't know why there is something written at the location where the x-axis hits the y-axis. How do I remove this?
3.) I want to order the values so they make sense, so I ordered them beforehand with:
dataset$climate_change <- factor(dataset$climate_change, levels = c("Not at all a serious problem",
"Not a very serious problem",
"A somewhat serious problem",
"A very serious problem"))
dataset$air_quality <- factor(dataset$air_quality, levels = c("Not at all a serious problem",
"Not a very serious problem",
"A somewhat serious problem",
"A very serious problem"))
dataset$water_polution <- factor(dataset$water_polution, levels = c("Not at all a serious problem",
"Not a very serious problem",
"A somewhat serious problem",
"A very serious problem"))
Yet the values are still unordered. What am I doing wrong? Or is there a more effective way to make a multiple stacked bar chart?
The main issue with your code is that you mapped value, i.e. a factor variable, on y. Further, you can simply use drop_na instead of filter, and set the levels of value once after the gather instead of repeating it for each variable. (; Try this:
BTW: Please put your data into the post with dput(), e.g. dput(head(lebanon)). See my edit to your post. Took more time to clean and get the data right than answering the question. (;
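As a minimal sketch of why dput() helps: it prints R code that rebuilds the object, so others can paste it straight into their session. The data frame toy below is a hypothetical stand-in, not the survey data.

```r
# Hypothetical toy data frame standing in for the real survey data
toy <- data.frame(
  climate_change = c("A very serious problem", NA, "Not a very serious problem"),
  trash = c("A somewhat serious problem", NA, "A very serious problem"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; head() keeps the output short
dput(head(toy, 2))

# Round trip: evaluating the printed code rebuilds the data frame
rebuilt <- eval(parse(text = capture.output(dput(toy))))
```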
** EDIT ** To get the bars in the wanted order I make use of the forcats package. First I add_count the number of respondents who think the issue is "A very serious problem". Then I fct_reorder variable accordingly, i.e. by -n to get a descending order. To reverse the order of value I make use of fct_rev.
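To illustrate the two forcats helpers in isolation, here is a small sketch on made-up vectors (topic, n_serious and answer are hypothetical and not taken from the survey data):

```r
library(forcats)

# Hypothetical toy vectors, not the survey data
topic <- factor(c("air", "trash", "water"))
n_serious <- c(3, 1, 2)

# fct_reorder() sorts the levels of `topic` by the second argument
# (ascending by default); negating it gives a descending order
topic_desc <- fct_reorder(topic, -n_serious)
levels(topic_desc)
# "air" "water" "trash"

# fct_rev() reverses the existing level order of a factor
answer <- factor(c("low", "high", "mid"), levels = c("low", "mid", "high"))
answer_rev <- fct_rev(answer)
levels(answer_rev)
# "high" "mid" "low"
```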
lebanon <- structure(list(climate_change = c(
"Not a very serious problem",
"Not a very serious problem", NA, NA, "A very serious problem",
"A somewhat serious problem"
), air_quality = c(
"A somewhat serious problem",
"Not a very serious problem", NA, NA, "A very serious problem",
"A very serious problem"
), water_polution = c(
"A somewhat serious problem",
"Not a very serious problem", NA, NA, "A very serious problem",
"Not at all a serious problem"
), trash = c(
"A very serious problem",
"Not a very serious problem", NA, NA, "A very serious problem",
"A somewhat serious problem"
)), row.names = c(NA, -6L), class = "data.frame")
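library(tidyverse)

lebanon %>%
  # Drop rows with a missing answer in any column
  drop_na() %>%
  # Reshape to long format: one row per respondent and topic
  gather(variable, value, climate_change:trash) %>%
  # Count rows per topic, split by whether the answer is "A very serious problem"
  add_count(variable, value == "A very serious problem") %>%
  # Set the order of the answer categories once, after reshaping
  mutate(value = factor(value, levels = c("Not at all a serious problem",
                                          "Not a very serious problem",
                                          "A somewhat serious problem",
                                          "A very serious problem"))) %>%
  # geom_bar() counts the rows itself, so no y aesthetic is needed
  ggplot(aes(x = forcats::fct_reorder(variable, -n), fill = forcats::fct_rev(value))) +
  geom_bar() +
  coord_flip()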
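A possible variation, offered only as a sketch: tidyr marks gather() as superseded by pivot_longer(), and position = "fill" rescales every bar to the same length. That matters if missing answers are dropped per topic (drop_na(value)) rather than per respondent, which is an assumption that differs from the code above. The names lvls, lebanon_long and p are introduced here for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same example data as in the question
lebanon <- data.frame(
  climate_change = c("Not a very serious problem", "Not a very serious problem",
                     NA, NA, "A very serious problem", "A somewhat serious problem"),
  air_quality = c("A somewhat serious problem", "Not a very serious problem",
                  NA, NA, "A very serious problem", "A very serious problem"),
  water_polution = c("A somewhat serious problem", "Not a very serious problem",
                     NA, NA, "A very serious problem", "Not at all a serious problem"),
  trash = c("A very serious problem", "Not a very serious problem",
            NA, NA, "A very serious problem", "A somewhat serious problem")
)

# Answer categories in the intended order
lvls <- c("Not at all a serious problem", "Not a very serious problem",
          "A somewhat serious problem", "A very serious problem")

lebanon_long <- lebanon %>%
  # Long format: one row per respondent and topic
  pivot_longer(everything(), names_to = "variable", values_to = "value") %>%
  # Assumption: drop missing answers per topic instead of whole rows
  drop_na(value) %>%
  mutate(value = factor(value, levels = lvls))

# position = "fill" stacks shares, so every bar spans 0 to 100 %
p <- ggplot(lebanon_long, aes(x = variable, fill = value)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  labs(x = NULL, y = "Share of respondents", fill = NULL)
p
```

With shares instead of counts the topics stay comparable even if they have different numbers of valid answers; if absolute counts matter, leave out position = "fill".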