r ggplot2

Imputing zero in geom_area rather than NA


I've been having some trouble finding the words to describe the issue I am facing. Here is an example of what I want to see:

library(tidyr)
library(dplyr)
library(ggplot2)

df <- tribble(
  ~x, ~y, ~z,
  1,1,"a",
  2,2,"a",
  3,3,"a",
  1,4,"b",
  2,0,"b",
  3,6,"b",
  1,7,"c",
  2,8,"c",
  3,9,"c"
)

df %>% 
  ggplot(aes(x, y, fill = z)) +
  geom_area(position = "stack") +
  geom_line(position = "stack")

ggplot I do want

However, if I change the dataset by removing a record:

library(tidyr)
library(dplyr)
library(ggplot2)

df <- tribble(
  ~x, ~y, ~z,
  1,1,"a",
  2,2,"a",
  3,3,"a",
  1,4,"b",
  # 2,0,"b", << remove this record
  3,6,"b",
  1,7,"c",
  2,8,"c",
  3,9,"c"
)

df %>% 
  ggplot(aes(x, y, fill = z)) +
  geom_area(position = "stack") +
  geom_line(position = "stack")

Then I get this plot.

ggplot I don't want

I believe this is the intended behavior. My question is whether there is a way to get the first plot from the second dataset using ggplot2 only, without having to modify the data frame df. I am imagining some kind of coalesce functionality inside the geom.


Solution

  • To the best of my knowledge, ggplot2 offers no option to complete the data. But to be honest, I don't see the problem with reshaping the data first when, e.g., tidyr::complete offers an easy solution for this job:

    library(tidyr)
    library(dplyr)
    library(ggplot2)
    
    df %>%
      complete(x, z, fill = list(y = 0)) %>% 
      ggplot(aes(x, y, fill = z)) +
      geom_area(position = "stack") +
      geom_line(position = "stack")
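
If the goal is mainly to leave df itself untouched, one possible variation is to pass the completion step as a function to each layer's data argument, since ggplot2 layers accept a function that is applied to the plot data. This is only a sketch under that assumption: fill_gaps is a hypothetical helper name introduced here, and it still relies on tidyr::complete rather than on anything built into geom_area.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Second dataset from the question: the record (x = 2, z = "b") is missing
df <- tribble(
  ~x, ~y, ~z,
  1, 1, "a",
  2, 2, "a",
  3, 3, "a",
  1, 4, "b",
  3, 6, "b",
  1, 7, "c",
  2, 8, "c",
  3, 9, "c"
)

# Hypothetical helper: add the missing x/z combinations with y = 0
fill_gaps <- function(d) complete(d, x, z, fill = list(y = 0))

# Each layer applies fill_gaps() to the plot data; df is never modified
p <- ggplot(df, aes(x, y, fill = z)) +
  geom_area(data = fill_gaps, position = "stack") +
  geom_line(data = fill_gaps, position = "stack")

p
```

The completion still happens, just at draw time inside the layers, so df keeps its original rows and the same helper can be reused across plots.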