r ggplot2 line missing-data

How can I draw line graphs in R when the data contain missing values?


Here is one dataset.

df=tibble::tibble(
  treatment = rep(c("A", "B", "C"), each = 51L),
  day = rep(seq(0, 50, by = 1), 3),
  mean = rep(
    c(
      0, NA, 6.75, NA, 24.75, NA, 35.75, NA, 0, NA, 0.75, NA, 2.5, NA, 4.5, NA, 0,
      NA, 0.5, NA, 3.25, NA, 3.75, NA
    ),
    rep(c(1L, 34L, 1L, 6L, 1L, 3L, 1L, 4L), 3)
  ),
  std_error = rep(
    c(
      0, NA, 1.887458608817687, NA, 5.02286438863988, NA, 8.178987304876646, NA, 0,
      NA, 0.4787135538781691, NA, 1.040832999733066, NA, 1.755942292142123, NA, 0,
      NA, 0.2886751345948129, NA, 1.436140661634507, NA, 1.25, NA
    ),
    rep(c(1L, 34L, 1L, 6L, 1L, 3L, 1L, 4L), 3)
  ),
)

I want to draw a line graph using the following code.

library(ggplot2)
ggplot(data=df, aes(x=as.numeric(day), y=mean, fill=treatment, shape=treatment, color=treatment)) + 
  geom_errorbar(aes(ymin=mean-std_error, ymax=mean+std_error), 
                position=position_dodge(0.7), width=3, color="grey35") +
  
  geom_line(aes(color=treatment), linewidth=0.5) +
  geom_point(size=5, stroke=1, color="white") +
  
  scale_shape_manual(values=c(21,22,24)) +
  scale_fill_manual(values=c("red","grey55", "grey15")) +
  scale_color_manual(values=c("red","grey55", "grey15")) +
  
  scale_x_continuous(breaks=seq(0,50,10),limits=c(0,50)) +
  scale_y_continuous(breaks=seq(0,50,10),limits=c(0,50)) +
  theme_classic(base_size=20, base_family="serif") +
  theme(legend.position=c(0.15,0.85),
        legend.title=element_blank(),
        legend.key=element_rect(color="white", fill="white"),
        legend.text=element_text(family="serif", face="plain",
                                 size=13, color= "Black"),
        legend.background=element_rect(fill=alpha(0.5)),
        axis.line=element_line(linewidth=0.5, colour="black"))+
  windows(width=5, height=5.5)

However, the lines are not drawn; only the points and error bars appear.


If I delete all missing values, I can draw a line graph.

library(dplyr)
df_na_trit= df %>% 
            filter(!is.na(mean))

I need to keep this data structure (one row per day from 0 to 50) because I will add more data for the other days later.

So how can I draw the line graph while keeping this data structure?

Thanks,


Solution

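  • geom_line() breaks a line wherever it meets an NA, and here every observed mean is separated from the next one by NA rows, so no segment is ever drawn. You can drop the NA rows on the fly by passing df %>% drop_na() as the data argument in your ggplot call; df itself keeps its full structure: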

    library(ggplot2)
    library(dplyr)
    library(tidyr)
    ggplot(data=df %>% drop_na(), aes(x=as.numeric(day), y=mean, fill=treatment, shape=treatment, color=treatment)) + 
      geom_errorbar(aes(ymin=mean-std_error, ymax=mean+std_error), 
                    position=position_dodge(0.7), width=3, color="grey35") +
      
      geom_line(aes(color=treatment), linewidth=0.5) +
      geom_point(size=5, stroke=1, color="white") +
      
      scale_shape_manual(values=c(21,22,24)) +
      scale_fill_manual(values=c("red","grey55", "grey15")) +
      scale_color_manual(values=c("red","grey55", "grey15")) +
      
      scale_x_continuous(breaks=seq(0,50,10),limits=c(0,50)) +
      scale_y_continuous(breaks=seq(0,50,10),limits=c(0,50)) +
      theme_classic(base_size=20, base_family="serif") +
      theme(legend.position=c(0.15,0.85),
            legend.title=element_blank(),
            legend.key=element_rect(color="white", fill="white"),
            legend.text=element_text(family="serif", face="plain",
                                     size=13, color= "Black"),
            legend.background=element_rect(fill=alpha(0.5)),
            axis.line=element_line(linewidth=0.5, colour="black"))+
      windows(width=5, height=5.5)
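
If you would rather leave the plot's data argument untouched, a single layer can also be given its own filtered data. The following is only a minimal sketch of that idea, not the method from the answer above: df_demo is a small made-up stand-in for df, and the styling from the question is left out. It relies on the fact that a layer's data argument may be a function, which ggplot2 calls with the plot data.

```r
library(ggplot2)

# Hypothetical stand-in for the question's df: a value on a few days, NA elsewhere.
df_demo <- data.frame(
  treatment = rep(c("A", "B"), each = 6),
  day       = rep(0:5, times = 2),
  mean      = c(0, NA, NA, 4, NA, 9,
                0, NA, 2, NA, NA, 5)
)

p <- ggplot(df_demo, aes(x = day, y = mean, color = treatment)) +
  # Only the line layer receives the NA-free rows; df_demo itself is unchanged.
  geom_line(data = function(d) d[!is.na(d$mean), ]) +
  # Points with NA are dropped anyway; na.rm = TRUE just silences the warning.
  geom_point(na.rm = TRUE)

print(p)
```

With the real df, the same idea would amount to changing only the geom_line() call and leaving the rest of the plot as it is.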