r, ggplot2, visualization, geom

Calendar plot with geom_segment()


I have the following dataset in R. I want a ggplot whose x-axis runs from 1 to 12 (January, February, ..., December) and whose y-axis runs from 1 to 6 (the num_months variable; in the example it only takes the values 1 and 6). I then want to use geom_segment() to draw each row as a segment from start_month to end_month, so that its length represents num_months. I want to facet horizontally by the variable year.

My main problems, so far, are:

  1. I want each segment to cover whole months: if the starting and ending month are both 5 (May), the segment should start at the beginning of May and end at the beginning of June (6).
  2. Several segments have the same duration (num_months), and I want them drawn in parallel so that they don't overlap and each one stays visible.
  3. I want each num_months value to look like its own panel, because the current plot is confusing. I drew boxes for this, but some segments fall outside the box for their num_months and extend past its edges, so the boxes don't enclose them.

library(tidyverse)  # read_csv(), the dplyr verbs, %>% and ggplot2

data <- read_csv("num_months,start_month_year,end_month_year,B1,B1_p,year,start_month,end_month
1,6,6,3.3571016788482666,0.007681768853217363,2021,5,5
1,8,8,2.548985481262207,0.007373321335762739,2021,7,7
1,10,10,2.139772415161133,0.03452971577644348,2021,9,9
1,12,12,2.165775775909424,0.07796278595924377,2021,11,11
1,13,13,1.9506219625473022,0.09215697646141052,2021,12,12
1,23,23,2.7839596271514893,0.011407249607145786,2022,10,10
1,25,25,2.220555543899536,0.06181173026561737,2022,12,12
6,6,11,0.9881601333618164,0.08719704300165176,2021,5,10
6,8,13,1.438501238822937,0.032221969217061996,2021,7,12
6,9,14,1.16400945186615,0.09187468141317368,2021,8,1
6,10,15,1.5834165811538696,0.03494146466255188,2021,9,2
6,11,16,1.294316291809082,0.09792502969503403,2021,10,3
6,12,17,1.4204859733581543,0.0546354204416275,2021,11,4
6,20,25,1.07038414478302,0.0722803920507431,2022,7,12") %>%
  mutate(
    end_month = ifelse(start_month == end_month, end_month + 1, end_month),
    end_month = ifelse(end_month > 12, 1, end_month)  # Wrap around to January if end_month exceeds 12
  ) %>%
  group_by(year, num_months) %>%
  mutate(
    y_pos = num_months + (row_number() - 1) * 0.2  # Adding a systematic offset to y position
  ) %>%
  ungroup()

# Create the boxes for num_months
boxes <- data %>%
  group_by(year, num_months) %>%
  summarise(
    ymin = min(y_pos) - 0.3,
    ymax = max(y_pos) + 0.3
  ) %>%
  ungroup()

# Create the ggplot
p <- ggplot(data) +
  geom_rect(data = boxes, aes(xmin = 0.5, xmax = 12.5, ymin = ymin, ymax = ymax), fill = NA, color = "grey") +
  geom_segment(aes(x = start_month, xend = end_month, y = y_pos, yend = y_pos, color = as.factor(num_months)), linewidth = 1) +
  scale_x_continuous(breaks = 1:12, limits = c(0.5, 12.5), labels = month.abb) +
  scale_y_continuous(breaks = 1:6, limits = c(0.5, 6.5), expand = expansion(mult = c(0.02, 0.1))) +  # Adjusting y-axis limits to accommodate offset
  facet_wrap(~ year) +
  labs(x = "Month", y = "Number of Months", color = "Number of Months") +
  theme_minimal() +
  theme(panel.spacing = unit(1, "lines"))  # Increase spacing between panels

print(p)

In the resulting plot, segments that last the same number of months overlap. Also, some segments cross into the box drawn for a different num_months.



Solution

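  • Here's my suggestion. The big changes are:

    1. I converted start_month and end_month to dates (start_dt and end_dt, with 2023 as a placeholder year) and used scale_x_date(), so each segment runs from the first day of its start month to the last day of its end month.
    2. I gave each segment its own row (yy = row_number() within each year and num_months), so segments with the same duration no longer overlap.
    3. I used facet_grid() with num_months in rows and year in columns, so each num_months gets a real panel instead of a hand-drawn box.
    4. Since theme_minimal() doesn't draw panel borders for its facets, I switched to theme_bw(), but you can of course customize the theming however you want.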
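
As a small illustration of how a month number can be stretched to cover the whole month, the sketch below applies the same ymd() and ceiling_date() calls used in the full code that follows to three made-up month pairs. The vectors start_month and end_month here are illustrative values rather than the question's data, and 2023 is only a placeholder year.

```r
library(lubridate)

# Illustrative month numbers (assumed values, not the full dataset)
start_month <- c(5, 7, 12)
end_month   <- c(5, 12, 12)

# First day of the start month, in a placeholder year
start_dt <- ymd(paste("2023", start_month, "01", sep = "-"))

# ceiling_date() rounds a Date up to the first day of the next month;
# subtracting 1 day gives the last day of end_month
end_dt <- ceiling_date(ymd(paste("2023", end_month, "01", sep = "-")), unit = "month") - 1

end_dt
# "2023-05-31" "2023-12-31" "2023-12-31"
```

A single-month span such as May to May therefore runs from 1 May to 31 May, which is what makes the segment fill the month on a date axis.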

    library(tidyverse)
    library(lubridate)  # for ymd() and ceiling_date()

    data <- read_csv("num_months,start_month_year,end_month_year,B1,B1_p,year,start_month,end_month
    1,6,6,3.3571016788482666,0.007681768853217363,2021,5,5
    1,8,8,2.548985481262207,0.007373321335762739,2021,7,7
    1,10,10,2.139772415161133,0.03452971577644348,2021,9,9
    1,12,12,2.165775775909424,0.07796278595924377,2021,11,11
    1,13,13,1.9506219625473022,0.09215697646141052,2021,12,12
    1,23,23,2.7839596271514893,0.011407249607145786,2022,10,10
    1,25,25,2.220555543899536,0.06181173026561737,2022,12,12
    6,6,11,0.9881601333618164,0.08719704300165176,2021,5,10
    6,8,13,1.438501238822937,0.032221969217061996,2021,7,12
    6,9,14,1.16400945186615,0.09187468141317368,2021,8,1
    6,10,15,1.5834165811538696,0.03494146466255188,2021,9,2
    6,11,16,1.294316291809082,0.09792502969503403,2021,10,3
    6,12,17,1.4204859733581543,0.0546354204416275,2021,11,4
    6,20,25,1.07038414478302,0.0722803920507431,2022,7,12") %>%
      mutate(
        # Put every month in the same placeholder year (2023) so it can go on a date axis
        start_dt = ymd(paste("2023", start_month, "01", sep = "-")),
        # Last day of end_month, so the segment covers the whole month
        end_dt = ceiling_date(ymd(paste("2023", end_month, "01", sep = "-")), unit = "month") - 1
      ) %>%
      mutate(
        yy = row_number(),  # one row per segment within each year / num_months group
        .by = c(year, num_months)
      )
    
    
    ggplot(data) +
      geom_segment(aes(x = start_dt, xend = end_dt, y = yy, yend = yy, color = factor(num_months)), linewidth = 1) +
      scale_x_date(
        date_labels = "%b", 
        date_breaks = "1 month",
        limits = ymd(c("2023-01-01", "2023-12-31")),
        expand = expansion(0, 0)
      ) +
      scale_y_continuous(labels = NULL) +
      facet_grid(rows = vars(num_months), cols = vars(year), space = "free_y", scales = "free_y") +
      labs(x = "Month", y = "Number of Months", color = "Number of Months") +
      theme_bw() +
      theme(
        panel.spacing = unit(1, "lines"),  # Increase spacing between panels
        panel.grid.major.y = element_blank(),
        axis.ticks.y = element_blank()
      )
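
Some rows in the data run past December (for example start_month 8 with end_month 1), so on a single placeholder year their end date falls before their start date. One possible way to handle this, sketched here under assumptions, is to split such a row into two pieces: one that ends in December and one that starts in January. The helpers month_start() and month_end(), the columns id and wraps, and the decision to move the spill-over piece into the next year's facet are all assumptions of this sketch, not part of the code above.

```r
library(dplyr)
library(lubridate)

# Two illustrative rows modelled on the question's data:
# one span inside the year, one wrapping into the next year
spans <- tibble(
  num_months  = c(6, 6),
  year        = c(2021, 2021),
  start_month = c(5, 8),
  end_month   = c(10, 1)
)

# Hypothetical helpers: first and last day of a month in the placeholder year 2023
month_start <- function(m) ymd(paste("2023", m, "01", sep = "-"))
month_end   <- function(m) ceiling_date(month_start(m), unit = "month") - 1

pieces <- spans %>%
  mutate(id = row_number(), wraps = end_month < start_month)

split_spans <- bind_rows(
  # Part in the starting year: cut off at December when the span wraps
  pieces %>% mutate(end_month = if_else(wraps, 12, end_month)),
  # Part that spills over: January up to the original end month, shown in the next year
  pieces %>% filter(wraps) %>% mutate(start_month = 1, year = year + 1)
) %>%
  mutate(start_dt = month_start(start_month), end_dt = month_end(end_month)) %>%
  arrange(id, year)
```

If this approach fits the intended reading of the data, split_spans could stand in for data in the plotting code, with yy computed after the split. Whether the spill-over belongs in the next year's facet is a judgment call that depends on what the plot is meant to show.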
    
