r datetime ggplot2 geom-segment

Changing axis tick marks for geom_segment() with time data


I have a dataset with measurements at multiple timepoints, recording both the expected time and the actual time each measurement took place. I'm trying to display the difference between the expected and the actual time for each subject. My issue is that the times on the x-axis of my plot are unreadable, and I'm getting an error when I try to manually set tick marks on the x-axis. This is my plot without setting the breaks: ![enter image description here](https://i.sstatic.net/X7Ll5.png)

I'm not sure what format I should be putting my breaks in. I'm guessing that it wants times, but nothing I do seems to work, and I don't really know what else to try. Is there a way I can just tell it to show fewer breaks?

Here is what I've tried and the errors I've gotten.

>  ggplot(df) + 
+    geom_segment(aes(x = exp, xend = act, y = subject_id, yend = subject_id, color = timef)) +
+    scale_x_continuous(name = "diff. in time", 
+                       breaks = c("14:30", "15:00", "15:30","16:00", "16:30","17:00","17:30"))
Error in `train_continuous()`:
! Discrete value supplied to a continuous scale

>  ggplot(df) + 
+    geom_segment(aes(x = exp, xend = act, y = subject_id, yend = subject_id, color = timef)) +
+    scale_x_continuous(name = "diff. in time", 
+                       breaks = strptime(c("14:30", "15:00", "15:30","16:00", "16:30","17:00","17:30"), "%H:%M"))
Error in `train_continuous()`:
! Discrete value supplied to a continuous scale
Run `rlang::last_trace()` to see where the error occurred.

>  ggplot(df) + 
+    geom_segment(aes(x = exp, xend = act, y = subject_id, yend = subject_id, color = timef)) +
+    scale_x_discrete(name = "diff. in time", 
+                       breaks = strptime(c("14:30", "15:00", "15:30","16:00", "16:30","17:00","17:30"), "%H:%M"))
Error in `mapped_discrete()`:
! Can't convert `x` <list> to <double>.
Run `rlang::last_trace()` to see where the error occurred

Here is an example of my data:

df <- structure(list(subject_id = c(102L, 166L, 123L, 152L, 173L, 151L, 
138L, 155L, 183L, 148L, 139L, 141L, 112L, 123L, 179L, 166L, 150L, 
183L, 155L, 172L), timef = c(5L, 5L, 10L, 2L, 2L, 2L, 3L, 7L, 
3L, 4L, 6L, 4L, 6L, 2L, 10L, 4L, 3L, 8L, 10L, 3L), exp = c("16:08", 
"15:31", "17:28", "15:08", "15:02", "15:08", "15:51", "16:16", 
"15:32", "16:16", "15:58", "15:28", "16:31", "15:39", "16:34", 
"15:21", "15:08", "16:37", "17:16", "15:38"), act = c("16:11", 
"15:28", "17:28", "15:07", "15:02", "15:06", "15:51", "16:14", 
"15:32", "16:17", "15:58", "15:27", "16:29", "15:39", "16:34", 
"15:21", "15:08", "16:37", "17:16", "15:41")), row.names = c("102.1.5", 
"166.2.5", "123.2.10", "152.1.2", "173.1.2", "151.2.2", "138.2.3", 
"155.1.7", "183.1.3", "148.2.4", "139.2.6", "141.2.4", "112.2.6", 
"123.1.2", "179.2.10", "166.1.4", "150.1.3", "183.1.8", "155.1.10", 
"172.1.3"), class = "data.frame")

Solution

  • R doesn't know that your character strings represent hours and minutes, so it treats them as categorical variables. Each distinct exp and act value gets its own label on the x axis, and the labels are simply ordered alphabetically.
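
    A quick check makes this visible. The sketch below uses a few sample strings in the same format as df$exp (not the full data) and only base R. It also hints at why the strptime() attempt fails in a different way: strptime() returns a POSIXlt object, which is stored as a list, whereas as.POSIXct() gives a plain numeric timestamp.

    ```r
    # Sample values in the same "HH:MM" format as df$exp (illustrative subset)
    times <- c("16:08", "15:31", "17:28")

    # Plain character vector: ggplot2 maps this to a discrete scale
    class(times)

    # strptime() yields POSIXlt, which is a list under the hood
    lt <- strptime(times, "%H:%M")
    class(lt)
    typeof(lt)

    # as.POSIXct() yields a numeric timestamp suitable for a datetime scale
    ct <- as.POSIXct(lt)
    class(ct)
    is.numeric(unclass(ct))
    ```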

    To fix this, you need to convert your times to date-times (the actual date doesn't matter, since it will never be displayed). Use scale_x_datetime and set date_labels = "%H:%M" so that only the times show.

    library(tidyverse)
    
    df %>%
      mutate(across(exp:act, ~as.POSIXct(paste(Sys.Date(), .x))),
             subject_id = factor(subject_id)) %>%
      ggplot() +
      geom_segment(aes(x = exp, xend = act, y = subject_id, 
                       yend = subject_id, color = timef), linewidth = 2) +
      scale_x_datetime("time", date_labels = "%H:%M", date_breaks = "30 min") +
      scale_color_viridis_c() +
      theme_minimal(base_size = 16)
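
    If you would rather pass explicit breaks, as attempted in the question, they need to be POSIXct values built on the same date as the data. A minimal sketch, assuming a three-row stand-in for df; the names df_small, to_time and my_breaks are introduced here purely for illustration:

    ```r
    library(ggplot2)

    # Small stand-in for the question's df (its first three rows)
    df_small <- data.frame(
      subject_id = factor(c(102, 166, 123)),
      timef      = c(5L, 5L, 10L),
      exp        = c("16:08", "15:31", "17:28"),
      act        = c("16:11", "15:28", "17:28")
    )

    # Illustrative helper: prepend today's date so "HH:MM" parses as POSIXct
    to_time <- function(x) {
      as.POSIXct(paste(Sys.Date(), x), format = "%Y-%m-%d %H:%M")
    }

    df_small$exp <- to_time(df_small$exp)
    df_small$act <- to_time(df_small$act)

    # Breaks are POSIXct as well, on the same (arbitrary) date as the data
    my_breaks <- to_time(c("15:00", "15:30", "16:00", "16:30", "17:00", "17:30"))

    p <- ggplot(df_small) +
      geom_segment(aes(x = exp, xend = act, y = subject_id,
                       yend = subject_id, color = timef), linewidth = 2) +
      scale_x_datetime("time", breaks = my_breaks, date_labels = "%H:%M")
    p
    ```

    For evenly spaced ticks, date_breaks = "30 min" is usually less work; explicit breaks are mainly useful when the spacing is irregular.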
    


    If I might make a suggestion, my guess is that you are more interested in the comparison of expected versus actual times than the absolute times, but that is impossible to see in your current plot. Here's an alternative:

    df %>%
      mutate(across(exp:act, ~as.POSIXct(paste(Sys.Date(), .x))),
             time_diff = as.numeric(difftime(act, exp, units = "mins")),
             subject_id = factor(subject_id)) %>%
      slice_head(n = 1, by = subject_id) %>%
      ggplot(aes(x = time_diff, y = subject_id, color = timef)) +
      geom_point() +
      geom_point(aes(x = 0)) +
      geom_segment(aes(xend = 0, yend = subject_id), linewidth = 2.3) +
      annotation_custom(grid::textGrob(c("Early", "Late"), c(0.25, 0.75), 0.9)) +
      scale_x_continuous("Actual time compared to expected time (minutes)",
                         breaks = seq(-5, 5)) +
      scale_color_viridis_c() +
      theme_minimal(base_size = 16)
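
    One detail of the time_diff step is worth spelling out: subtracting two POSIXct vectors with `-` yields a difftime whose units R chooses automatically from the data, while difftime() with an explicit units argument does not depend on that choice. A small base-R sketch with made-up timestamps (the date and the names exp_t, act_t and mins are arbitrary):

    ```r
    # Hypothetical timestamps; only the clock times matter here
    exp_t <- as.POSIXct(c("2024-01-01 16:08", "2024-01-01 15:31"), tz = "UTC")
    act_t <- as.POSIXct(c("2024-01-01 16:11", "2024-01-01 15:28"), tz = "UTC")

    # Units of a plain subtraction are picked by R based on the values
    auto_diff <- act_t - exp_t
    units(auto_diff)

    # Explicit units: always minutes, regardless of the data
    mins <- as.numeric(difftime(act_t, exp_t, units = "mins"))
    mins  # 3 -3 (positive = late, negative = early)
    ```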
    
