Example data:
test <- structure(list(date1 = structure(c(1632745800, 1632745800), tzone = "UTC", class = c("POSIXct",
"POSIXt")), date2 = structure(c(1641468180, 1641468180), tzone = "UTC", class = c("POSIXct",
"POSIXt"))), row.names = c(NA, -2L), class = c("tbl_df", "tbl",
"data.frame"))
Is there a reason why the output of difftime differs depending on whether the inputs are wrapped in as.character or not? For example:
library(tidyverse)
test <- structure(list(date1 = structure(c(1632745800, 1632745800),
tzone = "UTC", class = c("POSIXct", "POSIXt")),
date2 = structure(c(1641468180, 1641468180), tzone = "UTC", class = c("POSIXct", "POSIXt"))),
row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"))
test %>% mutate(date_diff = difftime(date2, date1, units = "days"),
date_diff2 = difftime(as.character(date2), as.character(date1), units = "days")) %>%
print.data.frame()
#> date1 date2 date_diff date_diff2
#> 1 2021-09-27 12:30:00 2022-01-06 11:23:00 100.9535 days 100.9951 days
#> 2 2021-09-27 12:30:00 2022-01-06 11:23:00 100.9535 days 100.9951 days
It only differs by ~0.04 in this case, but is there a reason why? And which one would be considered correct? Thank you!
The conversion to character is lossy because you lose the time zone information.
Your original datetimes are specified to be in UTC. If you use as.character() and reparse them, they get interpreted in your local time zone, where it seems one of the dates falls in daylight saving time and the other does not, resulting in an additional one-hour difference.
One hour is 1/24 of a day, about 0.0417, which matches the gap you see. The value computed on the original POSIXct columns (date_diff) is the correct one.
x <- as.POSIXct(1632745800, tz = "UTC")
y <- as.POSIXct(1641468180, tz = "UTC")
x
#> [1] "2021-09-27 12:30:00 UTC"
as.character(x)
#> [1] "2021-09-27 12:30:00"
as.POSIXct(as.character(x))
#> [1] "2021-09-27 12:30:00 BST"
as.POSIXct(as.character(y))
#> [1] "2022-01-06 11:23:00 GMT"
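If you do need to go through character for some reason, one way to avoid the shift is to state the time zone explicitly when reparsing. The following is a minimal sketch using the same two instants; the variable names (direct, roundtrip, via_tz) are only for illustration, and origin = "1970-01-01" is spelled out so the snippet also runs on older R versions.

```r
# Same instants as in the question, stored in UTC
x <- as.POSIXct(1632745800, origin = "1970-01-01", tz = "UTC")
y <- as.POSIXct(1641468180, origin = "1970-01-01", tz = "UTC")

# Reference: difference computed directly on the POSIXct values
direct <- difftime(y, x, units = "days")

# Round trip through character, but reparse explicitly as UTC
x_chr <- as.POSIXct(as.character(x), tz = "UTC")
y_chr <- as.POSIXct(as.character(y), tz = "UTC")
roundtrip <- difftime(y_chr, x_chr, units = "days")

# Alternatively, let difftime() do the parsing via its tz argument
via_tz <- difftime(as.character(y), as.character(x), tz = "UTC", units = "days")

# All three should agree, because no local DST offset is introduced
all.equal(as.numeric(direct), as.numeric(roundtrip))
all.equal(as.numeric(direct), as.numeric(via_tz))
```

In the mutate() call from the question, this would amount to adding tz = "UTC" to the second difftime() call, or simply dropping the as.character() wrappers.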