r timezone posixct mutate

issue with time zones in dataframe and POSIXct data


I have a bunch of date-times in different time zones, and I can build a data frame with the date-times and their zones as separate columns. It looks something like this:

library(dplyr)  # provides tibble() and mutate()

timedate <- c("2024-03-31T03:14:00", "2024-03-24T01:07:00", "2024-04-09T22:45:00")
zones <- c("UTC", "Australia/Brisbane", "Australia/Canberra")
df <- tibble(timedate, zones)
df
> A tibble: 3 × 2
> timedate            zones             
>   <chr>               <chr>             
> 1 2024-03-31T03:14:00 UTC               
> 2 2024-03-24T01:07:00 Australia/Brisbane
> 3 2024-04-09T22:45:00 Australia/Canberra

I know that the following works to create a column of POSIXct values:

df <- mutate(df,
             timedatezone = as.POSIXct(timedate, format = "%Y-%m-%dT%H:%M", tz = "UTC"))
df
> A tibble: 3 × 3
> timedate            zones              timedatezone       
>   <chr>               <chr>              <dttm>             
> 1 2024-03-31T03:14:00 UTC                2024-03-31 03:14:00
> 2 2024-03-24T01:07:00 Australia/Brisbane 2024-03-24 01:07:00
> 3 2024-04-09T22:45:00 Australia/Canberra 2024-04-09 22:45:00

but the following does not:

df <- mutate(df,
             timedatezone = as.POSIXct(timedate, format = "%Y-%m-%dT%H:%M", tz = zones))

> Error in `mutate()`:
> In argument: `timedatezone = as.POSIXct(timedate, format = "%Y-%m-%dT%H:%M", tz = zones)`.
> Caused by error in `strptime()`:
> invalid 'tz' value
> Run `rlang::last_trace()` to see where the error occurred.

I feel like I am missing something incredibly basic, but I have googled and googled and I cannot figure out a fix.

Ultimately, I would like either a column whose values each know which time zone they are in, or a column with all values in a single time zone (where I know what that time zone is).


Solution

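  • as.POSIXct() expects a single time zone string for tz, which is why passing the whole zones column fails. I suggest you use the datetimeoffset package: as_datetimeoffset() takes one time zone per element, and converting the result with as.POSIXct(tz = "UTC") gives a column in a single, known time zone.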

    df <- data.frame(timedate = c("2024-03-31T03:14:00", "2024-03-24T01:07:00", "2024-04-09T22:45:00"), 
                     zones = c("UTC", "Australia/Brisbane", "Australia/Canberra"))
    
    library(datetimeoffset)
    df$parsed <- as_datetimeoffset(df$timedate, tz = df$zones) |> as.POSIXct(tz = "UTC")
    df
    #             timedate              zones              parsed
    #1 2024-03-31T03:14:00                UTC 2024-03-31 03:14:00
    #2 2024-03-24T01:07:00 Australia/Brisbane 2024-03-23 15:07:00
    #3 2024-04-09T22:45:00 Australia/Canberra 2024-04-09 12:45:00
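
If adding a package is not an option, something similar can probably be done in base R. The following is only a minimal sketch: it assumes every entry in zones is a name your system's time zone database recognizes, and parse_one is a hypothetical helper name chosen for illustration. Because as.POSIXct() takes a single tz, the sketch parses each string with its own zone, one element at a time, and then stores the resulting instants in one UTC column.

```r
timedate <- c("2024-03-31T03:14:00", "2024-03-24T01:07:00", "2024-04-09T22:45:00")
zones <- c("UTC", "Australia/Brisbane", "Australia/Canberra")

# Hypothetical helper: parse one string in one zone and
# return the instant as seconds since 1970-01-01 UTC.
parse_one <- function(x, z) {
  as.numeric(as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S", tz = z))
}

# mapply() pairs each date-time with its own zone, so as.POSIXct()
# only ever sees a length-one tz.
secs <- mapply(parse_one, timedate, zones, USE.NAMES = FALSE)

# Rebuild a single POSIXct vector that prints in UTC.
parsed <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
parsed
```

To use this on the data frame, swap the two vectors for df$timedate and df$zones and assign the result to a new column; for valid inputs it should agree with the parsed column shown above. An unrecognized zone name may be treated as UTC (usually with a warning) on some platforms, so it can be worth checking zones against OlsonNames() first.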