I have a dataframe of locations and dates across the US for which I want to retrieve the time of sunset. I used tz_lookup_coords
in the lutz package to define the timezone based on location, but when I feed this into bioRad sunrise
I get an error message that the tz value is invalid.
# load packages
library(lutz)
library(bioRad)
# create dataframe
data <- data.frame(id = 1:10,
date = c("2018-02-05", "2018-12-29", "2018-05-25", "2018-02-19",
"2017-02-09", "2017-10-05", "2018-02-18",
"2017-11-27", "2017-10-13", "2018-12-03"),
wgs_x = c(-105.12782, -73.70111, -119.89776, -157.94036,
-85.62744, -87.73867, -90.03440 , -97.39539,
-112.34498, -83.06242),
wgs_y = c(39.98948, 41.03264, 36.84011, 21.33720, 42.88368,
30.42648, 35.20090, 27.68490, 34.62111, 42.39886))
data$date <- as.Date(data$date)
# define timezones
data$timezone <- tz_lookup_coords(data$wgs_y, data$wgs_x, method = "accurate", warn = FALSE)
# define sunrise time
data$sunrise <- sunrise(date = data$date,
lon = data$wgs_x,
lat = data$wgs_y,
tz = data$timezone)
Without knowing exactly what you're seeing, I'll try to reproduce it based on what bioRad::sunrise does internally, namely as.POSIXct(date, tz).
data$timezone <- lutz::tz_lookup_coords(data$wgs_y, data$wgs_x, method = "accurate", warn = FALSE)
data$timezone
# [1] "America/Denver" "America/New_York" "America/Los_Angeles" "Pacific/Honolulu" "America/Detroit" "America/Chicago" "America/Chicago" "America/Chicago" "America/Phoenix" "America/Detroit"
So far so good. However, based on https://github.com/adokter/bioRad/blob/master/R/sunrise_sunset.R#L129, I'll try
as.POSIXct(data$date, tz = data$timezone)
# Error in strptime(xx, f, tz = tz) : invalid 'tz' value
(For future questions, it is often very helpful to include the actual error message. In this case, it would have pointed much faster to a call to base R functions like as.POSIXct or strptime, without slogging through other possible causes.)
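As a rough sketch of how to capture that message without bioRad or lutz installed, the snippet below passes a length-2 tz to as.POSIXct in base R and keeps the condition object. The name res is only illustrative, and I'm assuming the input is a character date, as in the reproduction above.

```r
# Pass two timezones at once and keep the error object instead of stopping.
res <- tryCatch(
  as.POSIXct("2018-02-05", tz = c("America/Denver", "America/New_York")),
  error = function(e) e
)

if (inherits(res, "error")) {
  # The message and the failing call are what is worth pasting into a question.
  cat("message:", conditionMessage(res), "\n")
  cat("call:   ", deparse(conditionCall(res)), "\n")
}
```

The failing call is what tells you the problem sits in base R rather than in the package.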
What is not well documented in ?as.POSIXct
tz: a character string. The time zone specification to be used
for the conversion, _if one is required_. System-specific
(see time zones), but ‘""’ is the current time zone, and
‘"GMT"’ is UTC (Universal Time, Coordinated). Invalid values
are most commonly treated as UTC, on some platforms with a
warning.
is that tz= must be length 1. This is because a vector of POSIXt values in R must all have the same timezone. That is, one cannot have two timestamps in one vector with different timezones; the "tzone" attribute is applied to the vector as a whole. (A column of a frame is just a vector.)
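A small base-R sketch of that point, using made-up timestamps (x and y are illustrative names, not from the question): changing "tzone" relabels the whole vector, while the stored instants stay the same.

```r
# One vector, one "tzone" attribute.
x <- as.POSIXct(c("2018-02-05 12:00:00", "2018-12-29 12:00:00"),
                tz = "America/Denver")
attr(x, "tzone")                 # a single string, not one per element

# Relabel a copy of the vector with a different timezone.
y <- x
attr(y, "tzone") <- "America/New_York"

all(as.numeric(x) == as.numeric(y))  # TRUE: same seconds since the epoch
format(x, usetz = TRUE)              # printed in Denver time
format(y, usetz = TRUE)              # same instants, printed in New York time
```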
To help prove this point,
as.POSIXct(data$date, tz = data$timezone[1])
# [1] "2018-02-05 MST" "2018-12-29 MST" "2018-05-25 MDT" "2018-02-19 MST" "2017-02-09 MST" "2017-10-05 MDT" "2018-02-18 MST" "2017-11-27 MST" "2017-10-13 MDT" "2018-12-03 MST"
Though, depending on your data, that might alter some time values: every row is then interpreted in that one timezone (here America/Denver) rather than in its own.
Another approach would be to convert each value in its own timezone, and then combine them. A first stab might use mapply, but this tends to strip the class:
mapply(as.POSIXct, data$date, tz = data$timezone)
# 2018-02-05 2018-12-29 2018-05-25 2018-02-19 2017-02-09 2017-10-05 2018-02-18 2017-11-27 2017-10-13 2018-12-03
# 1517814000 1546059600 1527231600 1519034400 1486616400 1507179600 1518933600 1511762400 1507878000 1543813200
We can fix that by using Map and do.call(c, ..):
do.call(c, Map(as.POSIXct, data$date, tz = data$timezone))
# 2018-02-05 2018-12-29 2018-05-25 2018-02-19 2017-02-09 2017-10-05 2018-02-18 2017-11-27 2017-10-13
# "2018-02-05 00:00:00 MST" "2018-12-28 22:00:00 MST" "2018-05-25 01:00:00 MDT" "2018-02-19 03:00:00 MST" "2017-02-08 22:00:00 MST" "2017-10-04 23:00:00 MDT" "2018-02-17 23:00:00 MST" "2017-11-26 23:00:00 MST" "2017-10-13 01:00:00 MDT"
# 2018-12-03
# "2018-12-02 22:00:00 MST"
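To see the class difference directly, here is a minimal sketch on two of the dates, assuming character input; dates, zones, stripped and kept are illustrative names.

```r
dates <- c("2018-02-05", "2018-12-29")
zones <- c("America/Denver", "America/New_York")

# mapply simplifies its result to a plain numeric vector, losing the class.
stripped <- mapply(as.POSIXct, dates, tz = zones)
class(stripped)   # "numeric"

# Map returns a list of POSIXct values; c() combines them and keeps the class.
kept <- do.call(c, Map(as.POSIXct, dates, tz = zones))
class(kept)       # "POSIXct" "POSIXt"

# Both hold the same seconds since the epoch.
all(unname(stripped) == as.numeric(kept))
```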
Note that these are genuinely different instants: each date was taken as midnight in its own timezone. Since, as I said earlier, all POSIXt values in a vector must share the same timezone, they are all printed in a single timezone (Mountain time in this output), which is why the time-of-day looks different for each.
Having said that ... each of these is still midnight on the original date in its own zone. If you convert each of those timestamps to the timezone returned by tz_lookup_coords, they go back to showing midnight:
do.call(c, Map(as.POSIXct, data$date, tz = data$timezone)) |>
Map(f = function(tm, tz) `attr<-`(tm, "tzone", tz), data$timezone)
# $`2018-02-05`
# [1] "2018-02-05 MST"
# $`2018-12-29`
# [1] "2018-12-29 EST"
# $`2018-05-25`
# [1] "2018-05-25 PDT"
# $`2018-02-19`
# [1] "2018-02-19 HST"
# $`2017-02-09`
# [1] "2017-02-09 EST"
# $`2017-10-05`
# [1] "2017-10-05 CDT"
# $`2018-02-18`
# [1] "2018-02-18 CST"
# $`2017-11-27`
# [1] "2017-11-27 CST"
# $`2017-10-13`
# [1] "2017-10-13 MST"
# $`2018-12-03`
# [1] "2018-12-03 EST"
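The result above is a list because a single vector cannot carry several timezones. If all you need is the local time as text, one option (a sketch on two of the dates, not something bioRad requires) is to keep the instants in one POSIXct vector and format each with its own zone. The names dates, zones, instants and local are illustrative.

```r
dates <- c("2018-02-05", "2018-12-29")
zones <- c("America/Denver", "America/New_York")

# One POSIXct vector of instants (local midnight in each row's zone).
instants <- do.call(c, Map(as.POSIXct, dates, tz = zones))

# Format each instant in its own timezone; the result is a character vector.
local <- vapply(
  seq_along(instants),
  function(i) format(instants[i], tz = zones[i], usetz = TRUE),
  character(1)
)
local
```

The character vector fits in an ordinary data frame column next to the timezone column, at the cost of no longer being a time class.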