I have a dataset with a number of date columns in Excel serial date format. I've managed to convert the dates to POSIXct using the following simple mutate:
myDataSet_wrangled <- myDataSet %>%
  mutate(startDate = as.POSIXct(as.numeric(startDate) * 3600 * 24, origin = "1899-12-30", tz = "GMT"))
However, when I try to refactor this as a function of the form convertDate(df, ...), I can't work out how to correctly indirect the column names. Frustratingly, the following code works with one column name, but when I pass multiple column names it fails with the error "Error in 'mutate()': ... ! object 'endDate' not found":
myDataSet <- data.frame(
  startDate = c(44197.924, 44258.363, 44320.634), # dates in Excel format
  endDate = c(44201.131, 44270.859, 44330.023)
)
convertXlDateToPOSIXct <- function(df, ..., epoch = "1899-12-30", timezone = "GMT") {
  cols <- enquos(...)
  df <- df %>%
    mutate(across(!!!cols, ~ as.POSIXct(as.numeric(.x) * 3600 * 24, origin = epoch, tz = timezone)))
  return(df)
}
# Call with one column
myDataSet_wrangled <- myDataSet %>%
  convertXlDateToPOSIXct(startDate)
# startDate correctly converted, no error thrown
# Call with multiple columns
myDataSet_wrangled <- myDataSet %>%
  convertXlDateToPOSIXct(startDate,
                         endDate)
# Error: object 'endDate' not found
I've tried various combinations of ..., enquos, ensyms, and !!!, but I think I'm fundamentally misunderstanding how data masking works in R.
The R documentation (topic-data-mask-programming {rlang}) notes that forwarding ... arguments requires no special syntax, and demonstrates that you can call e.g. group_by(...).
I hadn't been able to work out why this syntax wasn't working in the code above, but (with thanks to @lotus) I've realised that the real problem isn't that ... isn't properly enquo'd or ensym'd. It's that across() expects all of the columns in its single first argument (.cols), whereas forwarding ... passes each column as a separate argument. With two columns, the second one lands in the .fns position and is not treated as a column selection, hence "object 'endDate' not found". Wrapping ... in c() supplies the column names in the expected format:
convertXlDateToPOSIXct <- function(df, ..., epoch = "1899-12-30", timezone = "GMT") {
  df <- df %>%
    # c(...) collapses the forwarded columns into across()'s single .cols argument
    mutate(across(c(...), ~ as.POSIXct(as.numeric(.x) * 3600 * 24, origin = epoch, tz = timezone)))
  return(df)
}
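Because c(...) ends up as an ordinary tidy-select expression, the same function should also accept selection helpers, not just bare column names. The following is a minimal sketch under that assumption, reusing the sample data from the question; the ends_with("Date") call is only an illustration and relies on both columns sharing that suffix.

```r
library(dplyr)

# Sample data from the question: dates stored as Excel serial numbers
myDataSet <- data.frame(
  startDate = c(44197.924, 44258.363, 44320.634),
  endDate = c(44201.131, 44270.859, 44330.023)
)

# The c(...) version of the function, repeated here so the sketch runs on its own
convertXlDateToPOSIXct <- function(df, ..., epoch = "1899-12-30", timezone = "GMT") {
  df %>%
    mutate(across(c(...), ~ as.POSIXct(as.numeric(.x) * 3600 * 24, origin = epoch, tz = timezone)))
}

# Bare column names, forwarded through ...
byName <- myDataSet %>%
  convertXlDateToPOSIXct(startDate, endDate)

# A tidy-select helper, forwarded the same way
byHelper <- myDataSet %>%
  convertXlDateToPOSIXct(ends_with("Date"))

str(byName)
```

Both calls should convert the same two columns, since the helper and the bare names select the same set.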
Alternatively, without the enclosing c(), calling convertXlDateToPOSIXct(df, c(startDate, endDate)) would also work correctly, although in that case it would make more sense to use a named parameter, e.g. convertXlDateToPOSIXct <- function(df, cols, epoch = "1899-12-30", timezone = "GMT"), with cols embraced as across({{ cols }}, ...) in the body.
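For completeness, here is a rough sketch of what the named-parameter variant might look like. It assumes the column selection is passed through with the embrace operator, {{ cols }}, which is the usual way to forward a single data-masked or tidy-select argument. The function name is reused from above; treat this as one possible shape rather than the definitive one.

```r
library(dplyr)

# Sample data from the question: dates stored as Excel serial numbers
myDataSet <- data.frame(
  startDate = c(44197.924, 44258.363, 44320.634),
  endDate = c(44201.131, 44270.859, 44330.023)
)

# `cols` is a single tidy-select argument; {{ }} forwards it to across() unevaluated
convertXlDateToPOSIXct <- function(df, cols, epoch = "1899-12-30", timezone = "GMT") {
  df %>%
    mutate(across({{ cols }}, ~ as.POSIXct(as.numeric(.x) * 3600 * 24, origin = epoch, tz = timezone)))
}

# The caller now groups the columns explicitly with c()
myDataSet_wrangled <- myDataSet %>%
  convertXlDateToPOSIXct(c(startDate, endDate))

# A single column still works without c()
startOnly <- myDataSet %>%
  convertXlDateToPOSIXct(startDate)
```

The trade-off is that the caller has to write c() for more than one column, but the signature makes it explicit that one selection is expected, and it leaves ... free for other uses.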