I have the string object below:
Obj = c('3122024', '04122023', '412024')
Basically, they are all dates in mdY format.
I tried to convert them to proper Date objects as below:
library(lubridate)
mdy(Obj)
### [1] NA "2023-04-12" NA
So only the second element is properly converted to a Date object.
Is there a function in R that can convert all elements to proper Date objects?
Thanks for your time.
Update. Many thanks to @r2evans:
Using lapply() instead of sapply() retains the original Date-class objects directly, so there is no need to convert numeric values back to dates using as.Date() with an origin.
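The difference can be seen in a small base-R sketch. The input strings and the variable names `via_sapply` and `via_lapply` are only illustrative and not part of the original code:

```r
x <- c("2024-12-03", "2023-12-04")

# sapply() simplifies the result to a plain numeric vector,
# so the Date class is dropped (values are days since 1970-01-01)
via_sapply <- sapply(x, as.Date, USE.NAMES = FALSE)
class(via_sapply)   # "numeric"

# lapply() keeps every element as a Date; c() then combines
# the list into a single Date vector
via_lapply <- do.call(c, lapply(x, as.Date))
class(via_lapply)   # "Date"
```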
library(lubridate)
library(stringr)
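my_parse_dates <- function(x) {
  # lapply() returns a list of Date objects, so the Date class is preserved
  dates <- lapply(x, function(date_str) {
    year <- str_sub(date_str, -4)           # last four characters
    remaining <- str_sub(date_str, 1, -5)   # day and month digits
    # Pad the day/month part to four digits (ddmm)
    day_month <- switch(as.character(nchar(remaining)),
      '2' = paste0('0', substr(remaining, 1, 1), '0', substr(remaining, 2, 2)),
      '3' = paste0('0', remaining),
      '4' = remaining,
      stop("Invalid format!"))
    full_date <- paste0(day_month, year)
    dmy(full_date)
  }) |> do.call(c, args = _)   # combine the list into one Date vector (needs R >= 4.2)
  unname(dates)
}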
Obj <- c('3122024', '04122023', '412024')
my_parse_dates(Obj)
First answer:
As @MrFlick already noted, the provided dates are ambiguous because of the missing leading zeros and inconsistent formatting.
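To make the ambiguity concrete, here is a small base-R sketch with a hypothetical seven-digit input (not one of the strings from the question). Both ways of restoring the missing zero give a valid date, so any fixed padding rule has to pick one of them:

```r
# Hypothetical input: seven digits, no separators
ambiguous <- "1112024"

# Reading A: one-digit day, two-digit month (pad the day)
reading_a <- paste0("0", ambiguous)                                     # "01112024"
# Reading B: two-digit day, one-digit month (pad the month)
reading_b <- paste0(substr(ambiguous, 1, 2), "0", substr(ambiguous, 3, 7))  # "11012024"

date_a <- as.Date(reading_a, format = "%d%m%Y")   # 2024-11-01
date_b <- as.Date(reading_b, format = "%d%m%Y")   # 2024-01-11
```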
If the same logic holds for every value in your data, a custom function can handle it.
First verify that the last four digits represent a year; if so, designate these as the year.
Next, count the remaining characters:
If there are only two characters, prepend a '0' to each digit.
If there are three characters, prepend a '0' only before the first digit.
If there are four characters, leave them unchanged.
Finally, interpret the complete eight-digit string using the day-month-year (dmy) format. If your strings are month-day-year, as stated in the question, swap dmy() for mdy().
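The padding steps above can be checked in isolation, before any date parsing. The helper name `pad_day_month` is hypothetical, and this is only a vectorized base-R sketch of the same rule, not a replacement for the function in this answer:

```r
# Hypothetical helper: pads the part before the year to four digits
# using the rule described above; it does not parse any dates.
pad_day_month <- function(x) {
  n <- nchar(x)
  year <- substr(x, n - 3, n)   # last four characters
  rest <- substr(x, 1, n - 4)   # everything before the year
  k <- nchar(rest)
  if (any(k < 2 | k > 4)) stop("Invalid format!")
  padded <- ifelse(
    k == 2, paste0("0", substr(rest, 1, 1), "0", substr(rest, 2, 2)),
    ifelse(k == 3, paste0("0", rest), rest)
  )
  paste0(padded, year)
}

padded <- pad_day_month(c("3122024", "04122023", "412024"))
padded
# [1] "03122024" "04122023" "04012024"
```

Looking at the padded strings first makes it easy to see which reading the rule picks for each input.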
library(lubridate)
library(stringr)
Obj = c('3122024', '04122023', '412024')
my_parse_dates <- function(x) {
  # sapply() simplifies the result to numeric day counts since 1970-01-01
  dates <- sapply(x, function(date_str) {
    year <- str_sub(date_str, -4)           # last four characters
    remaining <- str_sub(date_str, 1, -5)   # day and month digits
    # Pad the day/month part to four digits (ddmm)
    day_month <- switch(as.character(nchar(remaining)),
      '2' = paste0('0', substr(remaining, 1, 1), '0', substr(remaining, 2, 2)),
      '3' = paste0('0', remaining),
      '4' = remaining,
      stop("Invalid format!"))
    full_date <- paste0(day_month, year)
    dmy(full_date)
  }, USE.NAMES = FALSE)
  # Convert the day counts back to Date objects
  as.Date(dates, origin = "1970-01-01") |> unname()
}
my_parse_dates(Obj)
output:
[1] "2024-12-03" "2023-12-04" "2024-01-04"