r, date, lubridate

Find dates that fail to parse in R Lubridate


As an R novice I'm pulling my hair out trying to debug cryptic R errors. I have a CSV containing 150k lines that I load into a data frame. I then pull out its character 'datetime' column and use lubridate to convert it to datetimes, in hopes of finding the min/max date.

  dates <- csv[c('datetime')]
  dates$datetime <- ymd_hms(dates$datetime)

Running this code gives the following warning:

Warning message:
3 failed to parse. 

I accept this as the CSV could have some janky dates in there and next run:

  min(dates$datetime)
  max(dates$datetime)

Both of these return NA, which I assume is from the few broken dates still stored in the data frame. I've searched around for a quick fix, and have even tried to build a foreach loop to identify the problem dates, but no luck. What would be a simple way to identify the 3 broken dates?

Example date format: 2015-06-17 17:10:16 +0000

Solution

  • Credit to LawyeR and Stibu in the comments above:

    1. I first sorted the raw CSV column and ran head() and tail() on it to find the 3 dates that were causing trouble.
    2. Alternatively, which(is.na(dates$datetime)) is a simple one-liner that returns the row numbers of the values that failed to parse.
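For anyone landing here later, the second approach can be sketched roughly as below. The sample vector `raw` is made up for illustration (it stands in for the original character column, e.g. csv$datetime), and it assumes lubridate is installed. Since the code in the question overwrites dates$datetime with the parsed values, the row numbers have to be looked up in the untouched character column to see the offending strings. The na.rm = TRUE part is not from the comments; it is just the standard base R argument for skipping NA values, which should let min() and max() return a real date even while the bad rows are still there.

```r
library(lubridate)

# Hypothetical sample standing in for the raw CSV column;
# the last two values are deliberately malformed.
raw <- c("2015-06-17 17:10:16 +0000",
         "2015-06-18 09:00:00 +0000",
         "not a date",
         "2015-13-45 00:00:00 +0000")

# Values that cannot be parsed become NA; quiet = TRUE only
# silences the "failed to parse" warning.
parsed <- ymd_hms(raw, quiet = TRUE)

# Row numbers that were present in the input but are NA after parsing.
bad <- which(is.na(parsed) & !is.na(raw))

# Show the original strings that failed.
print(raw[bad])

# Skip the NA values when looking for the earliest and latest date.
print(min(parsed, na.rm = TRUE))
print(max(parsed, na.rm = TRUE))
```

The extra !is.na(raw) check is only there so that values that were already missing in the CSV are not reported as parse failures.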