r date filtering subset

Filtering my data frame is erasing my data frame despite correct number of rows


I have a dataframe of transactions (roughly 76K rows). Each row has a column called START_DATE which is when the transaction started.

I am trying to filter down to transactions with START_DATE on or after 1/1/2023.

I am doing this:

str(A_23$START_DATE) ##IT STARTS OUT AS A CHR
A_23$START_DATE <- as.Date(A_23$START_DATE, format = "%Y-%m-%d")
str(A_23$START_DATE) ##THIS CONFIRMS IT IS A DATE
T_Date <- as.Date("2023-01-01", format = "%Y-%m-%d") ##USING A PLACEHOLDER VARIABLE TO VALIDATE STR()
str(T_Date) ##THIS CONFIRMS IT IS A DATE
A_23 <- A_23[A_23$START_DATE >= T_Date,] ##THIS CUTS THE DATAFRAME DOWN FROM 76K ROWS to 9K ROWS (IT WORKS)
head(A_23) ##EVERYTHING IS NA, THE ENTIRE FRAME

What am I doing wrong? Why does this turn all of my data into NA while R still reports a plausible number of rows?


Solution

  • Here is a possible solution:

    library(dplyr)
    library(lubridate)

    # Example data with day/month/year strings
    df <- data.frame(
      start_date = c("1/2/2021", "5/11/2020", "1/2/2021", "5/11/2020"),
      item = c("A", "B", "C", "D")
    )

    # Parse the strings into Date values; dmy() already returns a Date
    df <- df |>
      mutate(start_date = dmy(start_date))

    # Drop the rows dated 2021-02-01
    date_filter <- df |>
      filter(!start_date %in% ymd("2021-02-01"))

    date_filter

      start_date item
    1 2020-11-05    B
    2 2020-11-05    D

    Hope this helps!
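
For the symptom in the question (rows kept but every cell NA), one likely cause is that some or all START_DATE strings do not match the format passed to as.Date(), so they parse to NA, and a logical index containing NA returns all-NA rows instead of dropping them. The base R sketch below reproduces that on a toy frame and shows one way around it. The toy values, the AMOUNT column, and the month/day/year layout are assumptions for illustration only; the real data may use a different layout.

```r
# Hypothetical toy data: dates stored as month/day/year strings (an assumption).
A_23 <- data.frame(
  START_DATE = c("3/15/2022", "1/2/2023", "6/30/2023"),
  AMOUNT = c(10, 20, 30)
)
T_Date <- as.Date("2023-01-01")

# With a format that does not match the strings, every value parses to NA.
bad <- as.Date(A_23$START_DATE, format = "%Y-%m-%d")
sum(is.na(bad))                 # count of unparsed dates; worth checking first

# NA >= T_Date is NA, and an NA row index yields a row of NAs, not a dropped row.
na_rows <- A_23[bad >= T_Date, ]

# Parse with the format that matches the strings (assumed here: %m/%d/%Y).
A_23$START_DATE <- as.Date(A_23$START_DATE, format = "%m/%d/%Y")

# which() keeps only positions that are TRUE, so any remaining NA dates are dropped.
kept <- A_23[which(A_23$START_DATE >= T_Date), ]
kept
```

Checking sum(is.na(...)) right after the conversion is a quick way to confirm whether the format string matches the data. Wrapping the condition in which() is a design choice that silently drops rows with missing dates, so it may be worth inspecting those rows separately if they matter.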