r date dummy-variable

Create a dummy for the last observation within each group


Sorry if my question is trivial. I have the following dataset, and I want to create a dummy variable that equals 1 on the row with the latest date for each unique product_id, and 0 otherwise. Thanks for your help!

df <- structure(list(product_id = c(10000567L, 10000123L, 10000567L, 10000222L, 10000123L, 10000222L)
                , Date = c("12-12-2020", "26-11-2020", "02-11-2020", "09-10-2020", "21-09-2020", "10-09-2020"))
                , class = "data.frame"
                , row.names = c(NA, -6L)
                ) 


Solution

  • If I've interpreted your question correctly, this may be one approach with dplyr (the .by argument needs dplyr 1.1.0 or later, and the native pipe |> needs R 4.1 or later):

    df1 <- structure(list(product_id = c(10000567L, 10000123L, 10000567L, 10000222L, 10000123L, 10000222L)
                         , Date = c("12-12-2020", "26-11-2020", "02-11-2020", "09-10-2020", "21-09-2020", "10-09-2020"))
                    , class = "data.frame"
                    , row.names = c(NA, -6L)
    ) 
    
    library(dplyr)
    
    df1 |>
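      mutate(
        # Parse the day-month-year strings as dates, then flag the row(s)
        # holding the latest date within each product_id
        last_date = ifelse(as.Date(Date, "%d-%m-%Y") == max(as.Date(Date, "%d-%m-%Y")), 1, 0),
        .by = product_id
      )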
    #>   product_id       Date last_date
    #> 1   10000567 12-12-2020         1
    #> 2   10000123 26-11-2020         1
    #> 3   10000567 02-11-2020         0
    #> 4   10000222 09-10-2020         1
    #> 5   10000123 21-09-2020         0
    #> 6   10000222 10-09-2020         0
    

    Created on 2023-06-23 with reprex v2.0.2
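
If you would rather not depend on dplyr, the same per-group comparison can probably be done in base R. The following is only a sketch of an alternative, not part of the answer above: it swaps mutate() for ave(), and the helper name parsed is just an illustrative choice. It assumes every Date string follows the day-month-year layout shown in the question.

```r
# Same data as in the question
df <- data.frame(
  product_id = c(10000567L, 10000123L, 10000567L, 10000222L, 10000123L, 10000222L),
  Date = c("12-12-2020", "26-11-2020", "02-11-2020", "09-10-2020", "21-09-2020", "10-09-2020")
)

# Parse the strings once; comparing the raw text would order by day first
# (assumption: all values use the dd-mm-yyyy layout)
parsed <- as.Date(df$Date, format = "%d-%m-%Y")

# ave() applies FUN within each product_id group and returns a vector
# aligned with the original rows. The dates are passed as plain numbers
# so the result is an ordinary numeric 0/1 column.
df$last_date <- ave(
  as.numeric(parsed), df$product_id,
  FUN = function(d) as.numeric(d == max(d))
)

df
```

This should reproduce the last_date column from the dplyr version. As with that version, if a product has two rows sharing its latest date, both rows are flagged with 1.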