Fill missing values using information from other date columns (R)


I have a data frame that looks similar to this:

set.seed(42)   
start <- Sys.Date() + sort(sample(1:10, 5))
set.seed(43)
end <-  Sys.Date() + sort(sample(1:10, 5))
end[4] <- NA
A <- c("10", "15", "NA", "4", "NA")
B  <- rpois(n = 5, lambda = 10) 
df <- data.frame(start, end, A, B)

When there is an NA in column A, I would like to calculate the hours between start and end. Nothing should happen when either start or end is NA.

I tried something like this:

df[, df$A [is.na(df[, df$A])]] <- difftime(df$end, df$start, units =  "hours")

but this gives me the error: undefined columns selected.

Does anyone have an idea? Thanks.


Solution

  • Create a logical index marking the NA values in column 'A', subset 'start' and 'end' with that index, compute the difftime, and assign the result back:

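    # Convert A to numeric; the string "NA" becomes a real NA
    df$A <- as.numeric(df$A)
    # Logical index of the rows where A is missing
    i1 <- is.na(df$A)
    # Hours from start to end (end first, so the result is end - start)
    df$A[i1] <- with(df, as.numeric(difftime(end[i1], start[i1], units = "hours")))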
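
As for the error in the question: `df[, df$A]` uses the values of A ("10", "15", ...) as column names, and no such columns exist, hence "undefined columns selected". Here is a minimal, self-contained sketch of the same idea that also states the "do nothing when start or end is NA" rule explicitly. The fixed dates and the name `fill` are assumptions for illustration only; they are not part of the original data.

```r
# Hypothetical fixed dates so the result is reproducible
# (the question builds its dates from Sys.Date()).
start <- as.Date(c("2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"))
end   <- as.Date(c("2024-01-02", "2024-01-04", "2024-01-06", NA, NA))
A     <- c("10", "15", "NA", "4", "NA")

# stringsAsFactors = FALSE keeps A as character on R versions before 4.0 too.
df <- data.frame(start, end, A, stringsAsFactors = FALSE)

# The string "NA" is converted to a real missing value.
df$A <- as.numeric(df$A)

# Fill only the rows where A is missing and both dates are present.
fill <- is.na(df$A) & !is.na(df$start) & !is.na(df$end)
df$A[fill] <- as.numeric(difftime(df$end[fill], df$start[fill], units = "hours"))

print(df$A)
```

With this sample data, row 3 is filled with 72 (three days in hours), while row 5 stays NA because its end date is missing. Without the explicit check, difftime would return NA for that row anyway, so the extra condition mainly makes the intent visible.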