r  function  replace  na

Replace the last non-NA entry of a row with the value from another column


I have a dataset like this:

  tdf <- data.frame(
    A = c(1, 2, 5, 4),
    B = c(NA, 3, 4, 5),
    C = c(NA, NA, NA, NA),
    D = c(NA, NA, NA, NA),
    E = c(10, 20, 30, 40))

I need to replace the last non-NA value in each row, reading from left to right, with the value from another column, say E, to give this:

  enddf <- data.frame(
    A = c(10, 2, 5, 4),
    B = c(NA, 20, 30, 40),
    C = c(NA, NA, NA, NA),
    D = c(NA, NA, NA, NA),
    E = c(10, 20, 30, 40))

I'm reading in multiple data frames with varying numbers of columns, so I'm thinking a function that works across all columns but the last one would do it?

I tried something like this, but I feel like I'm going the wrong direction:

  library(dplyr)

  df <- tdf %>%
    rowwise() %>%
    mutate(
      A = ifelse(is.na(A), A, ifelse(!is.na(B), C, A)),
      B = ifelse(!is.na(A) & is.na(B), C, B)
    )

Solution

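  • A base R approach that uses apply to scan each row for its first NA and overwrite the entry just before it with the value from the last column.

    wcol <- ncol(tdf)  # index of the last column (E), which supplies the values

    # \(x) is shorthand for function(x) and needs R >= 4.1
    data.frame(t(apply(tdf, 1, \(x) {
      # the entry just before the first NA gets the value from the last column
      x[which(is.na(x))[1] - 1] <- x[wcol]
      x
    })))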
    

    output

       A  B  C  D  E
    1 10 NA NA NA 10
    2  2 20 NA NA 20
    3  5 30 NA NA 30
    4  4 40 NA NA 40
    

    Since the values are all numeric, no type conversion occurs in the intermediate matrix that apply builds.
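
If a row can contain gaps (an NA followed by more values), the entry before the first NA is not necessarily the last non-NA one. The following is a minimal sketch of a reusable function for data frames of varying width. It assumes that every scanned column is numeric or all NA, and that the replacement column is the last one unless told otherwise. The names replace_last_non_na and src are hypothetical, not part of the answer above.

```r
# Hypothetical helper: for each row, overwrite the last non-NA entry among
# the scanned columns with the value from column `src` (an integer index).
replace_last_non_na <- function(df, src = ncol(df)) {
  # Scan every column except the one that supplies the replacement values.
  scan_cols <- setdiff(seq_len(ncol(df)), src)
  # Assumption: scanned columns are numeric (or all NA), so this is a numeric matrix.
  m <- as.matrix(df[scan_cols])

  # Per row: position of the last non-NA entry, or NA if the row is all NA.
  last_pos <- apply(m, 1, function(x) {
    idx <- which(!is.na(x))
    if (length(idx)) max(idx) else NA_integer_
  })

  # Matrix indexing with (row, column) pairs; rows that are all NA are skipped.
  ok <- !is.na(last_pos)
  m[cbind(which(ok), last_pos[ok])] <- df[[src]][ok]

  df[scan_cols] <- as.data.frame(m)
  df
}

tdf <- data.frame(
  A = c(1, 2, 5, 4),
  B = c(NA, 3, 4, 5),
  C = c(NA, NA, NA, NA),
  D = c(NA, NA, NA, NA),
  E = c(10, 20, 30, 40))

result <- replace_last_non_na(tdf)

# A row with a gap: the last non-NA entry is C, not A.
gap <- data.frame(A = 1, B = NA_real_, C = 3, D = NA_real_, E = 99)
gap_result <- replace_last_non_na(gap)
```

Because the function takes the column count from its argument, it can be applied to each data frame in a list with lapply. As with the apply version, all-NA columns such as C and D come back as numeric NA rather than logical NA.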