I have a dataset like this:
tdf <- data.frame(
A = c(1, 2, 5, 4),
B = c(NA, 3, 4, 5),
C = c(NA, NA, NA, NA),
D = c(NA, NA, NA, NA),
E = c(10, 20, 30, 40))
And I need to replace the last non-NA value in each row, going from left to right, with the value from another column, say E, to give this:
enddf <- data.frame(
A = c(10, 2, 5, 4),
B = c(NA, 20, 30, 40),
C = c(NA, NA, NA, NA),
D = c(NA, NA, NA, NA),
E = c(10, 20, 30, 40))
I'm reading multiple dfs with varying numbers of columns, so I'm thinking a function that reads across all columns but the last one would work?
I tried something like this, but I feel like I'm going the wrong direction:
df <- tdf %>%
rowwise() %>%
mutate(
A = ifelse(is.na(A), A, ifelse(!is.na(B), C, A)),
B = ifelse(!is.na(A) & is.na(B), C, B)
)
A base R approach that looks for NA row-wise with apply.
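wcol <- ncol(tdf)  # index of the last column (E), which supplies the replacement
data.frame(t(apply(tdf, 1, \(x) {
  # the position just before the first NA holds the last non-NA value
  # (this relies on the NAs being trailing, as in the example data)
  x[which(is.na(x))[1] - 1] <- x[wcol]
  x
})))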
output
A B C D E
1 10 NA NA NA 10
2 2 20 NA NA 20
3 5 30 NA NA 30
4 4 40 NA NA 40
Since the values are all numeric (the all-NA columns are simply coerced to numeric NA), no unwanted type conversion occurs in the intermediate matrix that apply builds.
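If some rows might have no NA at all, or NAs that are not strictly trailing, a variant that looks for the last non-NA position directly may be safer. The following is only a sketch under assumptions: `replace_last_non_na` is a hypothetical helper name, the source column is assumed to be the last one, and all columns are assumed to be numeric (or all-NA).

```r
# Hypothetical helper, not part of the answer above: for each row, overwrite
# the last non-NA value among all columns except the final one with that
# row's value from the final column.
replace_last_non_na <- function(df) {
  src <- ncol(df)               # assumption: the source column is the last one
  m <- as.matrix(df[-src])      # candidate columns only, as a numeric matrix
  # Column position of the last non-NA per row; NA if the whole row is NA
  last <- apply(!is.na(m), 1, function(r) {
    if (any(r)) max(which(r)) else NA_integer_
  })
  ok <- !is.na(last)
  # Matrix indexing: one (row, column) pair for each row that has a target
  m[cbind(which(ok), last[ok])] <- df[[src]][ok]
  df[-src] <- as.data.frame(m)
  df
}

tdf <- data.frame(
  A = c(1, 2, 5, 4),
  B = c(NA, 3, 4, 5),
  C = c(NA, NA, NA, NA),
  D = c(NA, NA, NA, NA),
  E = c(10, 20, 30, 40))

res <- replace_last_non_na(tdf)
```

Because the helper only depends on `ncol(df)`, it could be applied to several data frames with different numbers of columns, for example with `lapply(list_of_dfs, replace_last_non_na)`, where `list_of_dfs` is a placeholder for your own list.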