I have a data frame of values across successive years (columns) for unique individuals (rows). A dummy data example is provided here:
dt = structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), `2015` = c(0,
0.8219178, 0, 0.1369863, 0, 1.369863, 0.2739726, 0.8219178, 5,
0), `2016` = c(0, 1.369863, 0, 0.2739726, 0, 0.2739726, 0, 3.2876712,
0, 0), `2017` = c(0.6849315, 0, 0, 0.6849315, 0, 0.5479452, 0,
0, 0, 0), `2018` = c(1.0958904, 0.5479452, 1.9178082, 0, 0, 0,
0, 0, 0, 3), `2019` = c(0, 0, 0, 1.0958904, 0, 0.9589041, 0.5479452,
0, 0, 0), `2020` = c(0.4383562, 0, 0, 0, 0.2739726, 0.6849315,
0, 0, 0, 0)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-10L))
I want to create a dataset where the maximum value for each individual that should appear for each year is 1. In cases where it exceeds this value, I want to carry the excess value over 1 into the next year (column) and sum it to that year's value for each individual and so on.
The expected result is:
dt_expected = structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), `2015` = c(0,
0.8219178, 0, 0.1369863, 0, 1, 0.2739726, 0.8219178, 1, 0), `2016` = c(0,
1, 0, 0.2739726, 0, 0.6438356, 0, 1, 1, 0), `2017` = c(0.6849315,
0.369863, 0, 0.6849315, 0, 0.5479452, 0, 1, 1, 0), `2018` = c(1,
0.5479452, 1, 0, 0, 0, 0, 1, 1, 1), `2019` = c(0.0958904, 0,
0.9178082, 1, 0, 0.9589041, 0.5479452, 0.2876712, 1, 1), `2020` = c(0.4383562,
0, 0, 0.0958904, 0.2739726, 0.6849315, 0, 0, 0, 1)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -10L))
I am at a total loss of where to start with this problem, so any assistance achieving this using data.table
would be greatly appreciated. My only thought is to use lapply
with an ifelse
function for the conditional component. Then should I be using rowSums
or Reduce
to achieve my outcome of shifting excess values across columns?
A translation of Martin Morgan's answer to data.table:
for (i in 2:(ncol(dt) - 1)) {
x = dt[[i]]
set(dt, j = i, value = pmin(x, 1))
set(dt, j = i + 1, value = dt[[i + 1L]] + pmax(x - 1, 0))
}