I have the following dataset:
df <- data.frame(fractile = 1:10) %>% mutate(bracketaverage = fractile*100, topaverage = 10:19)
Which looks like:
fractile bracketaverage topaverage
1 1 100 10
2 2 200 11
3 3 300 12
4 4 400 13
5 5 500 14
6 6 600 15
7 7 700 16
8 8 800 17
9 9 900 18
10 10 1000 19
I wish to modify the topaverage column by utilizing input from the remaining columns according to the specified formula:
topaverage = ( (100-lag(fractile) ) * lag(topaverage) + ( lag(fractile) - fractile ) * bracket_average ) / (100-fractile)
Hence, the formula calculates the topaverage recursively, meaning that each iteration relies on the topaverage from the preceding step.
While many of purrr
's functions have func2
-versions for two arguments, two factors make it not easy to use them here: you need three variables, and you need both the lag and current version of at least one of them. Because of that, I think we can purrr::accumulate
over the row indices themselves.
library(dplyr)
library(purrr) # accumulate
df |>
mutate(
val = accumulate(
row_number(), .init = first(topaverage),
.f = function(ta, i) if (i == 1) ta else {
( (100 - fractile[i-1]) * topaverage[i-1] +
(fractile[i-1] - fractile[i]) * bracketaverage[i] ) /
(100 - fractile[i])
})[-1]
)
# fractile bracketaverage topaverage val
# 1 1 100 10 10.000000
# 2 2 200 11 8.061224
# 3 3 300 12 8.020619
# 4 4 400 13 7.958333
# 5 5 500 14 7.873684
# 6 6 600 15 7.765957
# 7 7 700 16 7.634409
# 8 8 800 17 7.478261
# 9 9 900 18 7.296703
# 10 10 1000 19 7.088889
where fractile[i-1]
is giving us lag(fractile)
and fractile[i]
is giving us the current-row fractile
.
I assume that the first topaverage
is retained unmodified. If that is not true, then this might need adjusting to account for aligning in the other direction ([i]
and [i+1]
instead, updating the leading conditional).