I have data that looks like this:
library(tidyverse)
df<-tibble(date = as.Date("2024-07-01"), starting_n = 100, growth_rate = 0.05)%>%
bind_rows(
tibble(date = seq.Date(as.Date("2024-08-01"), as.Date("2024-09-01"), by = "month"), growth_rate = 0.05))
# A tibble: 3 × 3
date starting_n growth_rate
<date> <dbl> <dbl>
1 2024-07-01 100 0.05
2 2024-08-01 NA 0.05
3 2024-09-01 NA 0.05
I am trying to calculate starting_n
based on the initial starting_n
and growth_rate
such that the growth rate is applied cumulatively down the column like this:
# A tibble: 3 × 4
date starting_n growth_rate starting_n_calc
<date> <dbl> <dbl> <dbl>
1 2024-07-01 100 0.05 NA
2 2024-08-01 NA 0.05 105
3 2024-09-01 NA 0.05 110.25
I have tried the following:
df%>%mutate(starting_n_calc=lag(first(starting_n)*growth_rate+first(starting_n)))
date starting_n growth_rate starting_n_calc
<date> <dbl> <dbl> <dbl>
1 2024-07-01 100 0.05 NA
2 2024-08-01 NA 0.05 105
3 2024-09-01 NA 0.05 105
As shown, it only produces the first calculation and fills down, rather than dynamically calculating down the column.
I have tried a few other combinations of lag
, first
, and cumsum
but nothing seems to work. Please help! Thanks.
You could do it this way with cumprod
and lag
:
df |> mutate(starting_n_calc = lag((starting_n[1] * cumprod(growth_rate+1))))
#> # A tibble: 3 × 4
#> date starting_n growth_rate starting_n_calc
#> <date> <dbl> <dbl> <dbl>
#> 1 2024-07-01 100 0.05 NA
#> 2 2024-08-01 NA 0.05 105
#> 3 2024-09-01 NA 0.05 110.