rdplyr

Projections in R dataframe


I have a R dataframe with year, growth_rate and total for one year. I want to fill missing values by discounting using the growth_rate with increase with years after and decrease for years before. For instance, if the year with value is 2018, the value for 2019 is value for 2018 multiplied by 2018 growth rate, the value for 2020 is value for 2018 multiplied by 2018 growth rate and growth_rate value for 2019. Conversely, the value for 2017, would be the value of 2018 discounted by the grotwh_rate of 2017, the value for 2016 would be the value of 2018 discounted by the grotwh_rate of 2017 and growth_rate for 2016. Below is the example dataframe with the desired column being total_projection

Edited to include negative growth rates

# Example dataset
df <- data.frame(
  year = c(2016, 2017, 2018, 2019, 2021, 2022, 2023),
  growth_rate = c(0.0678912, 0.0526389, 0.099030, -0.830844, 0.830764, -0.656793, 1.253578),
  total = c(NA, NA, 1141262, NA, NA, NA, NA),
  total_projection = c(1139887, 1140661, 1141262, 1142392, 1132901, 1142312, 1134810)
)

Solution

  • One way would be to use the growth rates to determine an index, and then use the index from the known total to project out the others.

    Slight rounding delta, perhaps due to the known total value being rounded to begin with.

    df |>
      mutate(coef = cumprod(1+lag(growth_rate, 1, 0)/100), 
             total2 = coef / coef[year == 2018] * total[year == 2018])
    

    Result

      year growth_rate   total total_projection     coef  total2
    1 2016   0.0678912      NA          1139887 1.000000 1139888
    2 2017   0.0526389      NA          1140661 1.000679 1140662
    3 2018   0.0990300 1141262          1141262 1.001206 1141262
    4 2019   0.8308440      NA          1142392 1.002197 1142392
    5 2021   0.8307640      NA          1151884 1.010524 1151884
    6 2022   0.6567930      NA          1161453 1.018919 1161453
    7 2023   1.2535780      NA          1169081 1.025611 1169081