I have this panel data that varies according to countries, years and groups. A same country in the same year can have data for different groups, or not. The panel is not balanced, so it does not cover the same time span for each country or each group.
For simplicity, the data looks something like this:
library(tibble)
df <- tibble(
country = c("ARG", "ARG", "ARG", "ARG", "ARG", "ARG", "ARG", "ARG", "BRA", "BRA", "BRA", "BRA"),
group = c("A", "A", "A", "A", "A", "B", "B", "B", "A", "A", "A", "A"),
year = c(1990, 1991, 1992, 1993, 1994, 1992, 1993, 1994, 1990, 1991, 1992, 1993),
value = c(346, 573547, 534, 4645, 454, 3453, 64, 75, 346, 3465, 46, 876)
)
country group year value
<chr> <chr> <dbl> <dbl>
1 ARG A 1990 346
2 ARG A 1991 573547
3 ARG A 1992 534
4 ARG A 1993 4645
5 ARG A 1994 454
6 ARG B 1992 3453
7 ARG B 1993 64
8 ARG B 1994 75
9 BRA A 1990 346
10 BRA A 1991 3465
11 BRA A 1992 46
12 BRA A 1993 876
What I'm trying to do is get a new column with moving average of 2 periods for each group in each country using value data. How do I do this?
If you are just trying to compute the rolling average within a by group, you could do:
library(dplyr)
library(zoo)
df |>
mutate(ma = rollmean(value, 2, na.pad = TRUE, align = "right"),
.by = c(country, group))
Output
country group year value ma
<chr> <chr> <dbl> <dbl> <dbl>
1 ARG A 1990 346 NA
2 ARG A 1991 573547 286946.
3 ARG A 1992 534 287040.
4 ARG A 1993 4645 2590.
5 ARG A 1994 454 2550.
6 ARG B 1992 3453 NA
7 ARG B 1993 64 1758.
8 ARG B 1994 75 69.5
9 BRA A 1990 346 NA
10 BRA A 1991 3465 1906.
11 BRA A 1992 46 1756.
12 BRA A 1993 876 461