I'm working with a nested dataframe using dplyr. I need to mutate a second nested column with a copy of the data where all numeric columns are replaced with a trailing rolling average of the last 5 rows.
How do I either tell zoo that I don't want the leading NAs or create a custom function that gives the desired output?
I've tried both `rollmean` with the `partial = TRUE` argument and `rollapply` with a custom function with `na.rm = TRUE` from the zoo package, but the first 4 rows are turned into NAs, which I don't want.
library(tidyverse)
library(zoo)

example <-
  tibble(
    "index" = c(rep(1, 5), rep(2, 5)),
    "data_a" = c(1:3, 1:2, 1:3, 1:2),
    "data_b" = c(2:4, 2:3, 2:4, 2:3)
  ) %>%
  group_by(index) %>%
  nest()

example_ra <- example %>%
  mutate(roll_mean = map(data, ~ mutate(.x, across(
    where(is.numeric),
    ~ rollmean(
      .,
      k = 5,
      fill = NA,
      partial = TRUE,
      align = "right"
    )
  ))))
My desired output (as a second list-column named roll_mean) is:
Input B | Input A |
---|---|
1 | 2 |
2 | 3 |
3 | 4 |
1 | 2 |
2 | 3 |
Output B | Output A |
---|---|
1 | 2 |
1.5 | 2.5 |
2 | 3 |
1.75 | 2.75 |
1.8 | 2.8 |
I get:
Output B | Output A |
---|---|
NA | NA |
NA | NA |
NA | NA |
NA | NA |
1.8 | 2.8 |
Thanks :)
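There are several problems:
- `rollmean` does not have a `partial=` argument. Use `rollapplyr` instead. Note the `r` on the end, which right-aligns the window so the `align = "right"` argument is not needed.
- The inner `~` lambda is nested inside the outer `~` passed to `map`; the code below uses an explicit `function(x)` for it instead.

With these changes: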
example_ra <- example %>%
  mutate(roll_mean = map(data, ~ mutate(.x, across(
    .cols = where(is.numeric),
    .fns = function(x) rollapplyr(x, width = 5, FUN = mean, partial = TRUE)
  ))))
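As a quick check, the same call can be run on a plain vector taken from the question's first input column. The sketch below also includes a base-R alternative for the "custom function" part of the question; `trailing_mean` is a hypothetical helper name, not part of zoo, and it assumes the input contains no NAs.

```r
library(zoo)

# First input column from the question
x <- c(1, 2, 3, 1, 2)

# partial = TRUE shrinks the window at the start instead of returning NA
zoo_result <- rollapplyr(x, width = 5, FUN = mean, partial = TRUE)

# Hypothetical base-R equivalent: mean of the current value and up to
# (width - 1) preceding values
trailing_mean <- function(x, width = 5) {
  vapply(
    seq_along(x),
    function(i) mean(x[max(1, i - width + 1):i]),
    numeric(1)
  )
}
base_result <- trailing_mean(x)

print(zoo_result)   # 1.00 1.50 2.00 1.75 1.80
print(base_result)  # 1.00 1.50 2.00 1.75 1.80
```

Either function could be passed as `.fns` inside `across()`; the zoo version is shorter, while the base-R one avoids the extra dependency.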