rdplyrzoo

rolling weighted mean by group with zoo and dplyr


I need to calculate the moving, weighted average by group in R.

Below is a sample of my code:

df %>% 
arrange(Type, Year)
group_by(Type) %>% 
mutate(
"weighted.rolling.mean" = rollapplyr(
    df %>% select(Value, Area),
    width = 50,
    function(z){
      return(
        weighted_mean = weighted.mean(z[,"Value"],z[,"Area"], partial = TRUE)
      )
    },
    by.column = FALSE,
    align = "right",
    fill=NA
  )
)

Without the group_by argument, the code works fine.

If I group_by inside the rollapplyr function, I get the error below:

Error in `mutate()`:
ℹ In argument: `weighted.rolling.mean = rollapplyr(...)`.
ℹ In group 1: `Type = "Etageadskillelse"`.
Caused by error in `x * w`:
! non-numeric argument to binary operator
Backtrace:
  1. ... %>% ...
  9. zoo::rollapplyr(...)
 11. zoo:::rollapply.default(..., align = align)
 14. zoo:::rollapply.zoo(zoo(data), ...)
 15. base::mapply(...)
 16. zoo (local) `<fn>`(dots[[1L]][[50L]], dots[[2L]][[50L]], data = `<chr[,4]>`)
 17. FUN(data[posns, ], ...)
 19. stats:::weighted.mean.default(z[, "Value"], z[, "Area"], partial = TRUE)

Many thanks in advance!

Update

Here's an example of my dataset:

structure(list(Value = c(0.55, 0.14, 0.760367571281347, 0.25, 
0.4, 0.12, 0.4, 0.13, 0.14, 0.344161801501251, 1.06, 0.58, 0.45, 
0.68, 0.36, 0.19, 0.25, 0.68, 0.08, 0.14, 1.74386666666667, 0.3, 
0.350285714285714, 0.07, 0.14), Area = c(3.5, 11.2, 87.33, 133, 
16.67, 112.3, 44, 281, 121, 119.9, 77, 82, 33.48, 102.1, 98.07, 
121.53, 54, 2.15, 15.48, 136, 30, 31.5, 70, 144.15, 100), Type = c("Ydervæg", 
"Kvist", "Ydervæg", "Ydervæg", "Ydervæg", "Loft", "Skunk", 
"Loft", "Loft", "Ydervæg", "Ydervæg", "Etageadskillelse", "Terrændæk", 
"Ydervæg", "Ydervæg", "Loft", "Loft", "Ydervæg", "Terrændæk", 
"Loft", "Etageadskillelse", "Terrændæk", "Skunk", "Terrændæk", 
"Loft"), Year = c(1965L, 2011L, 1966L, 1929L, 1890L, 1937L, 1926L, 
1846L, 1965L, 1920L, 1963L, 1936L, 1947L, 1920L, 1973L, 1967L, 
1915L, 1814L, 1964L, 2005L, 1950L, 1933L, 1929L, 1874L, 1964L
)), row.names = c(NA, -25L), class = c("tbl_df", "tbl", "data.frame"
))

Solution

  • Use pick rather than select, partial is misplaced, ensure that the argument passed to FUN is a matrix, use rollapplyr with an r on the end to avoid the need for align = "right". The question has not provided enough data to use a width of 50 so we used a width of 3.

    df %>% 
      arrange(Type, Year) %>%
      group_by(Type) %>% 
      mutate("weighted.rolling.mean" = rollapplyr(
        pick(Value, Area),
        width = 3,
        FUN = \(z, m = matrix(z, ncol = 2)) weighted.mean(m[,1], m[,2]),
        by.column = FALSE,
        fill=NA,
        partial = TRUE
      )
    )
    

    giving

    # A tibble: 25 × 5
    # Groups:   Type [6]
       Value  Area Type              Year weighted.rolling.mean
       <dbl> <dbl> <chr>            <int>                 <dbl>
     1  0.58  82   Etageadskillelse  1936                 0.58 
     2  1.74  30   Etageadskillelse  1950                 0.892
     3  0.14  11.2 Kvist             2011                 0.14 
     4  0.13 281   Loft              1846                 0.13 
     5  0.25  54   Loft              1915                 0.149
     6  0.12 112.  Loft              1937                 0.142
     7  0.14 100   Loft              1964                 0.154
     8  0.14 121   Loft              1965                 0.133
     9  0.19 122.  Loft              1967                 0.158
    10  0.14 136   Loft              2005                 0.156
    # ℹ 15 more rows
    # ℹ Use `print(n = ...)` to see more rows