rzoorolling-computation

Create rolling windows with custom overlap


I'd like to analyse time-series data along multiple rolling windows. As a first step, I'd like to extract the start and end-times of said windows. Sadly, using zoo:rollapply seems to be missing a crucial parameter, namely the overlap, which seems to fixed at 1?

Example:

library(zoo)
a <- c(0:10)
output <- rollapply(a, 4, print)

output:

      [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    1    2    3    4
[3,]    2    3    4    5
[4,]    3    4    5    6
[5,]    4    5    6    7
[6,]    5    6    7    8
[7,]    6    7    8    9
[8,]    7    8    9   10

desired output (having an overlap of 2, for example):

     [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    2    3    4    5
[3,]    4    5    6    7
[4,]    6    7    8    9
[5,]    8    9    10    

any idea of how to get there?


Solution

  • A combination of by=2 for 2 steps per row and adding some NA's at the end makes it:

    > rollapply(c(v, NA, NA), width=4, FUN=`c`, by=2)
         [,1] [,2] [,3] [,4]
    [1,]    0    1    2    3
    [2,]    2    3    4    5
    [3,]    4    5    6    7
    [4,]    6    7    8    9
    [5,]    8    9   10   NA
    

    (Given that v <- 0:10 ).

    I thought fill=NA would make it, but it does not.

    c(v, NA, NA) the number of NA should be the number of by, eventually.

    If so, use the function:

    fill_up <- function(vec, by, default=NA) c(vec, rep(default, by))
    
    rolling_window <- function(vec, width, by, FUN=`c`, default=NA) {
      rollapply(fill_up(vec, by, default=default), width=width, by=by, FUN=FUN)
    }
    

    And call it by:

    rolling_window(vec=v, width=4, by=2)
    

    Value-Window

    If you want to have instead of the width a value window, then the output will be a list of vectors (because not any more a constant number of elements in one group/"row".

    I would write a new for that.

    require(zoo)
    
    v <- c(100, 200, 300, 500, 600, 900, 1000, 1200, 1300, 1500)
    
    rolling_absolute_window <- function(vec, window_width=300) {
      start_value <- vec[[1]]
      end_value <- vec[[length(vec)]]
      limits <- seq(from=start_value, to=end_value, by=window_width)
      limits_df <- rollapply(c(limits, Inf), width=2, FUN=`c`, by=1)
      lapply(as.data.frame(t(limits_df)), function(pair) {
        vec[pair[1] <= vec & vec <= pair[2]]
      })
    }
    
    rolling_absolute_window(v)
    
    ## this would return:
    
    $V1
    [1] 100 200 300
    
    $V2
    [1] 500 600
    
    $V3
    [1]  900 1000
    
    $V4
    [1] 1000 1200 1300
    
    $V5
    [1] 1300 1500
    
    rolling_absolute_window_excluding_last <- function(vec, window_width=300) {
      start_value <- vec[[1]]
      end_value <- vec[[length(vec)]]
      limits <- seq(from=start_value, to=end_value, by=window_width)
      limits_df <- rollapply(c(limits, Inf), width=2, FUN=`c`, by=1)
      lapply(as.data.frame(t(limits_df)), function(pair) {
        vec[pair[1] <= vec & vec < pair[2]]   # excludes last
      })
    }
    
    rolling_absolute_window_excluding_last(v)
    
    ## this returns:
    
    $V1
    [1] 100 200 300
    
    $V2
    [1] 500 600
    
    $V3
    [1] 900
    
    $V4
    [1] 1000 1200
    
    $V5
    [1] 1300 1500
    
    
    rolling_window <- function(vec, window_width=300) {
      result <- list()
      for (i in 1:length(vec)) {
        result[[i]] <- vec[ vec[i] <= vec & vec <= vec[i] + window_width]
      }
      result[-length(vec)]
    }
    
    rolling_window(v)
    
    ## this would return:
    
    [[1]]
    [1] 100 200 300
    
    [[2]]
    [1] 200 300 500
    
    [[3]]
    [1] 300 500 600
    
    [[4]]
    [1] 500 600
    
    [[5]]
    [1] 600 900
    
    [[6]]
    [1]  900 1000 1200
    
    [[7]]
    [1] 1000 1200 1300
    
    [[8]]
    [1] 1200 1300 1500
    
    [[9]]
    [1] 1300 1500
    
    
    
    rolling_window_excluding_last <- function(vec, window_width=300) {
      result <- list()
      for (i in 1:length(vec)) {
        result[[i]] <- vec[ vec[i] <= vec & vec < vec[i] + window_width]
      }
      result[-length(vec)]
    }
    
    rolling_window_excluding_last(v)
    
    ## which returns:
    
    [[1]]
    [1] 100 200 300
    
    [[2]]
    [1] 200 300
    
    [[3]]
    [1] 300 500
    
    [[4]]
    [1] 500 600
    
    [[5]]
    [1] 600
    
    [[6]]
    [1]  900 1000
    
    [[7]]
    [1] 1000 1200
    
    [[8]]
    [1] 1200 1300
    
    [[9]]
    [1] 1300 1500