I'd like to analyse time-series data along multiple rolling windows. As a first step, I'd like to extract the start and end-times of said windows. Sadly, using zoo:rollapply
seems to be missing a crucial parameter, namely the overlap, which seems to fixed at 1
?
Example:
library(zoo)
a <- c(0:10)
output <- rollapply(a, 4, print)
output:
[,1] [,2] [,3] [,4]
[1,] 0 1 2 3
[2,] 1 2 3 4
[3,] 2 3 4 5
[4,] 3 4 5 6
[5,] 4 5 6 7
[6,] 5 6 7 8
[7,] 6 7 8 9
[8,] 7 8 9 10
desired output (having an overlap of 2, for example):
[,1] [,2] [,3] [,4]
[1,] 0 1 2 3
[2,] 2 3 4 5
[3,] 4 5 6 7
[4,] 6 7 8 9
[5,] 8 9 10
any idea of how to get there?
A combination of by=2
for 2 steps per row and adding some NA's at the end makes it:
> rollapply(c(v, NA, NA), width=4, FUN=`c`, by=2)
[,1] [,2] [,3] [,4]
[1,] 0 1 2 3
[2,] 2 3 4 5
[3,] 4 5 6 7
[4,] 6 7 8 9
[5,] 8 9 10 NA
(Given that v <- 0:10
).
I thought fill=NA
would make it, but it does not.
c(v, NA, NA)
the number of NA
should be the number of by
, eventually.
If so, use the function:
fill_up <- function(vec, by, default=NA) c(vec, rep(default, by))
rolling_window <- function(vec, width, by, FUN=`c`, default=NA) {
rollapply(fill_up(vec, by, default=default), width=width, by=by, FUN=FUN)
}
And call it by:
rolling_window(vec=v, width=4, by=2)
If you want to have instead of the width
a value window, then the output will be a list of vectors (because not any more a constant number of elements in one group/"row".
I would write a new for that.
require(zoo)
v <- c(100, 200, 300, 500, 600, 900, 1000, 1200, 1300, 1500)
rolling_absolute_window <- function(vec, window_width=300) {
start_value <- vec[[1]]
end_value <- vec[[length(vec)]]
limits <- seq(from=start_value, to=end_value, by=window_width)
limits_df <- rollapply(c(limits, Inf), width=2, FUN=`c`, by=1)
lapply(as.data.frame(t(limits_df)), function(pair) {
vec[pair[1] <= vec & vec <= pair[2]]
})
}
rolling_absolute_window(v)
## this would return:
$V1
[1] 100 200 300
$V2
[1] 500 600
$V3
[1] 900 1000
$V4
[1] 1000 1200 1300
$V5
[1] 1300 1500
rolling_absolute_window_excluding_last <- function(vec, window_width=300) {
start_value <- vec[[1]]
end_value <- vec[[length(vec)]]
limits <- seq(from=start_value, to=end_value, by=window_width)
limits_df <- rollapply(c(limits, Inf), width=2, FUN=`c`, by=1)
lapply(as.data.frame(t(limits_df)), function(pair) {
vec[pair[1] <= vec & vec < pair[2]] # excludes last
})
}
rolling_absolute_window_excluding_last(v)
## this returns:
$V1
[1] 100 200 300
$V2
[1] 500 600
$V3
[1] 900
$V4
[1] 1000 1200
$V5
[1] 1300 1500
rolling_window <- function(vec, window_width=300) {
result <- list()
for (i in 1:length(vec)) {
result[[i]] <- vec[ vec[i] <= vec & vec <= vec[i] + window_width]
}
result[-length(vec)]
}
rolling_window(v)
## this would return:
[[1]]
[1] 100 200 300
[[2]]
[1] 200 300 500
[[3]]
[1] 300 500 600
[[4]]
[1] 500 600
[[5]]
[1] 600 900
[[6]]
[1] 900 1000 1200
[[7]]
[1] 1000 1200 1300
[[8]]
[1] 1200 1300 1500
[[9]]
[1] 1300 1500
rolling_window_excluding_last <- function(vec, window_width=300) {
result <- list()
for (i in 1:length(vec)) {
result[[i]] <- vec[ vec[i] <= vec & vec < vec[i] + window_width]
}
result[-length(vec)]
}
rolling_window_excluding_last(v)
## which returns:
[[1]]
[1] 100 200 300
[[2]]
[1] 200 300
[[3]]
[1] 300 500
[[4]]
[1] 500 600
[[5]]
[1] 600
[[6]]
[1] 900 1000
[[7]]
[1] 1000 1200
[[8]]
[1] 1200 1300
[[9]]
[1] 1300 1500