Applying a user-defined function to a rolling window with zoo


I am working on a research project, and I want to recreate a market efficiency measure I read about in an article. Since I am working with a large data set, I decided to automate the process in R. First, I defined a function that returns the standardized beta coefficients used in the measure, shown here with a reproducible example:

beta_hats = function(j) {
  step1 = ar(j, aic = TRUE)$asy.var.coef
  step2 = ar(j, aic = TRUE)$ar
  step3 = chol(step1)
  step4 = t(step3)
  step5 = solve(step4)
  step6 = step5 %*% step2
  step7 = abs(step6)
  step8 = sum(step7)
  return(step8)
}

repro = data.frame(rnorm(3000, 0.0003563425, 0.0216025))
beta_hats(repro)

> beta_hats(repro)
[1] 1.587869

This generates the desired outcome for the entire data set. However, I want my measure to be time-varying, so I attempted to apply the function over rolling windows:

y = repro
t = 250
library(zoo)
z = rollapplyr(y, t, function(y) beta_hats(y))

Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), :
'data' must be of a vector type, was 'NULL'

At this point the function no longer works. Can anyone help me solve this issue?

Solution

  • This error is produced when you attempt a Cholesky decomposition of a NULL object:

    chol(NULL)
    Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),  : 
      'data' must be of a vector type, was 'NULL'
    

    This shows that the problem lies in your data rather than in the rollapply function. Try regenerating the data and calling your function on it again. The asymptotic-theory variance matrix of the coefficient estimates (asy.var.coef) seems to be NULL. Note that ar() only returns this matrix when the selected order is greater than 0.

    For example:

    set.seed(1)
    repro = data.frame(a=rnorm(3000, 0.0003563425, 0.0216025))
    ar(repro$a, aic =TRUE)$order
    [1] 0
    

    Since the order is 0, the asymptotic-theory variance matrix computed in step1 is NULL for this dataset:

     ar(repro$a, aic =TRUE)$asy.var.coef
     [1] NULL
    

    Hence step3 of your function throws the error you see. You need to run your function on data for which ar() selects an order greater than 0.

    Also note that even if the function does not throw an error on the full dataset, it may still throw one on a subset (such as a rolling window), for the reasons stated above.
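
One way to keep the rolling computation from stopping is to guard against the order-0 case inside the function. The following is only a minimal sketch: the name beta_hats_safe is hypothetical, and returning NA for windows where ar() selects order 0 is an assumption about what you want to happen there, not part of the original measure. It also fits the model once per window instead of twice.

```r
library(zoo)

# Hypothetical guarded variant of beta_hats(): returns NA instead of
# failing when ar() selects order 0 (asy.var.coef is NULL in that case).
beta_hats_safe <- function(j) {
  fit <- ar(as.numeric(j), aic = TRUE)    # fit once and reuse the result
  if (fit$order == 0) return(NA_real_)    # assumption: mark such windows as NA
  u <- chol(fit$asy.var.coef)             # upper-triangular Cholesky factor
  sum(abs(solve(t(u)) %*% fit$ar))        # same steps 4 to 8 as beta_hats()
}

set.seed(1)
repro <- data.frame(a = rnorm(3000, 0.0003563425, 0.0216025))

# Right-aligned rolling windows of 250 observations over the numeric column
z <- rollapplyr(repro$a, 250, beta_hats_safe)
```

With white-noise data like this, many windows may come back as NA, because AIC often selects order 0 when there is no autocorrelation to model. Whether NA, 0, or something else is the right value for those windows depends on how the measure is defined in the article.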