I am trying to run a Vector Autoregression model (VAR (1)) with heteroscedasticity in Stan. I can successfully run the model using JAGS but I don't know why Stan gives some errors when running the same model. Here is the data and model:
library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
#Simulating data:
T <- 100
alpha <- 1
gamma_1 <- 1
gamma_2 <- 0.4
sigma <- y <- rep(NA, length = T)
set.seed(123)
sigma[1] <- runif(1)
y[1] <- 0
for (t in 2:T) {
sigma[t] <- sqrt(gamma_1 + gamma_2 * (y[t - 1] - alpha)^2)
y[t] <- rnorm(1, mean = alpha, sd = sigma[t])
}
df <- data.frame(y1 = y, y2 = y, x = rnorm(100,0,1))
model_code <- "
data{
int<lower=1> T; //Time
int<lower=2> K; //location
matrix[T,K] y; //Target variable
vector[T] x; //Covariate
}
parameters {
vector[K] alpha; //Modelling mean: intercept
real<lower=0> sigma; //Modelling y: variance
matrix[K,K] theta; //AR(1) coefficient matrix
row_vector[K] mu_t1 ; //Initial values of the AR process
vector[K] beta; //Covariate's effect coefficient
}
transformed parameters {
matrix[T,K] epsilon; //Residuals (innovation)
matrix[T,K] mu; //Mean of the process
mu[1,] = mu_t1 ; //Initial values of the time-series
epsilon[1,] = y[1,] - mu[1,];
for(k in 1:K){
for (t in 2:T){
mu[t,k] = alpha[k] + theta[k,] * epsilon[t - 1,]' + beta[k] * x[t];
epsilon[t,k] = y[t,k] - mu[t,k] ;
}
}
}
model{
//priors
for(k in 1:K){
alpha[k] ~ normal(0,3);
beta[k] ~ normal(0,10);
theta[k,] ~ normal(0,1);
}
mu_t1 ~ normal(7,1) ;
sigma ~ normal(0, 5);
//Model likelihood
for(k in 1:K){
for (t in 1:T)
y[t,k] ~ normal(mu[t,k], sigma);
}
}
"
model_data <- list(
T = nrow(df),
K = 2,
x = df$x,
y = df[,1:2]
)
stan_run <- stan(
data = model_data,
model_code = model_code
)
When I run this code, Stan stops before starting the sampling saying:
Chain 2: Rejecting initial value: Chain 2: Error evaluating the log probability at the initial value. Chain 2: Exception: normal_lpdf: Location parameter is nan, but must be finite! (in 'model290f30a800bc_9a829e355b070cb7ca3039bdb9dcc780' at line 43)
I am not sure why it can't evaluate the log probability at the initial value. I don't see anything wrong with my inputs. Does anyone know what is going wrong with my code?
I used the print()
statement in the transformed parameters block to see which value is Nan (as it's stated in the error). So this is what I did:
"
transformed parameters {
matrix[T,K] epsilon; //Residuals (innovation)
matrix[T,K] mu; //Mean of the process
mu[1,] = mu_t1 ; //Initial values of the time-series
epsilon[1,] = y[1,] - mu[1,];
for(k in 1:K){
for (t in 2:T){
mu[t,k] = alpha[k] + theta[k,] * epsilon[t - 1,]' + beta[k] * x[t];
epsilon[t,k] = y[t,k] - mu[t,k] ;
}
}
print('mu =', mu);
print('epsilon =', epsilon);
print('theta =', theta);
print('beta =', beta);
}
"
And parameter mu was the culprit. I find my mistake to be defining the loops incorrectly in the transformed parameters block. The k loop should come after the t loop and replacing them fixes the error and the model runs with no problem.