Suppose I want to declare a matrix mat
with ncol
columns, where ncol
is a function of some data foo
. I could accomplish this with the below data
block, which requires I pass Stan a data file with values for ncol
.
How can I accomplish this within my Stan program without including ncol
in my data file?
data{
vector[2] foo;
int ncol; // assume ncol = sum(foo)
matrix[2, ncol] mat;
}
Below I describe some approaches I've considered, but I'm not sure whether and why they (don't) work.
Operate-and-assign to compute ncol
within the data
block. However, the Stan Reference Manual states that assignment is not allowed in the data
block. Therefore, this shouldn't work:
data{
vector[2] foo;
int ncol = sum(foo); // illegal: assignment isn't allowed in data!
matrix[2, ncol] mat;
}
Assign in the transformed_data
block. My understanding is that I can't pass data directly to objects in the transformed_data
block, which means I wouldn't be able to pass data to mat
if I moved it there (see below). Similarly, I couldn't fix this by keeping mat
in data
because its dimensions would then depend on transformed_data
, which is illegal.
data{
vector[2] foo;
}
transformed_data{
int ncol = sum(foo);
matrix[2, ncol] mat; // can't read and assign to `mat` because it's not in `data`?
}
Compute but do not assign the value of ncol
in the declaration of mat
. I don't think this would work unless foo
is always read in first and am not sure if it's legal anyway.
data{
vector[2] foo;
matrix[2, sum(foo)] mat; // possible illegal in data block?
}
Approach 3 succeeds at this while all of the other approaches do not. It is legal because:
data
block can be used in subsequent declarations within the data
block;data
block. This is a bit subtle, since it does compute values used to declare an object.In the following, I show that Approach 3 succeeds at doing what I sought to do in the question while the others do not.
rstan
extra data# Code that does not be changed across runs
# Packages
library(rstan)
# Program file name
relpath <- "example_stan_program.stan"
# Function to run the Stan program with specified presets
run_example <- function(model_code, data){
stan(model_code = model_code,
data = data,
iter = 1, warmup = 0,
chains = 1, cores = 1,
refresh = 0,
algorithm = "Fixed_param",
verbose = F)
}
# Baseline Stan program
baseline <- "
data{
array[2] int foo;
int bar; // assumes bar = sum(foo)
matrix[2, bar] mat;
}
transformed data{
print(foo);
print(bar);
print(mat);
}
"
# Baseline data
dat_baseline <- list(
foo = c(1,1),
bar = sum(c(1,1)),
mat = matrix(1:4, nrow = 2, ncol = 2)
)
# Run program
out <- run_example(baseline, dat_baseline)
#> [1,1]
#> 2
#> [[1,3],[2,4]]
data
blockoperate <- "
data{
array[2] int foo;
matrix[2, sum(foo)] mat;
}
transformed data{
print(foo);
print(mat);
}
"
# Example Data
dat_operate <- list(
foo = c(1,1),
mat = matrix(1:4, nrow = 2, ncol = 2)
)
# Run program (we could also get bar in transformed_data if desired)
out <- run_example(operate, dat_operate)
#> [1,1]
#> [[1,3],[2,4]]
data
(error)operate <- "
data{
array[2] int foo;
int bar = sum(foo); // cannot assign in data block
matrix[2, bar] mat;
}
transformed data{
print(foo);
print(bar);
print(mat);
}
"
# Example Data
dat_operate <- list(
foo = c(1,1),
mat = matrix(1:4, nrow = 2, ncol = 2)
)
# Run program
out <- run_example(operate, dat_operate)
#> Error in stanc(file = file, model_code = model_code, model_name = model_name, : 0
#>
#> Syntax error in 'string', line 3, column 12 to column 15, parsing error:
#>
#> Cannot assign to variables in the `data` or `parameters` blocks; expected ';'
#> after variable declaration.
transformed data
(no error but empty matrix)operate_tr <- "
data{
array[2] int foo;
}
transformed data{
int bar = sum(foo);
matrix[2, bar] mat; // empty because values not read in data block
print(foo);
print(bar);
print(mat);
}
"
# Example Data
dat_operate_tr <- list(
foo = c(1,1),
mat = matrix(1:4, nrow = 2, ncol = 2)
)
# Run program
out <- run_example(operate_tr, dat_operate_tr)
#> [1,1]
#> 2
#> [[nan,nan],[nan,nan]]