I am new to Stan and I'm struggling to understand how the different variable declaration styles are used. In particular, I am confused about when I should put square brackets after the variable type and when I should put them after the variable name. For example, given
int<lower = 0> L; // length of my data
let's consider:
real N[L]; // my variable
versus
vector[L] N; // my variable
From what I understand, both declare a variable N as a vector of length L.
Is the only difference between the two that the first way specifies the variable type? Can they be used interchangeably? Should they belong to different parts of the Stan code (e.g., data vs. parameters or model)?
Thanks for explaining!
real name[size] and vector[size] name can be used pretty much interchangeably. They are stored differently internally, so you can get better performance with one or the other. Some operations are also restricted to one or the other (e.g. vector multiplication), and the optimal order to loop over them changes: a matrix is stored in column-major order and a 2-D array in row-major order, so the efficient loop order differs between the two. Those details will come up if you have a more specific example.
The way to read this is:
real name[size];
means name is an array of type real, so a bunch of reals that are stored together.
vector[size] name;
means that name is a vector of size size, which is also a bunch of reals stored together. But the vector data type in Stan is based on the Eigen C++ library, and that allows for other operations, such as linear algebra.
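As a rough sketch of what "other operations" means in practice, the program below declares the same data both ways and applies a few vector-only functions. The names v, a, vv, scaled, from_array and back are made up for illustration. It uses the bracket-after-name array syntax from this post; newer Stan releases spell the array declaration as array[L] real a instead, so adjust for your version.

```stan
data {
  int<lower = 0> L;   // length of my data
  vector[L] v;        // vector: supports linear algebra
  real a[L];          // array of reals (newer Stan: array[L] real a;)
}
transformed data {
  // Arithmetic and linear algebra are defined for vectors...
  real vv = dot_product(v, v);   // same value as v' * v
  vector[L] scaled = 2 * v;

  // ...but not for arrays: something like 2 * a would not compile.
  // Converting between the two containers is possible when needed.
  vector[L] from_array = to_vector(a);
  real back[L] = to_array_1d(v);
}
```

Both containers can be indexed element by element in the same way (v[1] and a[1]), so the choice mostly comes down to which functions you need to call on them.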
You can also create arrays of vectors like this:
vector[N] name[K];
which produces an array of K vectors of size N.
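A minimal sketch of how such an array of vectors is indexed, assuming N and K are passed in as data and using a standard normal prior chosen purely for illustration. Again this uses the post's declaration syntax; newer Stan releases write it as array[K] vector[N] name.

```stan
data {
  int<lower = 0> N;   // size of each vector
  int<lower = 0> K;   // number of vectors in the array
}
parameters {
  vector[N] name[K];  // newer Stan: array[K] vector[N] name;
}
model {
  for (k in 1:K) {
    // name[k] is a whole vector[N], so vectorized statements work on it;
    // name[k, n] would pick out a single real.
    name[k] ~ normal(0, 1);
  }
}
```

The array index comes first and the vector index second, so name[k] gives you one vector and name[k, n] one element of it.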
Bottom line: You can get any model running using either vector or arrays of real, but they are not necessarily equivalent in computational efficiency.
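To make the loop-order remark concrete, here is a hedged sketch with a matrix and a 2-D array of the same shape. The names R, C, m and a and the normal(0, 1) priors are assumptions for the example only. It relies on matrices being stored column-major and arrays row-major, and it keeps the post's array syntax (newer Stan: array[R, C] real a).

```stan
data {
  int<lower = 0> R;   // number of rows
  int<lower = 0> C;   // number of columns
}
parameters {
  matrix[R, C] m;     // column-major storage
  real a[R, C];       // row-major storage
}
model {
  // Matrix: keep the row index in the inner loop so that
  // consecutive accesses walk down one column.
  for (c in 1:C) {
    for (r in 1:R) {
      m[r, c] ~ normal(0, 1);
    }
  }
  // 2-D array: keep the column index in the inner loop so that
  // consecutive accesses walk along one row.
  for (r in 1:R) {
    for (c in 1:C) {
      a[r, c] ~ normal(0, 1);
    }
  }
}
```

Both loops define the same model; only the memory access pattern differs, and the difference will usually only matter for large containers.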