I'm a freshman currently studying the LDA (Latent Dirichlet Allocation) model, and I've run into a problem.
How is the theta drawn from the alpha?
theta ~ Dir(alpha)
As far as I understand, theta is a vector of length K whose components represent the topic proportions in a document, and each document has its own theta. So, at the corpus level, alpha is still a single K-vector, whereas theta is an M (number of docs) by K (number of topics) matrix.
First question: Is what I described above correct?
Second question: If so, how can the different thetas (K-vectors), one per document, all be drawn from the same Dirichlet distribution?
First answer: Yes, you are exactly right.
Second answer: alpha is a K-vector, as you mentioned. Dir(alpha) is a probability distribution over K-vectors, so every sample we take from it is another K-vector, and independent samples generally differ from each other even though alpha stays fixed. The values themselves depend on the values of alpha, but the components of each sample are non-negative and sum to 1 (which is how they can be considered the proportions of all topics in one document). We sample once per document to obtain M vectors, and stacking them gives the MxK matrix theta.
The length of the vector we get from sampling the Dirichlet distribution depends on the length of its parameter, alpha.
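To make this concrete, here is a minimal sketch using NumPy's Dirichlet sampler. The sizes M = 4 and K = 3, the value 0.5 for every component of alpha, and the seed are assumptions picked only for illustration; they are not part of the LDA model itself.

```python
import numpy as np

# Assumed sizes and hyperparameter, chosen only for illustration.
M = 4                            # number of documents
K = 3                            # number of topics
alpha = np.full(K, 0.5)          # one shared K-vector for the whole corpus

rng = np.random.default_rng(0)   # seeded so the run is repeatable

# One independent draw per document: theta[d] ~ Dir(alpha).
theta = rng.dirichlet(alpha, size=M)

print(theta.shape)               # (M, K)
print(theta)                     # the rows differ from one another
print(theta.sum(axis=1))         # each row sums to 1 (up to rounding)

# The sample length follows the length of alpha: a 5-vector alpha
# produces a 5-vector sample.
longer = rng.dirichlet(np.ones(5))
print(longer.shape)
```

Every row of theta comes from the same distribution Dir(alpha), yet the rows are different because each one is a separate random draw.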