[SOLVED] In the LDA model, how are the multinomial parameters (theta) drawn from the Dirichlet prior parameter (alpha)?

I'm a freshman currently studying the LDA (Latent Dirichlet Allocation) model, and I've run into a problem.

How is theta drawn from alpha?

theta ~ Dir(alpha)

As far as I understand, theta is a vector of length K whose components represent the topic proportions in a document, and each document has its own theta. So at the corpus level, alpha is still a K-vector, whereas theta is an M (number of docs) by K (number of topics) matrix.

First question: Is what I wrote above correct?

Second question: If so, how can the different thetas (K-vectors), one per document, all be drawn from the same Dirichlet distribution?

Solution

First answer: Yes, you are exactly right.

Second answer: Alpha is a K-vector, as you mentioned. When we take a sample from the Dirichlet distribution Dir(alpha), we get another K-vector. The particular values depend on alpha, but the components are always non-negative and sum to 1, which is why they can be read as the proportions of the topics in one document. The thetas differ between documents because each one is an independent random draw from the same distribution. We sample once per document to obtain M vectors, and stacking them as rows gives the M x K matrix theta.
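To make the per-document sampling concrete, here is a minimal sketch using NumPy's Dirichlet sampler. The values of K, M, alpha and the seed are made up for illustration; they are not from any particular LDA implementation.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, for reproducibility

K = 4                     # number of topics (illustrative value)
M = 6                     # number of documents (illustrative value)
alpha = np.full(K, 0.5)   # one shared K-vector; a symmetric prior is assumed here

# One independent draw from Dir(alpha) per document, stacked as rows.
theta = rng.dirichlet(alpha, size=M)

print(theta.shape)        # (6, 4): M rows, K columns
print(theta.sum(axis=1))  # every row sums to 1, up to floating-point error
```

Each row of theta is a different topic-proportion vector, even though all rows come from the same Dir(alpha).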

The length of the vector we get from sampling the Dirichlet distribution depends on the length of its parameter, alpha.
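A quick way to check both points (the sample length follows the length of alpha, and the values depend on alpha) is the sketch below. It assumes NumPy, and the alpha values are arbitrary examples. It relies on the standard fact that the mean of Dir(alpha) is alpha divided by the sum of its components.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed

# The length of a sample equals the length of alpha.
lengths = {}
for K in (2, 5, 10):
    theta_d = rng.dirichlet(np.ones(K))  # one draw with a K-vector alpha
    lengths[K] = theta_d.shape[0]
print(lengths)  # {2: 2, 5: 5, 10: 10}

# The values depend on alpha: averaged over many draws, the samples
# approach alpha / alpha.sum().
alpha = np.array([8.0, 1.0, 1.0])        # illustrative, deliberately skewed
samples = rng.dirichlet(alpha, size=20000)
print(samples.mean(axis=0))              # close to the line below
print(alpha / alpha.sum())               # [0.8 0.1 0.1]
```

So a larger alpha component means that topic tends to get a larger share on average, while individual documents still vary around that mean.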