My understanding is that to generate new data with an RBM I would need to pass in real data. Is there a way to get generated data without real data, like how VAEs and GANs sample a latent variable from a prior distribution to generate data?
If so, in the case of a labeled dataset like MNIST, how can I generate data from a specific class? Do I need to train 10 different RBM models, one per digit?
> My understanding is that to generate new data with an RBM I would need to pass in real data. Is there a way to get generated data without real data, like how VAEs and GANs sample a latent variable from a prior distribution to generate data?
Yes, of course. This is actually what happens in the negative phase of training: you sample from the joint distribution, letting the network "dream" of what it has been trained on. The details depend on your implementation, but I've been able to do this by initializing the visible units to zeros and running Gibbs sampling for a number of iterations. The result, as I interpret it, is that you should see "number-looking things" in the visible nodes, not necessarily digits from the training set.
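Here is a minimal sketch of that, assuming a trained binary RBM with a weight matrix `W` and bias vectors `b_v`, `b_h` (these names are placeholders for whatever your implementation exposes):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sample(W, b_v, b_h, n_steps=1000, rng=None):
    """Block Gibbs sampling from a trained binary RBM, with no clamped units.

    W:   (n_visible, n_hidden) weight matrix
    b_v: (n_visible,) visible biases
    b_h: (n_hidden,)  hidden biases
    """
    rng = rng or np.random.default_rng()
    v = np.zeros(W.shape[0])  # start the chain from an all-zero visible layer

    for _ in range(n_steps):
        # sample hidden units given the current visible units
        p_h = sigmoid(b_h + v @ W)
        h = (rng.random(p_h.shape) < p_h).astype(float)

        # sample visible units given the hidden units
        p_v = sigmoid(b_v + h @ W.T)
        v = (rng.random(p_v.shape) < p_v).astype(float)

    return p_v  # returning probabilities gives a smoother "image" than the binary sample
```

With `n_steps` large enough for the chain to mix, you can reshape the returned vector to 28x28 and plot it to see what the model "dreams" up.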
This is an example I like, from an RBM trained on MNIST, sampled without any nodes clamped:
To your second question:
> If so, in the case of a labeled dataset like MNIST, how can I generate data from a specific class? Do I need to train 10 different RBM models, one per digit?
You don't need 10 separate models. With labeled data, you can add the labels as extra visible nodes (e.g. a one-hot vector of 10 units for MNIST). To generate a specific digit, clamp those label units to the desired class and run Gibbs sampling over the remaining visible units only. See Figure 2 of "Training Restricted Boltzmann Machines: An Introduction" (Fischer and Igel).
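Roughly, a single-model approach could look like the sketch below. The layout `[pixels | one-hot label]` for the visible layer and all names are assumptions of mine, not something taken from the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_digit(W, b_v, b_h, digit, n_pixels=784, n_classes=10,
                 n_steps=1000, rng=None):
    """Gibbs sampling with the label units clamped to a one-hot class.

    The visible layer is assumed to be [pixels | one-hot label],
    so W has shape (n_pixels + n_classes, n_hidden).
    """
    rng = rng or np.random.default_rng()
    label = np.zeros(n_classes)
    label[digit] = 1.0                       # one-hot encoding of the desired class

    pixels = np.zeros(n_pixels)              # free visible units, start from zeros
    for _ in range(n_steps):
        v = np.concatenate([pixels, label])  # label units stay clamped every step
        p_h = sigmoid(b_h + v @ W)
        h = (rng.random(p_h.shape) < p_h).astype(float)

        p_v = sigmoid(b_v + h @ W.T)
        pixels = (rng.random(n_pixels) < p_v[:n_pixels]).astype(float)

    return p_v[:n_pixels]                    # only the pixel part of the visible layer
```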
Also, for both of these cases, I suspect that sampling techniques which gradually lower the sampling temperature (e.g. simulated annealing) will give you better results.
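One simple way this could look is to scale the logits of the conditionals by `1/T` and lower `T` over the course of the chain; the start/end temperatures and the linear schedule below are arbitrary choices of mine, not something I've tuned:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def annealed_gibbs(W, b_v, b_h, T_start=5.0, T_end=1.0, n_steps=1000, rng=None):
    """Gibbs sampling while lowering the temperature from T_start to T_end.

    At temperature T the conditionals use logits scaled by 1/T, so early
    high-temperature steps explore broadly and later steps settle into a mode.
    """
    rng = rng or np.random.default_rng()
    v = np.zeros(W.shape[0])
    temperatures = np.linspace(T_start, T_end, n_steps)

    for T in temperatures:
        p_h = sigmoid((b_h + v @ W) / T)
        h = (rng.random(p_h.shape) < p_h).astype(float)

        p_v = sigmoid((b_v + h @ W.T) / T)
        v = (rng.random(p_v.shape) < p_v).astype(float)

    return p_v
```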