machine-learning generative-adversarial-network rbm

Generating data from restricted Boltzmann machine


My understanding is that to generate new data from an RBM I need to pass in real data. Is there a way to generate data without real data, the way a VAE or GAN samples a latent variable from a prior distribution?

If so, for a labeled dataset like MNIST, how can I generate data from a specific class? Do I need to train 10 separate RBMs, one per digit?


Solution

  • My understanding is that to generate new data from an RBM I need to pass in real data. Is there a way to generate data without real data, the way a VAE or GAN samples a latent variable from a prior distribution?

    Yes. This is what happens in the negative phase of training: you sample from the model's joint distribution over visible and hidden units, letting the network "dream" of what it has been trained on. The details depend on your implementation, but I have done this by initializing the visible units to zeros and running Gibbs sampling for a few iterations. You should then see "number-looking things" in the visible nodes, not necessarily digits from your dataset.

    This is an example I like, trained on MNIST, and sampled without any nodes clamped:

    [Image: sample generated from an RBM trained on MNIST. It looks like a question mark, even though no such character exists in the training set.]

    To your second question:

    If so, for a labeled dataset like MNIST, how can I generate data from a specific class? Do I need to train 10 separate RBMs, one per digit?

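    With labeled data, you can add the labels as additional visible nodes (for example, ten one-hot units for MNIST), so a single RBM is enough. To generate a specific class, clamp the label units to that class and sample the remaining visible units. See Figure 2 of "Training Restricted Boltzmann Machines: An Introduction".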

    Also, in both cases I suspect that sampling techniques that gradually lower the sampling temperature (e.g. simulated annealing) will give you better results.
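
The three ideas in the answer (unclamped sampling from zeros, clamped label units, and a decreasing temperature) can be illustrated with one small NumPy sketch. Everything named here is an assumption made for illustration, not the answerer's code: the function `gibbs_sample`, the layout with 784 pixel units followed by 10 label units, the 64 hidden units, and the temperature schedule from 4.0 down to 1.0. The weights are random placeholders standing in for a trained RBM, so the output will look like noise until real trained parameters are substituted.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gibbs_sample(W, b, c, rng, n_steps=200, temperatures=None,
                 clamp_mask=None, clamp_values=None):
    """Block Gibbs sampling in a binary RBM; returns the final visible state.

    W: (n_visible, n_hidden) weights; b: visible biases; c: hidden biases.
    temperatures: optional 1-D sequence, one temperature per step
        (overrides n_steps). T = 1 everywhere is plain Gibbs sampling.
    clamp_mask / clamp_values: optional boolean mask over visible units and
        the values those units are held at (e.g. a one-hot label).
    """
    if temperatures is None:
        temperatures = np.ones(n_steps)
    # Start from all zeros, so no real data is needed.
    v = np.zeros(W.shape[0])
    if clamp_mask is not None:
        v[clamp_mask] = clamp_values
    for T in temperatures:
        # p(h = 1 | v) at temperature T, then sample the hidden units.
        p_h = sigmoid((c + v @ W) / T)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # p(v = 1 | h) at temperature T, then sample the visible units.
        p_v = sigmoid((b + W @ h) / T)
        v = (rng.random(p_v.shape) < p_v).astype(float)
        # Re-impose the clamped units after every visible update.
        if clamp_mask is not None:
            v[clamp_mask] = clamp_values
    return v


rng = np.random.default_rng(0)
n_pixels, n_labels, n_hidden = 784, 10, 64  # assumed sizes
n_visible = n_pixels + n_labels

# Placeholder parameters; in practice these come from a trained RBM.
W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
b = np.zeros(n_visible)
c = np.zeros(n_hidden)

# 1) Unclamped sampling: let the model "dream".
v_free = gibbs_sample(W, b, c, rng, n_steps=200)

# 2) Class-conditional sampling: clamp the label units to digit 3.
label_mask = np.zeros(n_visible, dtype=bool)
label_mask[n_pixels:] = True
label = np.zeros(n_labels)
label[3] = 1.0
v_three = gibbs_sample(W, b, c, rng, n_steps=200,
                       clamp_mask=label_mask, clamp_values=label)
image_three = v_three[:n_pixels].reshape(28, 28)

# 3) Same, with a temperature that decreases linearly from 4.0 to 1.0.
temps = np.linspace(4.0, 1.0, 200)
v_annealed = gibbs_sample(W, b, c, rng, temperatures=temps,
                          clamp_mask=label_mask, clamp_values=label)
```

Ending the schedule at T = 1 keeps the final steps sampling from the model's own distribution; letting the temperature drop below 1 would instead push samples toward the highest-probability states, which may or may not be what you want.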