machine-learning neural-network missing-data netflix rbm

Inferring missing data with Restricted Boltzmann Machines


Similar to the Netflix competition, assume we have a movie dataset with missing ratings. How would I modify an RBM so that it can infer the missing values? In related papers, one straightforward approach is to impute random values for the missing visible features. However, I'm skeptical about the reconstruction accuracy, because it may depend on the initial values assigned to these missing visible nodes.

What do you suggest?

Thanks


Solution

  • This video may be of some help: https://www.youtube.com/watch?v=laVC6WFIXjg

    I think that sampling after imputing random values is a good idea; Hinton justifies it in the video. You can also try to estimate a prior for the missing values, draw many samples, or make initial guesses with a different method and then run the reconstruction.

    In the video, Hinton says that this method indeed isn't very accurate on its own, but that it can be very powerful when combined with matrix factorization (or other similar methods).
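
To make the "random initial values, then sample" idea concrete, here is a minimal NumPy sketch. It is not the method from the video or from any particular paper. It assumes an already-trained RBM with binary visible units (say, like/dislike instead of 1-5 stars), and the function name `impute_missing`, its parameters, and the toy weights are all made up for illustration. Observed units are clamped to their data, only the missing units are resampled, and several chains with different random starts are averaged so that the result depends less on any single initialization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def impute_missing(v, observed, W, b_vis, b_hid, n_steps=50, n_chains=20, seed=0):
    """Estimate missing binary visible units by clamped Gibbs sampling.

    v        : (n_vis,) data vector; entries where `observed` is False are ignored
    observed : (n_vis,) boolean mask, True where the value is known
    W        : (n_vis, n_hid) weights of a trained RBM (assumed to be given)
    b_vis    : (n_vis,) visible biases
    b_hid    : (n_hid,) hidden biases
    Returns a (n_vis,) vector: observed entries unchanged, missing entries
    replaced by the average activation probability over all chains.
    """
    rng = np.random.default_rng(seed)
    estimates = np.zeros((n_chains, v.shape[0]))
    for c in range(n_chains):
        # Missing units start from random bits; observed units keep their data.
        random_bits = rng.integers(0, 2, size=v.shape[0])
        state = np.where(observed, v, random_bits).astype(float)
        for _ in range(n_steps):
            # Sample hidden units given the current visible state.
            p_h = sigmoid(state @ W + b_hid)
            h = (rng.random(p_h.shape) < p_h).astype(float)
            # Sample visible units given the hidden units.
            p_v = sigmoid(h @ W.T + b_vis)
            sample = (rng.random(p_v.shape) < p_v).astype(float)
            # Clamp: only the missing units take the new sample.
            state = np.where(observed, v, sample)
        # Keep the last activation probabilities for the missing units.
        estimates[c] = np.where(observed, v, p_v)
    return estimates.mean(axis=0)

# Toy usage with hypothetical weights (3 movies, 1 hidden unit).
W = np.array([[4.0], [4.0], [4.0]])
b_vis = np.full(3, -2.0)
b_hid = np.array([-4.0])
v = np.array([1.0, 1.0, np.nan])            # third rating is missing
observed = np.array([True, True, False])
print(impute_missing(v, observed, W, b_vis, b_hid))
```

Averaging over chains corresponds to the "draw many samples" suggestion. To start from a prior or from another method's guesses instead, replace `random_bits` with those values.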