As I understand it, when the visible units of an RBM are softmax units with K states each, the hidden units stay binary.
If so, I'm not sure how to compute the contributions of the binary hidden units to the visible ones. Am I supposed to relate the 0 state of a hidden unit to one specific state out of the K states of the softmax, and the 1 state to the other K-1 states? Or does a 0 in the hidden unit correspond to 0 in all K possible states of the visible unit? But wouldn't that contradict the fact that exactly one of the K states must be on?
I think I've figured out my misunderstanding: each softmax unit behaves as a group of K binary subunits, exactly one of which is on at a time, and each subunit has its own weights to the hidden units. This means the weights between the hidden layer and the visible layer form a 3-dimensional array (indexed by visible unit, state, and hidden unit) instead of a 2-dimensional matrix, and now it is obvious how to calculate the contributions.
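
For anyone else who gets stuck on this, here is a minimal NumPy sketch of how I now understand the two conditional computations. It is only an illustration: the sizes, the names W, a, b, hidden_probs and visible_probs, and the (visible unit, state, hidden unit) index order are my own assumptions, not taken from any particular library or paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, chosen only for this sketch).
n_visible, K, n_hidden = 4, 3, 5

# W[i, k, j]: weight between state k of visible unit i and hidden unit j.
W = rng.normal(scale=0.1, size=(n_visible, K, n_hidden))
a = np.zeros((n_visible, K))  # visible biases, one per state of each unit
b = np.zeros(n_hidden)        # hidden biases


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def hidden_probs(v_onehot):
    """P(h_j = 1 | v) for a one-hot visible array of shape (n_visible, K)."""
    # Only the weights of the active state of each visible unit contribute.
    return sigmoid(b + np.einsum("ik,ikj->j", v_onehot, W))


def visible_probs(h):
    """P(v_i = k | h): a softmax over the K states of each visible unit."""
    logits = a + np.einsum("ikj,j->ik", W, h)
    logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)


# One visible configuration: each unit is in exactly one of its K states.
states = rng.integers(0, K, size=n_visible)
v = np.eye(K)[states]  # one-hot rows, shape (n_visible, K)

p_h = hidden_probs(v)                           # shape (n_hidden,)
h = (rng.random(n_hidden) < p_h).astype(float)  # sample binary hidden states
p_v = visible_probs(h)                          # shape (n_visible, K)
```

This also answers my original worry: a hidden unit in state 0 simply adds nothing to any of the K logits, and the softmax still normalizes each row of p_v to sum to 1, so exactly one state per visible unit is chosen when sampling.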