In Tensorflow, I'm getting outputs like 0.602129 or 0.663941. It appears that values closer to 0 imply a better model, but it seems like perplexity is supposed to be calculated as 2^loss, which implies that loss is negative. This doesn't make any sense.
Those numbers cannot be perplexities. Perplexity is calculated as 2^entropy (or e^cross-entropy when the loss is measured in nats, as TensorFlow's cross-entropy losses are). Since entropy and cross-entropy are always non-negative, perplexity is always ≥ 1, so results below 1 do not make sense as perplexities. (Entropy is only bounded by 1 for a binary distribution measured in bits; in general it can be much larger.) Values like 0.602129 look like raw cross-entropy loss values, not perplexities — and loss closer to 0 being better is exactly what you'd expect.
I would suggest you take a look at how your model calculates perplexity, because I suspect there is an error — most likely it is reporting the loss itself instead of exponentiating it.
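A minimal sketch of the conversion, assuming the numbers in the question are mean cross-entropy losses in nats (TensorFlow's softmax cross-entropy uses the natural log, so the base is e, not 2):

```python
import math

# The two values from the question, interpreted as per-token
# cross-entropy losses in nats (an assumption, not confirmed).
losses = [0.602129, 0.663941]

for loss in losses:
    # Perplexity = e^loss for nat-based losses (= 2^loss only if
    # the loss were measured in bits, i.e. computed with log base 2).
    perplexity = math.exp(loss)
    print(f"loss={loss:.6f} -> perplexity={perplexity:.4f}")
```

Both losses are positive, so both perplexities come out greater than 1 — no negative loss is needed, which resolves the apparent contradiction in the question.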