In Tensorflow, I'm getting outputs like 0.602129 or 0.663941. It appears that values closer to 0 imply a better model, but it seems like perplexity is supposed to be calculated as 2^loss, which implies that loss is negative. This doesn't make any sense.
Those numbers cannot be perplexities. Perplexity is calculated as 2^entropy (or e^cross-entropy when the loss is measured in nats, as TensorFlow's cross-entropy losses are). Since entropy and cross-entropy are always non-negative, perplexity is always ≥ 1, so results below 1 do not make sense as perplexities. (Entropy is only bounded by 1 for a binary distribution measured in bits; in general it can be much larger.) Values like 0.602129 look like raw cross-entropy loss values, not perplexities — and loss closer to 0 being better is exactly what you'd expect.
I would suggest you take a look at how your model calculates perplexity, because I suspect there is an error — most likely it is reporting the loss itself instead of exponentiating it.
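A minimal sketch of the conversion, assuming the numbers in the question are mean cross-entropy losses in nats (TensorFlow's softmax cross-entropy uses the natural log, so the base is e, not 2):

```python
import math

# The two values from the question, interpreted as per-token
# cross-entropy losses in nats (an assumption, not confirmed).
losses = [0.602129, 0.663941]

for loss in losses:
    # Perplexity = e^loss for nat-based losses (= 2^loss only if
    # the loss were measured in bits, i.e. computed with log base 2).
    perplexity = math.exp(loss)
    print(f"loss={loss:.6f} -> perplexity={perplexity:.4f}")
```

Both losses are positive, so both perplexities come out greater than 1 — no negative loss is needed, which resolves the apparent contradiction in the question.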