pytorch, tensor, cross-entropy, natural-logarithm

PyTorch's `binary_cross_entropy` seems to treat ln(0) as -100. Why?


I'm curious why PyTorch's `binary_cross_entropy` function seems to be implemented in such a way that it treats ln(0) as -100.

From a math point of view, the binary cross entropy between a true distribution p and a predicted distribution q is:

H(p, q) = -[ p_0*ln(q_0) + p_1*ln(q_1) ]

In PyTorch's `F.binary_cross_entropy(input, target)`, q (the predictions) is the first argument and p (the targets) is the second.

Now suppose p = [1, 0] and q = [0.25, 0.75]. In this case, F.binary_cross_entropy(q, p) returns, as expected, -ln(0.25) ≈ 1.386.
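For concreteness, the expected case checks out in an interactive session (pT and qT here are just the tensors p and q from above):

    >>> import torch
    >>> import torch.nn.functional as F
    >>> pT = torch.Tensor([1, 0])        # the target distribution p
    >>> qT = torch.Tensor([0.25, 0.75])  # the predicted distribution q
    >>> F.binary_cross_entropy(qT, pT)   # predictions first, targets second
    tensor(1.3863)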

If we reverse the arguments and call F.binary_cross_entropy(p, q), I would expect an error, since that would involve calculating -0.75*ln(0), and ln(0) tends to -infinity.

Nonetheless, F.binary_cross_entropy(p, q) gives me 75 as the answer (see below):

    >>> import torch
    >>> import torch.nn.functional as F
    >>> pT = torch.Tensor([1, 0])
    >>> qT = torch.Tensor([0.25, 0.75])
    >>> F.binary_cross_entropy(pT, qT)
    tensor(75.)

Why was it implemented this way?


Solution

  • It is indeed clamping the value of ln(0) to -100. You can find an example of that here.

    This is most likely a hack to avoid an infinite loss caused by a probability that gets accidentally rounded to zero (see the sketches at the end of this answer).

    Technically speaking, the input probabilities to binary_cross_entropy are supposed to be produced by a sigmoid, whose output lies strictly inside the open interval (0, 1). The input should therefore never actually be zero, but in finite precision a very small value can round down to exactly 0.
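    To see how a clamp at -100 reproduces the 75 from the question, here is a minimal sketch; the helper bce_with_clamped_log below is purely illustrative and is not PyTorch's actual implementation:

        import torch

        def bce_with_clamped_log(inputs, targets, clamp_min=-100.0):
            # Illustrative only: clamp the logs at -100 so that ln(0) does not
            # produce -inf, mimicking the behaviour observed in binary_cross_entropy.
            log_q = torch.clamp(torch.log(inputs), min=clamp_min)
            log_1mq = torch.clamp(torch.log(1 - inputs), min=clamp_min)
            return -(targets * log_q + (1 - targets) * log_1mq).mean()

        pT = torch.Tensor([1, 0])
        qT = torch.Tensor([0.25, 0.75])

        # Reversed arguments, as in the question: each element contributes
        # -0.75 * (-100) = 75, so the mean is 75.
        print(bce_with_clamped_log(pT, qT))  # tensor(75.)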
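    The underflow scenario itself is easy to trigger: for a sufficiently negative logit, float32 sigmoid rounds to exactly 0, and the clamp is what keeps the loss finite rather than infinite (the logit value below is just an arbitrarily extreme example):

        import torch
        import torch.nn.functional as F

        logits = torch.tensor([-200.0])   # extreme logit, chosen for illustration
        probs = torch.sigmoid(logits)     # underflows to exactly 0.0 in float32
        target = torch.tensor([1.0])

        print(probs)                                  # tensor([0.])
        print(F.binary_cross_entropy(probs, target))  # tensor(100.), i.e. -1 * (-100)

    This is also one reason binary_cross_entropy_with_logits is often preferred: it takes the raw logits and evaluates the sigmoid and the log together in a numerically stable way, so no clamping is needed.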