python, machine-learning, deep-learning, pytorch, loss

`CrossEntropyLoss()` in PyTorch


Cross entropy formula:

H(p, q) = -∑ p(x) log q(x)

But why does the following give loss = 0.7437 instead of loss = 0 (since 1*log(1) = 0)?

import torch
import torch.nn as nn
from torch.autograd import Variable

output = Variable(torch.FloatTensor([0,0,0,1])).view(1, -1)
target = Variable(torch.LongTensor([3]))

criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss) # 0.7437

Solution

  • In your example you are treating the output [0, 0, 0, 1] as probabilities, as required by the mathematical definition of cross entropy. But PyTorch treats it as a vector of raw scores (logits) that don't need to sum to 1; it first converts them into probabilities using the softmax function (a short numerical check follows at the end of this answer).

    So H(p, q) becomes:

    H(p, softmax(output))
    

    Translating the output [0, 0, 0, 1] into probabilities:

    softmax([0, 0, 0, 1]) = [0.1749, 0.1749, 0.1749, 0.4754]
    

    so, taking the negative log of the probability assigned to the target class (index 3):

    -log(0.4754) = 0.7437
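
You can check this numerically with a minimal sketch (written against the current tensor API rather than the deprecated Variable wrapper; the variable names are just for illustration). It computes the loss three ways: with nn.CrossEntropyLoss, by hand via softmax and -log, and via the equivalent log_softmax + nn.NLLLoss combination.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Raw scores (logits), not probabilities
output = torch.tensor([[0., 0., 0., 1.]])
target = torch.tensor([3])

# 1) What nn.CrossEntropyLoss computes
ce = nn.CrossEntropyLoss()(output, target)

# 2) The same thing by hand: softmax, then -log of the target-class probability
probs = F.softmax(output, dim=1)           # tensor([[0.1749, 0.1749, 0.1749, 0.4754]])
manual = -torch.log(probs[0, target[0]])   # -log(0.4754)

# 3) Equivalent formulation: log_softmax followed by NLLLoss
nll = nn.NLLLoss()(F.log_softmax(output, dim=1), target)

print(ce.item(), manual.item(), nll.item())  # all ≈ 0.7437

To actually get a loss close to 0, the logit of the target class has to dominate the others: for example, output = torch.tensor([[0., 0., 0., 100.]]) pushes softmax to nearly [0, 0, 0, 1], and the loss to nearly 0.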