pytorch, cross-entropy

Why does PyTorch's CrossEntropyLoss use label encoding instead of one-hot encoding?


I'm learning about the CrossEntropyLoss module in PyTorch. The tutor says you should pass the target value y 'label encoded', not 'one-hot encoded', like this:

import torch
import torch.nn as nn

loss = nn.CrossEntropyLoss()

# target: a class index (label encoded), not a one-hot vector
Y = torch.tensor([0])

# raw, unnormalized scores (logits) for 3 classes
Y_pred_good = torch.tensor([[2.0, 1.0, 0.1]])
Y_pred_bad = torch.tensor([[0.5, 1.0, 0.3]])

l1 = loss(Y_pred_good, Y)
l2 = loss(Y_pred_bad, Y)

print(l1.item())
print(l2.item())

But I learned that the cross-entropy loss is calculated from one-hot encoded class information. Does the PyTorch module transform the label-encoded targets into one-hot encoded ones, or is there another way to calculate the CE loss from label-encoded information?


Solution

  • There's a difference between the multi-class CE loss, nn.CrossEntropyLoss, and the binary version, nn.BCEWithLogitsLoss.

    For the binary case, the implemented loss allows for "soft labels" and thus requires the binary targets to be floats in the range [0, 1].
    In contrast, nn.CrossEntropyLoss works with "hard" labels, so there is no need to encode them in a one-hot fashion.

    If you do the math for the multi-class cross-entropy loss, you'll see that it is inefficient to keep a one-hot representation of the targets. The loss is -log p_i, where i is the true label, so one only needs to index the proper entry of the predicted probabilities vector. This can be done by multiplying by the one-hot encoded targets, but it is much more efficient to simply index the right entry (see the sketch after this list).


    Note: recent versions of nn.CrossEntropyLoss (1.10 and later) also accept floating-point class-probability targets ("soft labels"), of which one-hot encoded targets are a special case; a short example follows below.
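
A minimal sketch of the indexing-vs-one-hot point made above, reusing the logits and target from the question. F.log_softmax and F.one_hot are standard PyTorch helpers; the variable names are just illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 1.0, 0.1]])   # raw scores for 3 classes
target = torch.tensor([0])                 # "hard" integer label

log_probs = F.log_softmax(logits, dim=1)   # log p for every class

# 1) pick the true-class entry directly: -log p_i
loss_by_indexing = -log_probs[0, target[0]]

# 2) multiply by a one-hot vector and sum (mathematically identical, but wasteful)
one_hot = F.one_hot(target, num_classes=3).float()
loss_by_one_hot = -(one_hot * log_probs).sum(dim=1).mean()

# 3) the built-in module
loss_builtin = nn.CrossEntropyLoss()(logits, target)

print(loss_by_indexing.item(), loss_by_one_hot.item(), loss_builtin.item())
# all three print the same value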
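
Regarding the note on newer versions, here is a hedged sketch of passing probability targets directly. This should work on PyTorch 1.10 or later; older versions will raise an error for the float target.

import torch
import torch.nn as nn

loss = nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, 1.0, 0.1]])

hard_target = torch.tensor([0])                # class index
soft_target = torch.tensor([[1.0, 0.0, 0.0]])  # one-hot, given as class probabilities

print(loss(logits, hard_target).item())
print(loss(logits, soft_target).item())  # same value, different target encoding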