from 東大1S情報α
- $E = -\sum_k t_k \ln x_k$ ($t_k$: the one-hot correct label, $x_k$: the predicted probability of class $k$)
- It seems it can be interpreted as the information content (self-information) of the correct label (?)
- It also seems it can be interpreted as the negative log-likelihood (?)
- I’m not sure, but I might understand it later (blu3mo)
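- A minimal Python sketch of the formula above (the variable names probs/target are my own, not from the course): with a one-hot label, the sum picks out a single term, so the loss is just -ln of the probability assigned to the correct class, i.e. its information content.

```python
import math

# Predicted probabilities x_k (made-up values, sum to 1) and one-hot label t_k.
probs = [0.1, 0.7, 0.2]
target = [0, 1, 0]

# E = -sum_k t_k * ln(x_k); only the correct-class term survives.
loss = -sum(t * math.log(x) for t, x in zip(target, probs))
print(loss)  # -ln(0.7) ≈ 0.357
```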
- This is the Loss Function used for classification.
- Multiply the label $t_k$ by the natural logarithm of the predicted probability $x_k$, sum over the classes, and negate.
- Using ln(a) + ln(b) = ln(a·b), the log replaces multiplication with addition, so you avoid multiplying many small probabilities together (see the numeric sketch below).
- The smaller the product of all the predicted probabilities $x_k$ (each between 0 and 1), the larger the cross entropy becomes.
- When the predicted probability of the correct class is close to 0, the loss is large.
- When it is close to 1, the loss is small.
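- A rough numeric check of the points above (values chosen arbitrarily): the log turns the product of probabilities into a sum, and -ln(x) blows up as x approaches 0 while staying near 0 as x approaches 1.

```python
import math

probs = [0.9, 0.8, 0.001]  # one prediction is badly wrong

product = math.prod(probs)                       # tiny product: 0.00072
sum_of_neg_logs = -sum(math.log(x) for x in probs)
print(product, sum_of_neg_logs)                  # the 0.001 term dominates the loss

# Loss contribution of a single probability.
for x in (0.01, 0.5, 0.99):
    print(x, -math.log(x))                       # ≈ 4.61, 0.69, 0.01
```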
- Another reason it is used is apparently that the derivative comes out nicely (probably): combined with softmax, the gradient with respect to the logits is simply (prediction - target), as checked below.
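- A quick autograd check of that claim (a sketch, assuming softmax + cross entropy on raw logits; the numbers are made up):

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1.0, 2.0, 0.5]], requires_grad=True)
target = torch.tensor([1])

# Cross entropy on raw logits applies softmax internally.
F.cross_entropy(logits, target).backward()

# Known closed form of the gradient: softmax(logits) - one_hot(target).
expected = F.softmax(logits, dim=1).detach() - F.one_hot(target, num_classes=3)
print(logits.grad)
print(expected)  # matches logits.grad
```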
- In PyTorch, it is nn.CrossEntropyLoss()
- This combines LogSoftmax and NLLLoss.
- Applying LogSoftmax followed by NLLLoss gives the same result as CrossEntropyLoss.
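- A small sanity check of that equivalence (made-up logits and targets):

```python
import torch
import torch.nn as nn

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 1.5,  0.3]])
targets = torch.tensor([0, 1])

# CrossEntropyLoss takes raw logits; NLLLoss takes log-probabilities.
ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets)
print(ce.item(), nll.item())  # the two values agree
```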