
Comparing MSE loss and cross-entropy loss in terms of convergence

For a very simple classification problem where I have a target vector [0, 0, 0, ..., 0] and a prediction vector [0, 0.1, 0.2, ..., 1], would cross-entropy loss converge better/faster, or would MSE loss? When I plot them, it seems to me that MSE loss has a lower error margin. Why would that be? [plot]

Or, for example, when I have the target as [1, 1, 1, ..., 1], I get the following: [plot]
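
For reference, here is a minimal sketch of how the per-element values for the two setups above could be computed. It assumes binary cross-entropy with the predictions treated as probabilities, 11 evenly spaced predictions from 0 to 1, and a small clipping constant `eps` to avoid log(0); the function names are illustrative and not taken from any particular library.

```python
import numpy as np

def mse_per_element(target, pred):
    """Squared error for each (target, prediction) pair."""
    return (target - pred) ** 2

def bce_per_element(target, pred, eps=1e-12):
    """Binary cross-entropy for each pair; predictions are clipped
    to [eps, 1 - eps] so that log(0) never occurs (an assumption)."""
    p = np.clip(pred, eps, 1.0 - eps)
    return -(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))

# Predictions 0, 0.1, ..., 1 as in the question.
pred = np.linspace(0.0, 1.0, 11)
target_zeros = np.zeros_like(pred)
target_ones = np.ones_like(pred)

mse0 = mse_per_element(target_zeros, pred)
bce0 = bce_per_element(target_zeros, pred)
mse1 = mse_per_element(target_ones, pred)
bce1 = bce_per_element(target_ones, pred)

for p, m, b in zip(pred, mse0, bce0):
    print(f"target=0  p={p:.1f}  MSE={m:.4f}  CE={b:.4f}")
for p, m, b in zip(pred, mse1, bce1):
    print(f"target=1  p={p:.1f}  MSE={m:.4f}  CE={b:.4f}")
```

With predictions in [0, 1], the squared error is bounded by 1, while the cross-entropy grows without bound as the prediction approaches the wrong label, so the MSE values come out numerically smaller. The two quantities are on different scales, so a smaller number by itself does not say which loss converges faster.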


  • You sound a little confused...

    On top of this, your choice of plot, with the percentage (?) of predictions on the horizontal axis, is puzzling: I have never seen such plots in ML diagnostics, and I am not quite sure what exactly they represent or why they would be useful...

    If you would like a detailed discussion of cross-entropy loss and accuracy in classification settings, you may have a look at this answer of mine.