python · machine-learning · pytorch

Neural Net: Loss decreasing, but accuracy stays exactly the same


I am new to PyTorch and just tried to build my first neural network on the MNIST dataset. In particular, I wanted to follow this tutorial: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

My training code looks like this:

for epoch in range(5): # an epoch is one iteration over all the data

    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
        # get the inputs (this is a batch, so lists of images & labels)
        inputs, labels = data

        # reset the gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs) # forward
        loss = criterion(outputs, labels) # loss
        loss.backward() # backward
        optimizer.step() # update

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999: # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0

    # compute accuracy after each epoch
    correct = 0
    total = 0

    with torch.no_grad():
        for data in train_loader:
            images, labels = data
            output = net(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    print(correct/total)

print('Training finished')

The output confuses me: the loss decreases significantly, but the accuracy stays exactly the same across thousands of samples. I would at least expect the accuracy to change (even if not improve), since the weights are evidently being updated, judging by the loss.

[1,  2000] loss: 1.902
[1,  4000] loss: 1.413
[1,  6000] loss: 1.188
[1,  8000] loss: 0.844
[1, 10000] loss: 0.666
[1, 12000] loss: 0.631
[1, 14000] loss: 0.614
[1, 16000] loss: 0.534
[1, 18000] loss: 0.557
[1, 20000] loss: 0.518
[1, 22000] loss: 0.448
[1, 24000] loss: 0.450
[1, 26000] loss: 0.466
[1, 28000] loss: 0.493
[1, 30000] loss: 0.412
[1, 32000] loss: 0.404
0.09857142857142857
[2,  2000] loss: 0.442
[2,  4000] loss: 0.421
[2,  6000] loss: 0.429
[2,  8000] loss: 0.423
[2, 10000] loss: 0.411
[2, 12000] loss: 0.426
[2, 14000] loss: 0.425
[2, 16000] loss: 0.400
[2, 18000] loss: 0.430
[2, 20000] loss: 0.376
[2, 22000] loss: 0.385
[2, 24000] loss: 0.376
[2, 26000] loss: 0.388
[2, 28000] loss: 0.433
[2, 30000] loss: 0.346
[2, 32000] loss: 0.357
0.09857142857142857
[3,  2000] loss: 0.393
[3,  4000] loss: 0.356
[3,  6000] loss: 0.396
[3,  8000] loss: 0.381
[3, 10000] loss: 0.350
[3, 12000] loss: 0.368
[3, 14000] loss: 0.405
[3, 16000] loss: 0.355
[3, 18000] loss: 0.367
[3, 20000] loss: 0.355
[3, 22000] loss: 0.357
[3, 24000] loss: 0.366
[3, 26000] loss: 0.362
[3, 28000] loss: 0.393
[3, 30000] loss: 0.336
[3, 32000] loss: 0.333
0.09857142857142857
[4,  2000] loss: 0.372
[4,  4000] loss: 0.323
[4,  6000] loss: 0.362
[4,  8000] loss: 0.368
[4, 10000] loss: 0.346
[4, 12000] loss: 0.345
[4, 14000] loss: 0.381
[4, 16000] loss: 0.363
[4, 18000] loss: 0.357
[4, 20000] loss: 0.337
[4, 22000] loss: 0.363
[4, 24000] loss: 0.343
[4, 26000] loss: 0.353
[4, 28000] loss: 0.390
[4, 30000] loss: 0.298
[4, 32000] loss: 0.343
0.09857142857142857
[5,  2000] loss: 0.350
[5,  4000] loss: 0.324
[5,  6000] loss: 0.361
[5,  8000] loss: 0.350
[5, 10000] loss: 0.307
[5, 12000] loss: 0.347
[5, 14000] loss: 0.372
[5, 16000] loss: 0.347
[5, 18000] loss: 0.356
[5, 20000] loss: 0.302
[5, 22000] loss: 0.339
[5, 24000] loss: 0.345
[5, 26000] loss: 0.340
[5, 28000] loss: 0.405
[5, 30000] loss: 0.310
[5, 32000] loss: 0.352
0.09857142857142857
Training finished

The full notebook can be found here: https://www.kaggle.com/code/adrianhayler/building-my-first-neural-net-with-pytorch


Solution

  • I found the problem:

    _, predicted = torch.max(outputs.data, 1)
    

    has to be changed to:

    _, predicted = torch.max(output.data, 1)
    

    Inside the evaluation loop, the forward pass is stored in output, but torch.max was applied to outputs — a stale variable still holding the result of the last training batch, not the images we iterate over. The predictions therefore never changed, which is why the accuracy stayed frozen at 0.0986, roughly chance level for 10 classes. (As an aside, inside torch.no_grad() the .data attribute is unnecessary; torch.max(output, 1) works just as well.)
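The effect is easy to reproduce even without PyTorch. The sketch below (all names illustrative, plain Python standing in for tensors) scores every batch against one fixed, stale set of logits — exactly what the original loop did — and compares that to the correct per-batch evaluation:

```python
# Minimal sketch of the stale-variable bug: `logits_batches` stands in for
# the per-batch network outputs, `label_batches` for the targets.

def argmax(row):
    """Index of the largest entry, like torch.max(..., 1) per row."""
    return max(range(len(row)), key=row.__getitem__)

def accuracy(logits_batches, label_batches, buggy=False):
    correct = total = 0
    outputs = logits_batches[-1]  # stale value, as left over from training
    for output, labels in zip(logits_batches, label_batches):
        source = outputs if buggy else output  # the one-letter difference
        predicted = [argmax(row) for row in source]
        total += len(labels)
        correct += sum(p == y for p, y in zip(predicted, labels))
    return correct / total

logits = [[[0.9, 0.1], [0.2, 0.8]],   # batch 1: predicts classes 0, 1
          [[0.3, 0.7], [0.6, 0.4]]]   # batch 2: predicts classes 1, 0
labels = [[0, 1], [1, 0]]

print(accuracy(logits, labels))               # → 1.0 (each batch scored on its own forward pass)
print(accuracy(logits, labels, buggy=True))   # → 0.5 (every batch scored against batch 2's predictions)
```

In the buggy version the reported number depends only on how often the labels happen to match one frozen set of predictions, so it never moves no matter how much the loss improves — just like the constant 0.0986 in the log above.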