[SOLVED] How to find MSE when using a batch loader?

I'm working on a regression task using deep learning models. When calculating the MSE, I divide by the length of the dataset. However, ChatGPT suggests dividing by the length of the loader instead. What is the correct way to calculate the MSE in this case?

Also, I want to track the train_mse to see if the model is overfitting. While calculating the train_mse, should the model be in train or eval mode?

Code:

def train_and_evaluate(model, train_loader, val_loader, num_epochs=40, lr=1e-3, weight_decay=1e-5, patience=5):
    model = model.to(device)
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    
    best_val_mse = float('inf')
    epochs_no_improve = 0

    for epoch in range(num_epochs):
        model.train()
        train_loss, train_preds, train_targets = 0, [], []
        for features, targets in train_loader:
            features, targets = features.to(device), targets.to(device)
            preds = model(features)
            loss = criterion(preds, targets)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            train_loss += loss.item()
            train_preds.extend(preds.detach().cpu().numpy())
            train_targets.extend(targets.cpu().numpy())

        model.eval()
        val_loss, val_preds, val_targets = 0, [], []
        with torch.no_grad():
            for features, targets in val_loader:
                features, targets = features.to(device), targets.to(device)
                preds = model(features)
                loss = criterion(preds, targets)

                val_loss += loss.item()
                val_preds.extend(preds.detach().cpu().numpy())
                val_targets.extend(targets.cpu().numpy())

        train_mse = train_loss / len(train_dataset)
        train_pc = safe_pearsonr(train_preds, train_targets)
        val_mse = val_loss / len(val_dataset)
        val_pc = safe_pearsonr(val_preds, val_targets)

Solution

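Your MSE is computed inside the per-batch loop, and nn.MSELoss (with its default reduction='mean') already returns the mean over each batch. You then add those per-batch means into train_loss and val_loss, so at the end of an epoch each accumulator holds one mean value per batch, not a sum over samples.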

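So you should divide by the number of batches per epoch, which is the loader length (len(train_loader) and len(val_loader)). ChatGPT is correct. Dividing by the dataset length makes the reported MSE too small by roughly a factor of the batch size. One caveat: if the last batch is smaller than the others, the mean of the batch means is only approximately the dataset MSE. For the exact value, add loss.item() times the batch size inside the loop and divide by the dataset length.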
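
To make the difference between the divisors concrete, here is a minimal sketch in plain Python. The helper names and the toy error values are made up for illustration; each batch contributes the mean of its squared errors, which mirrors what nn.MSELoss returns by default.

```python
def batch_means(squared_errors, batch_size):
    """Split per-sample squared errors into batches; return (mean, size) per batch."""
    out = []
    for start in range(0, len(squared_errors), batch_size):
        chunk = squared_errors[start:start + batch_size]
        out.append((sum(chunk) / len(chunk), len(chunk)))
    return out


def mse_sum_over_dataset(batches, n_samples):
    # The question's version: sum of batch means divided by the dataset length.
    return sum(mean for mean, _ in batches) / n_samples


def mse_mean_of_batches(batches):
    # Sum of batch means divided by the number of batches, i.e. len(loader).
    return sum(mean for mean, _ in batches) / len(batches)


def mse_weighted(batches, n_samples):
    # Each batch mean weighted by its size: exact even with a short last batch.
    return sum(mean * size for mean, size in batches) / n_samples


# Hypothetical per-sample squared errors: 5 samples, batch size 2 -> batches of 2, 2, 1.
sq_err = [1.0, 3.0, 2.0, 2.0, 10.0]
batches = batch_means(sq_err, batch_size=2)

true_mse = sum(sq_err) / len(sq_err)
print("true MSE:              ", true_mse)
print("weighted by batch size:", mse_weighted(batches, len(sq_err)))
print("divided by len(loader):", mse_mean_of_batches(batches))
print("divided by len(dataset):", mse_sum_over_dataset(batches, len(sq_err)))
```

With these numbers the true MSE is 3.6 and the size-weighted version matches it. Dividing by the number of batches gives about 4.67 because the short last batch is over-weighted, and dividing the sum of batch means by the dataset length gives 2.8. When every batch has the same size, the first two agree exactly.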

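For your second question: your loss is already the MSE, so train_loss gives you the training MSE without an extra pass. Keep in mind that it is measured in train mode (dropout active, batch norm using batch statistics) and while the weights are still being updated during the epoch. If you wanted a metric that is not your loss, or a training number directly comparable to val_mse, you could compute it in a separate pass in eval mode.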
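
If you do want that separate eval-mode pass, a rough sketch could look like the following. The function name evaluate_mse is hypothetical (it is not part of the code in the question), and it assumes the loader yields (features, targets) pairs as in the question. It sums squared errors and divides by the number of target elements, so the result does not depend on the batch sizes.

```python
import torch
from torch import nn


def evaluate_mse(model, loader, device="cpu"):
    """Return the MSE of `model` over everything in `loader`, measured in eval mode."""
    was_training = model.training
    model.eval()  # disable dropout, use running batch-norm statistics
    total_sq_err, n_elements = 0.0, 0
    with torch.no_grad():
        for features, targets in loader:
            features, targets = features.to(device), targets.to(device)
            preds = model(features)
            # reduction="sum" gives the sum of squared errors for this batch
            total_sq_err += nn.functional.mse_loss(preds, targets, reduction="sum").item()
            n_elements += targets.numel()
    model.train(was_training)  # restore whatever mode the model was in
    return total_sq_err / n_elements
```

Inside the epoch loop this could be called as train_mse = evaluate_mse(model, train_loader, device) after the training batches, at the cost of one extra forward pass over the training data per epoch.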