I am training a model for brain extraction on MR images. I use a 2D U-Net architecture with a pretrained EfficientNetV2B3 backbone.
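(For context, the backbone is loaded from keras.applications roughly like the sketch below; the input shape and weights shown here are placeholders, not necessarily my exact settings.)

# Rough sketch of loading the pretrained backbone (shape/weights are placeholders)
from tensorflow.keras.applications import EfficientNetV2B3

backbone = EfficientNetV2B3(include_top=False,       # drop the classification head
                            weights="imagenet",      # start from pretrained weights
                            input_shape=(256, 256, 3))
backbone.summary()  # inspect layer names to pick the U-Net skip connections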
from tensorflow.keras import backend as K

def dice_coefficients(y_true, y_pred, smooth=0):
    intersection = K.sum(y_true * y_pred)
    union = K.sum(y_true) + K.sum(y_pred)
    return (2 * intersection + smooth) / (union + smooth)

def dice_coefficients_loss(y_true, y_pred, smooth=0):
    # Negative Dice, so minimizing the loss maximizes the Dice coefficient
    return -dice_coefficients(y_true, y_pred, smooth)

def iou(y_true, y_pred, smooth=0):
    intersection = K.sum(y_true * y_pred)
    total = K.sum(y_true + y_pred)
    return (intersection + smooth) / (total - intersection + smooth)
### COMPILE THE MODEL
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

opti = Adam(learning_rate=1e-4)
unet_model5.compile(optimizer=opti,
                    loss=dice_coefficients_loss,
                    metrics=["accuracy", iou, dice_coefficients])

# Early stopping
early_stopping_unet = EarlyStopping(monitor='loss', patience=10)

# Learning rate adjustment
reduce_lr_unet = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                                   patience=5, min_lr=1e-7)

model_history = unet_model5.fit(train_generator,
                                batch_size=16,
                                epochs=200,
                                validation_data=val_generator,
                                callbacks=[early_stopping_unet, reduce_lr_unet])
[Plots: evolution of the Dice coefficient loss, the Dice coefficient, and the accuracy over training.]
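(The curves were plotted from model_history.history, roughly like the sketch below; the metric key names are assumed to follow Keras's convention of naming custom metrics after their function names.)

import matplotlib.pyplot as plt

# Rough sketch of how the curves above are produced from the training history.
# Assumed keys: "loss", "dice_coefficients", "accuracy" plus their "val_" twins.
hist = model_history.history
for key in ["loss", "dice_coefficients", "accuracy"]:
    plt.figure()
    plt.plot(hist[key], label=key)
    plt.plot(hist["val_" + key], label="val_" + key)
    plt.xlabel("epoch")
    plt.legend()
    plt.show()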
My Dice and accuracy values are already fairly high, but I reckon they could be better if I could figure out whether there is a problem with the model. I tried changing some hyperparameters such as the learning rate, dropout rates, and the number of layers. I can't increase the batch size beyond 16; otherwise I get a resource-exhausted error, even though I am using an RTX 3060 Ti and made sure TensorFlow uses my GPU. I added batch normalization because val_accuracy wouldn't increase otherwise, for some reason I don't understand; it would be amazing if you could explain that. I also tried other pretrained backbones such as VGG16 and InceptionResNetV2 (preprocessing the input accordingly) and finally EfficientNetV2B3, which gave the best result. (I am also using the early stopping and reduce-LR callbacks shown above.)
Why does accuracy increase almost instantly while the Dice coefficient climbs so slowly? And how can I get the accuracy past 0.95, where it seems to be stuck?
Let me try to answer one part of your question: Why does accuracy increase almost instantly while the Dice coefficient climbs so slowly? I would say this is not a problem or error in your model or implementation. In fact, it is not a "problem" at all, but rather a property of your metrics: the dice_coefficients() and iou() implementations only ever count foreground pixels, never the total number of pixels in the image.
Now assume you have a segmentation mask with only a few foreground pixels relative to the image size; say, the correct segmentation mask is [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]. If your network predicted all zeros ([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), it would already have an accuracy of 0.8 (because 80% of the pixels are correctly assigned). The Dice and IoU values, however, would both be zero, because no foreground pixel has been assigned correctly yet.
The following example demonstrates this with a few more iterations (I used NumPy for calculations, but that shouldn't make a difference):
import matplotlib.pyplot as plt
import numpy as np

def dice_coefficients(y_true, y_pred, smooth=0):
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred)
    return (2 * intersection + smooth) / (union + smooth)

def accuracy(y_true, y_pred, threshold=0.5):
    y_pred = y_pred > threshold
    return (y_true == y_pred).sum() / y_pred.size

y_true = np.asarray([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])

# Round 1: Predict everything as ones
y_pred = np.asarray([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
print("Acc.", a1 := accuracy(y_true, y_pred))
print("Dice", d1 := dice_coefficients(y_true, y_pred))

# Round 2: Predict everything as zeros
y_pred = np.asarray([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
print("Acc.", a2 := accuracy(y_true, y_pred))
print("Dice", d2 := dice_coefficients(y_true, y_pred))

# Round 3: Predict one foreground pixel correctly
y_pred = np.asarray([0, 0, 0, 0, 0, 0, 0, 0, 0, 1])
print("Acc.", a3 := accuracy(y_true, y_pred))
print("Dice", d3 := dice_coefficients(y_true, y_pred))

# Round 4: Predict all foreground pixels correctly
y_pred = np.asarray([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
print("Acc.", a4 := accuracy(y_true, y_pred))
print("Dice", d4 := dice_coefficients(y_true, y_pred))

plt.plot([1, 2, 3, 4], [a1, a2, a3, a4], label="Acc.")
plt.plot([1, 2, 3, 4], [d1, d2, d3, d4], label="Dice")
plt.legend()
plt.show()
Bottom line: Try to understand and get a feeling for the metrics that you use. If they don't behave as you expect, it might sometimes not be an issue with your implementations but with your expectations.