I want to use sklearn.metrics.recall_score to evaluate recall for a binary image segmentation task.
Doing this works:
threshold = 0.5
predicted_mask = (probability_map > threshold).astype(int)
actual_mask = actual_mask.astype(int)
result = recall_score(actual_mask.flatten(), predicted_mask.flatten())
This however:
result = recall_score(actual_mask, predicted_mask)
gives me the error:
ValueError: Target is multilabel-indicator but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted', 'samples'].
actual_mask and predicted_mask are numpy arrays of integers 0 and 1.
It is not obvious to me from the documentation that this should not work:
sklearn.metrics.precision_score(y_true, y_pred, *, labels=None, pos_label=1, average='binary', sample_weight=None, zero_division='warn')
y_true: 1d array-like, or label indicator array / sparse matrix
y_pred: 1d array-like, or label indicator array / sparse matrix
What am I missing? And more importantly: Is the recall value that I obtain using the flatten operation correct?
I know it's late, but I will still answer since the documentation is not exactly clear. precision_score and recall_score do not treat 2D arrays as images. They interpret a 2D array as a multilabel indicator matrix of shape (n_samples, n_labels): each row is one sample and each column is one label. Let's look at an example from the documentation.
>>> y_true = [[0, 0, 0], [1, 1, 1], [0, 1, 1]]
>>> y_pred = [[0, 0, 0], [1, 1, 1], [1, 1, 0]]
>>> recall_score(y_true, y_pred, average=None)
array([1. , 1. , 0.5])
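recall_score does not treat y_true or y_pred as a single 3x3 matrix. Instead, it treats each row as one sample and each column as one label, so y_pred[:, 0] holds the predictions for label 0, y_pred[:, 1] the predictions for label 1, and so on; the three values returned above are the per-label recalls. With more than one label there is no single positive class to report on, which is why average='binary' does not work.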
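To check which axis is used as the label axis, the documentation example can be compared with a recall computed by hand for each column. This is only a minimal sketch; the helper name column_recall is made up for illustration and is not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import recall_score

y_true = np.array([[0, 0, 0], [1, 1, 1], [0, 1, 1]])
y_pred = np.array([[0, 0, 0], [1, 1, 1], [1, 1, 0]])


def column_recall(true_col, pred_col):
    # Hypothetical helper: recall = TP / (TP + FN) for one binary column.
    tp = np.sum((true_col == 1) & (pred_col == 1))
    fn = np.sum((true_col == 1) & (pred_col == 0))
    return tp / (tp + fn)


# Per-label recall as computed by scikit-learn on the 2D input.
per_label = recall_score(y_true, y_pred, average=None)

# The same quantity computed column by column.
by_hand = np.array(
    [column_recall(y_true[:, j], y_pred[:, j]) for j in range(y_true.shape[1])]
)

print(per_label)
print(by_hand)
```

Both arrays should agree, which suggests that the columns, not the rows, play the role of labels.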
Is the recall-value that I obtain using the flatten operation correct?
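Assuming your predicted_mask and actual_mask have the shape (Height, Width), as in a typical image segmentation task, then yes, it's correct: flattening turns every pixel into one binary sample, so the result is the recall over all pixels.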
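As a sanity check, the flattened call can be compared with a pixel-wise TP / (TP + FN) count. The two 4x4 masks below are made-up data for illustration only; real masks would come from your model and ground truth.

```python
import numpy as np
from sklearn.metrics import recall_score

# Made-up masks, for illustration only.
actual_mask = np.array([
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
])
predicted_mask = np.array([
    [0, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
])

# Recall over all pixels via scikit-learn on the flattened masks.
flat_recall = recall_score(actual_mask.ravel(), predicted_mask.ravel())

# The same value counted directly on the 2D masks.
tp = np.sum((actual_mask == 1) & (predicted_mask == 1))
fn = np.sum((actual_mask == 1) & (predicted_mask == 0))
manual_recall = tp / (tp + fn)

print(flat_recall, manual_recall)
```

Here ravel() is used instead of flatten() only because it avoids a copy when possible; either gives the same result.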