python machine-learning scikit-learn auc precision-recall

How to get the area under the precision-recall curve


I am printing the classification report. The code I am using prints the AUC value for the ROC curve but not for the precision-recall curve, for which it only plots a graph. How can I get the AUC value for the precision-recall curve?

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import (average_precision_score, classification_report,
                             precision_recall_curve, roc_auc_score)

df_test = pd.read_csv("D:/a.csv")
df_testPred = pd.read_csv("D:/b.csv")
# Ground-truth labels, continuous anomaly scores, and hard 0/1 predictions
y_true1 = df_test["anomaly"].values[:-1]
y_score1 = df_testPred["anomaly_scores"].values[:-1]
y_pred1 = df_testPred["anomaly"].values[:-1].astype(int)
ap1 = average_precision_score(y_true1, y_score1)
auc1 = roc_auc_score(y_true1, y_score1)
print(f"ap: {ap1}")
print(f"AUC: {auc1}")

print(classification_report(y_true1, y_pred1))

precision1, recall1, thresholds1 = precision_recall_curve(y_true1, y_score1)
#plt.plot([0, 1], [0, 1],'r--')
plt.plot(recall1, precision1)

Solution

  • Since you have already computed precision1 and recall1, you can pass them to the scikit-learn function auc (docs), with recall as the x values and precision as the y values:

    from sklearn.metrics import auc
    auc_score = auc(recall1, precision1)
    

    See ROC Curves and Precision-Recall Curves for Imbalanced Classification (although, in my experience, the precision-recall AUC is less widely used than the more usual ROC AUC).
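
For anyone who wants to try this without the CSV files, here is a minimal self-contained sketch. The labels and scores are synthetic stand-ins for y_true1 and y_score1 (they are an assumption for illustration, not the asker's data). It also prints average_precision_score next to the trapezoidal value, since the two summarize the same curve in slightly different ways and will usually not match exactly.

```python
import numpy as np
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

# Synthetic data (hypothetical): binary labels and scores where
# positives tend to score higher than negatives.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
y_score = 0.5 * y_true + rng.random(200)

# Points of the precision-recall curve, one per score threshold.
precision, recall, _ = precision_recall_curve(y_true, y_score)

# Trapezoidal area under the curve: recall is x, precision is y.
pr_auc = auc(recall, precision)

# Average precision: a step-wise summary of the same curve (no interpolation).
ap = average_precision_score(y_true, y_score)

print(f"PR AUC (trapezoidal): {pr_auc:.3f}")
print(f"Average precision:    {ap:.3f}")
```

The argument order matters here: auc takes the x coordinates first, so recall goes before precision.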