When testing a binary classifier I get an accuracy of 83% (with the threshold set to 0.5). However, when I work out the ROC curve and AUC, I get an AUC value of 1, which I believe is incorrect: if the AUC were really 1, shouldn't the accuracy be 100%?
I have the following data (first 5 points shown as an example):
True labels:
true_list = [1. 1. 1. 1. 1.]
Thresholded predictions:
pred_list = [0. 0. 1. 1. 1.]
Raw sigmoid-activated output:
pred_list_raw = [0.23929074 0.34403923 0.61575216 0.72756131 0.69771088]
The code used to generate the data from the model is:
import torch

output_raw = Net(images)            # raw logits from the network
output = torch.sigmoid(output_raw)  # probabilities in [0, 1]
pred_tag = torch.round(output)      # hard 0/1 predictions at the 0.5 threshold

# collect plain NumPy values rather than tensors, so sklearn can consume the lists
pred_list.extend(pred_tag.squeeze().cpu().numpy())
pred_list_raw.extend(output.squeeze().cpu().numpy())
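For reference, the 83% accuracy is just the fraction of thresholded predictions that match the true labels; a minimal check with sklearn's accuracy_score (a sketch, since my exact accuracy code is not shown above):

from sklearn import metrics
import numpy as np

# fraction of 0.5-thresholded predictions matching the true labels (reported as 83% above)
acc = metrics.accuracy_score(np.asarray(true_list), np.asarray(pred_list))
print(acc)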
The ROC and AUC values are calculated with sklearn's metrics module using the following code:
from sklearn import metrics

fpr, tpr, _ = metrics.roc_curve(true_list, pred_list_raw)
auc = metrics.roc_auc_score(true_list, pred_list_raw)
Why are the accuracy and AUC values inconsistent with each other?
The full output datasets are below:
True labels:
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.]
Thresholded predictions:
[0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 0. 0.
0. 1. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 0. 0. 1. 0. 1.
1. 1. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 1. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 1. 0. 1. 1. 0. 0. 1.
1. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0.]
Raw sigmoid-activated output:
[0.80731616 0.81613746 0.63055099 0.33343941 0.33650158 0.26123103
0.43023067 0.75951926 0.85506684 0.83753035 0.77401593 0.93755849
0.93669454 0.78037769 0.48196705 0.51107402 0.39020711 0.27603346
0.27125724 0.79841139 0.96470754 0.97575928 0.9520636 0.98686909
0.99421722 0.9814615 0.72548573 0.70952273 0.58558095 0.75391273
0.98747451 0.99592133 0.99348673 0.99636301 0.99966048 0.99927722
0.93512388 0.87108612 0.76195734 0.45464458 0.44979708 0.3798077
0.46179509 0.51260215 0.42887223 0.77441987 0.99320274 0.99899955
0.99885804 0.99888995 0.99996059 0.99992547 0.9893837 0.94771828
0.90216806 0.63214702 0.70693445 0.62402257 0.72597019 0.72850208
0.48136757 0.34587109 0.48912585 0.53809234 0.49571105 0.52119752
0.66452994 0.65721321 0.46201256 0.32531447 0.33560987 0.34733458
0.54707416 0.66652035 0.67211284 0.64667205 0.77259018 0.81139687
0.72141833 0.47555719 0.41060125 0.40072988 0.30013099 0.81335717
0.87926414 0.83410184 0.89994201 0.96761651 0.94806845 0.67343196
0.60651364 0.57781878 0.76253183 0.95988439 0.98643017 0.98208946
0.99291688 0.99853936 0.99570023 0.84561008 0.82329192 0.70751861
0.40768749 0.38326785 0.42332725 0.41978272 0.95580183 0.99577685
0.99589898 0.99182735 0.99963567 0.99949705 0.98161394 0.93502385
0.89946262 0.69163107 0.23587978 0.24273368 0.27152508 0.27938265
0.25957949 0.28954122 0.30340485 0.28367177 0.25412464 0.24931795
0.40110995 0.38143945 0.49271891 0.50662051 0.33616859 0.52061933
0.47093576 0.63511254 0.68877464 0.47989569 0.37947267 0.69217007
0.69413745 0.85119693 0.83831514 0.46003498 0.19595725 0.18322578
0.13161417 0.17004058 0.155272 0.1832541 0.13801674 0.17109324
0.16617284 0.16502231 0.16629275 0.17945219 0.18769069 0.19091081
0.19954858 0.17923033 0.18590597 0.17878488 0.19183244 0.15146982
0.16887138 0.17444615 0.18757529 0.15070279 0.19910241 0.15885526
0.18926985 0.19083846 0.1563857 0.19467271 0.19159289 0.21147205
0.12797629 0.17709421 0.19563617 0.1951601 0.12606692 0.20411101
0.17489395 0.179219 0.17770813 0.13888956 0.17316737 0.18813291
0.20011829 0.18280909 0.12445015 0.17259067 0.20987834 0.17725589
0.18583644 0.16768099 0.17385706 0.19005385 0.16527923 0.17264359
0.13370521 0.17153564 0.15309515 0.19745554 0.17381944 0.16110312
0.19662598 0.15733718 0.19763281 0.20617132 0.19089484 0.19732752
0.1870988 0.16508744 0.13579399 0.13825028 0.19650695 0.2028151
0.20796896 0.16130049 0.18487175 0.15657099 0.14414533 0.19415208
0.14158873 0.20252466 0.19986491 0.1761861 0.12490113 0.14082219
0.19325744 0.17937965 0.17161699 0.20017089 0.1953598 0.19116857
0.18963095 0.18015937 0.17033672 0.12995853 0.17816802 0.20537938
0.17656901 0.17246887 0.19970285 0.18360697 0.14851416 0.14957287
0.17847791 0.19361662 0.12858931 0.15501569 0.16153916 0.18401976
0.19767486 0.18276181 0.18216812 0.18459979 0.17810379 0.20029616
0.16008779 0.18842728 0.19535601 0.16842141 0.18356466 0.19130296
0.19826594 0.16606207 0.17985446 0.18720729 0.16947971 0.19309211
0.17904012 0.18225684 0.12697826 0.20334946 0.20230229 0.19601187
0.18372611 0.13250111 0.1508019 0.1991842 0.16360692 0.18059866
0.17001721 0.16149873 0.16174695 0.19311724 0.17267033 0.14393295
0.19088417 0.18659356]
It could be correct. Accuracy depends on the threshold, but AUC doesn't: AUC measures how well the raw scores separate the two classes across all possible thresholds. An AUC of 1 means there exists some threshold at which the classes are separated perfectly, but evidently that threshold is not 0.5.
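You can check this directly: roc_curve already returns the candidate thresholds, and the one maximising TPR - FPR (Youden's J statistic) is the best separator. A minimal sketch reusing the variable names from the question; if the AUC really is 1, the re-thresholded accuracy comes out as 1.0:

from sklearn import metrics
import numpy as np

scores = np.asarray(pred_list_raw, dtype=float)
labels = np.asarray(true_list, dtype=float)

fpr, tpr, thresholds = metrics.roc_curve(labels, scores)

# Youden's J statistic: pick the threshold maximising TPR - FPR
best = np.argmax(tpr - fpr)
best_threshold = thresholds[best]

# re-threshold the raw scores at that point and recompute accuracy;
# an AUC of 1 implies this prints an accuracy of 1.0
pred_at_best = (scores >= best_threshold).astype(float)
print(best_threshold, metrics.accuracy_score(labels, pred_at_best))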