Tags: python, driverless-ai

What/Where are the final threshold(s) used to score new predictions during deployment in DAI?


I have a binary classification problem, ran it through DAI, and was given test AUCs. Where do I find the probability threshold that is used during deployment to score new rows of data?

An example would be a threshold of 0.50; i.e., a predicted score > 0.50 gets a 1 and a score < 0.50 gets a 0 (or vice versa) during a decision. I need the exact threshold beyond the 4-digit rounded number that is shown in the GUI as you move across the AUC curve. In the pictures below I've matched the thresholds and still can't get the same confusion matrices with the identical threshold. Notice the false positives are very slightly different.

[Image: DAI confusion matrix at the 0.0301 threshold]

[Image: sklearn confusion matrix at the 0.0301 threshold]

UPDATED ANSWER: Download the "Experiment Summary" after a completed experiment in DAI. Inside the zip file you'll find an ensemble_roc_test.json that gives thresholds to 10 digits of precision.
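To pull those full-precision thresholds programmatically, you can read the JSON straight out of the downloaded zip. Here is a minimal sketch using only the standard library; the zip file name below is a placeholder for your actual download, and the internal structure of ensemble_roc_test.json isn't shown in this post, so this just dumps it for inspection:

```python
import json
import zipfile

# Path to the Experiment Summary zip downloaded from DAI; the file
# name here is a placeholder -- use your actual download.
summary_zip = "h2oai_experiment_summary.zip"

# Read ensemble_roc_test.json directly out of the zip archive.
with zipfile.ZipFile(summary_zip) as zf:
    with zf.open("ensemble_roc_test.json") as f:
        roc = json.load(f)

# The internal layout of the JSON isn't documented here, so print the
# first part of it to see where the threshold values live.
print(json.dumps(roc, indent=2)[:2000])
```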


Solution

  • Driverless AI predictions return scores, so no threshold is applied to the prediction; it is up to the user to decide how to use the score (see the sketch after this list for applying one yourself). You can see the recommended threshold for optimizing different metrics on the experiment page's ROC curve. For example, in the screenshot below the mouse is hovering over the 'Best F1' circle to get a summary that includes the threshold:

    [Screenshot: ROC curve with the 'Best F1' point hovered, showing a summary that includes the threshold]
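Once you have a score column back from DAI, applying a threshold yourself is a one-liner. Here is a minimal sketch with numpy and scikit-learn; the labels and scores are made-up stand-ins for your test data, and 0.0301 is the threshold from the question:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up stand-ins for the true labels and the positive-class scores
# that Driverless AI returns for your test rows.
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
scores = np.array([0.020, 0.100, 0.550, 0.031, 0.810, 0.470, 0.290, 0.900])

# Use the full-precision threshold from ensemble_roc_test.json rather
# than the 4-digit value shown in the GUI.
threshold = 0.0301

# Scores >= threshold are labeled 1; scores below it are labeled 0.
y_pred = (scores >= threshold).astype(int)

print(confusion_matrix(y_true, y_pred))
```

Note that rounding the threshold to 4 digits, or a `>=` versus `>` convention at the boundary, can flip a handful of borderline rows, which would explain the small false-positive difference between the DAI and sklearn confusion matrices in the question.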