I've trained a model and identified a 'threshold' that I'd like to deploy it at, but I'm having trouble understanding how the threshold relates to the score.
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import precision_recall_curve

X = labeled_data[features].reset_index(drop=True)
Y = np.array(labeled_data['fraud'].reset_index(drop=True))
# (train/test etc., settle on an acceptable model)
grad_des = SGDClassifier(alpha=alpha_optimum, l1_ratio=l1_optimum, loss='log')  # log loss, so predict_proba is available
grad_des.fit(X, Y)
score_Y = grad_des.predict_proba(X)  # per-class probabilities, shape (n_samples, 2)
precision, recall, thresholds = precision_recall_curve(Y, score_Y[:, 1])
Alright, so now I plot precision and recall vs. threshold and decide I want my threshold to be 0.4. But what is this threshold, exactly?
My model coefficients, which I understand are 'scoring' events by computing coefficients['x']*event_values['x'], sum to 29, yet the threshold is between 0 and 1. How am I to understand the translation from the threshold to what is, I guess, a raw score? Would an event with a 1 for all features (all are binary) have a calculated score of 29, since that is the sum of all the coefficients?
Do I need to compute this 'raw' score metric for all events and then plot that against precision instead of threshold?
Edit and Update:
So my question hinged on a lack of understanding of the logistic function, as Mikhail Korobov points out below. Regardless of the 'raw score', the logistic function forces the value into the [0, 1] range. To 'unwrap' that value back into the 'raw score' I was looking for, I can do scipy.special.logit(0.8) - grad_des.intercept_, and this returns the 'score' of the row.
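As a quick sanity check, here is a sketch of that round trip, assuming the fitted grad_des above and taking the first row of X as the event:

from scipy.special import logit

prob = grad_des.predict_proba(X)[0, 1]            # probability for the first event
raw_score = logit(prob) - grad_des.intercept_[0]  # 'unwrap' back to the coefficient score
# this should match summing the coefficients of the active (==1) features directly:
assert np.isclose(raw_score, X.iloc[0] @ grad_des.coef_[0])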
Probabilities are not just coefficients['x']*event_values['x']: a logistic function is applied to these raw scores to map them into the [0, 1] range. The predict_proba method returns these probabilities.
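For instance, a minimal check (assuming the fitted grad_des from the question) that the probabilities are exactly the logistic function applied to the raw scores:

from scipy.special import expit  # the logistic (sigmoid) function

raw_scores = grad_des.decision_function(X)       # coef @ x + intercept, unbounded
probabilities = grad_des.predict_proba(X)[:, 1]  # squashed into [0, 1]
assert np.allclose(expit(raw_scores), probabilities)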
To get a concrete yes/no prediction, one has to choose a probability threshold. An obvious and sane choice is 0.5: if the probability is greater than 0.5, predict "yep", otherwise predict "nope". This is what the .predict() method does.
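In other words, .predict() is just a 0.5 cut on the probabilities; a sketch, assuming the fraud labels are 0/1:

default_preds = grad_des.predict(X)  # uses the implicit 0.5 threshold
manual_preds = (grad_des.predict_proba(X)[:, 1] > 0.5).astype(int)
assert np.array_equal(default_preds, manual_preds)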
precision_recall_curve tries different probability thresholds and computes precision and recall for each. If, based on the precision and recall scores, you believe some other threshold is better for your application, you can use it instead of 0.5, e.g. bool_prediction = score_Y[:,1] > threshold.
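For example, here is a short sketch of picking the smallest threshold that achieves some target precision (the 0.9 target is purely hypothetical) and applying it:

precision, recall, thresholds = precision_recall_curve(Y, score_Y[:, 1])
# precision and recall have one more entry than thresholds; drop the last point to align
meets_target = precision[:-1] >= 0.9  # hypothetical precision target
threshold = thresholds[meets_target][0] if meets_target.any() else 0.5
bool_prediction = score_Y[:, 1] > threshold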