I am working on optimizing the decision threshold of a trained decision model for a binary case (targets 0 and 1). I want to optimize the decision threshold of the model, i.e. the point at which the model's probability prediction is interpreted as positive. Example:
prediction is 0.7, default threshold is 0.5 -> positive prediction because 0.7 > 0.5
I know that I can use techniques such as looking at the false-positive ratio to determine a good threshold value. However, I want to do it using constraints.
Specifically, I want to optimize the following equation and corresponding constraints:
Here, c_fp
are the costs for a false positive count, I
is an indicator function, and tau
is the decision threshold. threshold_fp
is the total allowed costs for false positive predictions. Likewise for the false negatives (fn)
. f(x_i)
is my model's f
prediction, and y_i
is the ground truth for input sample x_i
.
I am struggling to express this as an optimization problem in cvxpy
.
The main challenge for me is: how can I express the total count of false positives
(likewise for the false negatives
)? This essentially boils down to how to encode a false positive? Here's what I came up with so far:
import numpy as np
import cvxpy as cp
predicted_probabilities = np.array([0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9])
y_test = np.array([0, 0, 1, 0, 1, 0, 1])
c_fp = 100
c_fn = 1
false_positives = np.sum((predicted_probabilities >= 0.5) & (y_test == 0))
false_negatives = np.sum((predicted_probabilities < 0.5) & (y_test == 1))
current_cost = c_fp * false_positives + c_fn * false_negatives
print(f"Current cost: {current_cost}")
threshold = cp.Variable()
false_positives = cp.sum(cp.multiply((predicted_probabilities >= threshold), (y_test == 0)))
false_negatives = cp.sum(cp.multiply((predicted_probabilities < threshold), (y_test == 1)))
cost = c_fp * false_positives + c_fn * false_negatives
problem = cp.Problem(cp.Minimize(cost))
problem.solve()
print(f"Optimal threshold: {threshold.value}")
But I am always coming back to this error: TypeError: float() argument must be a string or a real number, not 'Inequality'
Searching SO, I found a question about this TypeError in CVXPY: float() argument must be a string or a number, not 'Inequality'
This however deals with a different problem.
After doing some experimenting with this, I appreciate input on how I can model this as an optimization problem. I am aware of the rules of expressing AND, OR as constraints (https://cs.stackexchange.com/questions/12102/express-boolean-logic-operations-in-zero-one-integer-linear-programming-ilp/43884#43884), but I can't get my head around how to connect that to use in my optimization case.
Your support here is appreciated.
==============
Edit: Thanks to Alex Kemper's answer, I was able to solve this optimization problem using z3py
:
# https://ericpony.github.io/z3py-tutorial/guide-examples.htm
from copy import copy
from z3 import z3
import numpy as np
predicted_probabilities = np.array([0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9])
y_test = np.array([0, 0, 1, 0, 1, 0, 1])
c_fp = 100
c_fn = 10
false_positives = np.sum((predicted_probabilities >= 0.5) & (y_test == 0))
false_negatives = np.sum((predicted_probabilities < 0.5) & (y_test == 1))
current_cost = c_fp * false_positives + c_fn * false_negatives
print(f"Current cost: {current_cost}")
print(f"False positives: {false_positives}")
print(f"False negatives: {false_negatives}")
n = len(y_test)
iRange = range(n)
threshold = z3.Real("threshold")
false_positives = z3.Sum([(predicted_probabilities[i] >= threshold) for i in iRange if (y_test[i] == 0)])
false_negatives = z3.Sum([(predicted_probabilities[i] < threshold) for i in iRange if y_test[i] == 1])
cost = c_fp * false_positives + c_fn * false_negatives
opt = z3.Optimize()
opt.minimize(cost)
if opt.check() == z3.sat:
m = opt.model()
t = m[threshold].as_decimal(10)
t = float(t)
print(f"Optimal threshold: {t}")
false_positives = np.sum((predicted_probabilities >= t) & (y_test == 0))
false_negatives = np.sum((predicted_probabilities < t) & (y_test == 1))
cost = c_fp * false_positives + c_fn * false_negatives
print(f"Optimal cost: {cost}")
print(f"False positives: {false_positives}")
print(f"False negatives: {false_negatives}")
My solution uses a Boolean array above[]
as auxiliary variable to tell, which probabilities are above the threshold. This allows to count the false positives
and false negatives
.
import numpy as np
import cvxpy as cp
predicted_probabilities = np.array([0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9])
y_test = np.array([0, 0, 1, 0, 1, 0, 1])
c_fp = 100
c_fn = 1
false_positives = np.sum((predicted_probabilities >= 0.5) & (y_test == 0))
false_negatives = np.sum((predicted_probabilities < 0.5) & (y_test == 1))
current_cost = c_fp * false_positives + c_fn * false_negatives
print(f"Current cost: {current_cost}")
threshold = cp.Variable()
n = len(y_test)
iRange = range(n)
# store the differences between probabilities and threshold as array
comparisons = [ (predicted_probabilities[i] - threshold) for i in iRange ]
# see https://stackoverflow.com/a/79083660/1911064
M = 10000 # big M - a number that is larger than any possible value of comparisons[]
above = cp.Variable(n, boolean=True)
# above[i] can only be zero if comparisons[i] is non-positive
constraints = [cp.pos(comparisons[i]) <= M * above[i] for i in iRange]
# above[i] can only be one if comparisons[i] is non_negative
constraints += [cp.neg(comparisons[i]) <= M * (1-above[i]) for i in iRange]
false_positives = cp.sum([above[i] for i in iRange if (y_test[i] == 0)])
false_negatives = cp.sum([1 - above[i] for i in iRange if (y_test[i] == 1)])
cost = c_fp * false_positives + c_fn * false_negatives
objective = cp.Minimize(cost)
problem = cp.Problem(objective, constraints)
problem.solve()
print(f"Optimal threshold: {threshold.value}")
I wonder if the constraints can be written more elegantly by using element-wise functions.
Using z3py, the following short model would be sufficient:
# https://ericpony.github.io/z3py-tutorial/guide-examples.htm
from z3 import z3
predicted_probabilities = [0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9]
y_test = [ 0, 0, 1, 0, 1, 0, 1]
c_fp = 100
c_fn = 1
n = len(y_test)
iRange = range(n)
threshold = z3.Real("threshold")
false_positives = z3.Sum([(predicted_probabilities[i] > threshold) for i in iRange if (y_test[i] == 0)])
false_negatives = z3.Sum([(predicted_probabilities[i] <= threshold) for i in iRange if y_test[i] == 1])
cost = c_fp * false_positives + c_fn * false_negatives
opt = z3.Optimize()
opt.minimize(cost)
if opt.check() == z3.sat:
m = opt.model()
print(f"Optimal threshold: {m[threshold]}")