optimization, constraints, linear-programming, z3py, cvxpy

Constraint-based optimization of the decision threshold of a prediction model


I am working on optimizing the decision threshold of a trained binary classification model (targets 0 and 1), i.e. the point at which the model's probability prediction is interpreted as positive. Example:

prediction is 0.7, default threshold is 0.5 -> positive prediction because 0.7 > 0.5

I know that I can use techniques such as looking at the false-positive rate to determine a good threshold value. However, I want to do it using constraints.

Specifically, I want to optimize the following equation and corresponding constraints:

$$\min_{\tau} \quad c_{fp} \sum_{i} I\big(f(x_i) \ge \tau \,\wedge\, y_i = 0\big) \;+\; c_{fn} \sum_{i} I\big(f(x_i) < \tau \,\wedge\, y_i = 1\big)$$

$$\text{s.t.} \quad c_{fp} \sum_{i} I\big(f(x_i) \ge \tau \,\wedge\, y_i = 0\big) \le \mathit{threshold}_{fp}, \qquad c_{fn} \sum_{i} I\big(f(x_i) < \tau \,\wedge\, y_i = 1\big) \le \mathit{threshold}_{fn}$$

Here, c_fp is the cost per false positive, I is the indicator function, and tau is the decision threshold. threshold_fp is the total allowed cost for false positive predictions; likewise for the false negatives (fn). f(x_i) is my model's prediction for input sample x_i, and y_i is the corresponding ground truth.

I am struggling to express this as an optimization problem in cvxpy.

The main challenge for me is: how can I express the total count of false positives (and likewise of false negatives)? This essentially boils down to the question of how to encode a false positive. Here's what I came up with so far:

import numpy as np
import cvxpy as cp

predicted_probabilities = np.array([0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9])
y_test = np.array([0, 0, 1, 0, 1, 0, 1])

c_fp = 100
c_fn = 1

false_positives = np.sum((predicted_probabilities >= 0.5) & (y_test == 0))
false_negatives = np.sum((predicted_probabilities < 0.5) & (y_test == 1))

current_cost = c_fp * false_positives + c_fn * false_negatives
print(f"Current cost: {current_cost}")

threshold = cp.Variable()

false_positives = cp.sum(cp.multiply((predicted_probabilities >= threshold), (y_test == 0)))

false_negatives = cp.sum(cp.multiply((predicted_probabilities < threshold), (y_test == 1)))

cost = c_fp * false_positives + c_fn * false_negatives

problem = cp.Problem(cp.Minimize(cost))

problem.solve()

print(f"Optimal threshold: {threshold.value}")

But I keep running into this error: TypeError: float() argument must be a string or a real number, not 'Inequality'

Searching SO, I found a question about this TypeError in CVXPY: float() argument must be a string or a number, not 'Inequality'

That question, however, deals with a different problem.

After some experimenting with this, I would appreciate input on how I can model this as an optimization problem. I am aware of the rules for expressing AND and OR as constraints (https://cs.stackexchange.com/questions/12102/express-boolean-logic-operations-in-zero-one-integer-linear-programming-ilp/43884#43884), but I can't get my head around how to connect them to my optimization case.
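
For reference, the kind of encoding I mean expresses, for binary variables x, y, z:

z = x AND y:   z <= x,   z <= y,   z >= x + y - 1
z = x OR y:    z >= x,   z >= y,   z <= x + y

What I don't see is how to combine such a Boolean encoding with the continuous threshold variable tau that appears in the comparisons f(x_i) >= tau.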

Your support here is appreciated.

==============

Edit: Thanks to Alex Kemper's answer, I was able to solve this optimization problem using z3py:

#  https://ericpony.github.io/z3py-tutorial/guide-examples.htm
from z3 import z3
import numpy as np

predicted_probabilities = np.array([0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9])
y_test = np.array([0, 0, 1, 0, 1, 0, 1])

c_fp = 100
c_fn = 10

false_positives = np.sum((predicted_probabilities >= 0.5) & (y_test == 0))
false_negatives = np.sum((predicted_probabilities < 0.5) & (y_test == 1))

current_cost = c_fp * false_positives + c_fn * false_negatives
print(f"Current cost: {current_cost}")
print(f"False positives: {false_positives}")
print(f"False negatives: {false_negatives}")

n = len(y_test)
iRange = range(n)

threshold = z3.Real("threshold")

false_positives = z3.Sum([(predicted_probabilities[i] >= threshold) for i in iRange if (y_test[i] == 0)])
false_negatives = z3.Sum([(predicted_probabilities[i] < threshold) for i in iRange if y_test[i] == 1])
cost = c_fp * false_positives + c_fn * false_negatives

opt = z3.Optimize()
opt.minimize(cost)
if opt.check() == z3.sat:
    m = opt.model()
    
    t = m[threshold].as_decimal(10)
    # as_decimal() appends '?' when the value is truncated, so strip it before converting
    t = float(t.rstrip("?"))
    print(f"Optimal threshold: {t}")

    false_positives = np.sum((predicted_probabilities >= t) & (y_test == 0))
    false_negatives = np.sum((predicted_probabilities < t) & (y_test == 1)) 
    cost = c_fp * false_positives + c_fn * false_negatives
    print(f"Optimal cost: {cost}")
    print(f"False positives: {false_positives}")
    print(f"False negatives: {false_negatives}")


Solution

  • Switching variables map inequalities to Booleans

    My solution uses a Boolean array above[] as an auxiliary variable that indicates which probabilities are above the threshold. This makes it possible to count the false positives and the false negatives.

    import numpy as np
    import cvxpy as cp
    
    predicted_probabilities = np.array([0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9])
    y_test = np.array([0, 0, 1, 0, 1, 0, 1])
    
    c_fp = 100
    c_fn = 1
    
    false_positives = np.sum((predicted_probabilities >= 0.5) & (y_test == 0))
    false_negatives = np.sum((predicted_probabilities < 0.5) & (y_test == 1))
    
    current_cost = c_fp * false_positives + c_fn * false_negatives
    print(f"Current cost: {current_cost}")
    
    threshold = cp.Variable()
    
    n = len(y_test)
    iRange = range(n)
    
    #  store the differences between probabilities and threshold as array
    comparisons = [ (predicted_probabilities[i] - threshold) for i in iRange ]
    #  see https://stackoverflow.com/a/79083660/1911064
    M = 10000 # big M - a number that is larger than any possible value of comparisons[]
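    # (since the probabilities and any sensible threshold lie in [0, 1], a much smaller
    #  value such as M = 2 would also do and is numerically friendlier for the MIP solver)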
    above = cp.Variable(n, boolean=True)
    
    # above[i] can only be zero if comparisons[i] is non-positive
    constraints = [cp.pos(comparisons[i]) <= M * above[i] for i in iRange]
    # above[i] can only be one if comparisons[i] is non-negative
    constraints += [cp.neg(comparisons[i]) <= M * (1-above[i]) for i in iRange] 
    
    false_positives = cp.sum([above[i] for i in iRange if (y_test[i] == 0)])
    false_negatives = cp.sum([1 - above[i] for i in iRange if (y_test[i] == 1)])
    cost = c_fp * false_positives + c_fn * false_negatives
    
    objective = cp.Minimize(cost)                         
    problem = cp.Problem(objective, constraints)
    
    problem.solve()
    
    print(f"Optimal threshold: {threshold.value}")
    
    
    

    I wonder if the constraints can be written more elegantly by using element-wise functions.
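
    For example, something along these lines should be equivalent (an untested sketch that
    reuses n, M, threshold, c_fp, c_fn and the data arrays defined above):

    neg_idx = np.flatnonzero(y_test == 0)   # indices of the actual negatives
    pos_idx = np.flatnonzero(y_test == 1)   # indices of the actual positives

    comparisons = predicted_probabilities - threshold   # vector expression of length n
    above = cp.Variable(n, boolean=True)

    constraints = [
        cp.pos(comparisons) <= M * above,         # comparisons[i] > 0 forces above[i] = 1
        cp.neg(comparisons) <= M * (1 - above),   # comparisons[i] < 0 forces above[i] = 0
    ]

    false_positives = cp.sum(above[neg_idx])       # predicted positive, actually negative
    false_negatives = cp.sum(1 - above[pos_idx])   # predicted negative, actually positive
    cost = c_fp * false_positives + c_fn * false_negatives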


    z3py as alternative to cvxpy

    Using z3py, the following short model would be sufficient:

    #  https://ericpony.github.io/z3py-tutorial/guide-examples.htm
    from z3 import z3
    
    predicted_probabilities = [0.3, 0.4, 0.7, 0.2, 0.8, 0.8, 0.9]
    y_test =                  [  0,   0,   1,   0,   1,   0,   1]
    
    c_fp = 100
    c_fn = 1
    
    n = len(y_test)
    iRange = range(n)
    
    threshold = z3.Real("threshold")
    
    false_positives = z3.Sum([(predicted_probabilities[i] > threshold) for i in iRange if (y_test[i] == 0)])
    false_negatives = z3.Sum([(predicted_probabilities[i] <= threshold) for i in iRange if y_test[i] == 1])
    cost = c_fp * false_positives + c_fn * false_negatives
    
    opt = z3.Optimize()
    opt.minimize(cost)
    if opt.check() == z3.sat:
        m = opt.model()
        print(f"Optimal threshold: {m[threshold]}")