I'm trying to optimize a binary problem for a website of mine.
The data contains roughly 75 items and each of the items has a weight (between 50 and 1000) and a price attached to it. Here's a data snippet:
{"weighting":{
"0":500,
"1":50,
"2":50,
"3":50,
"4":250,
"5":1000
},
"price":{
"0":4,
"1":78,
"2":75,
"3":170,
"4":5,
"5":4
}
}
I calculate the expected value of the whole data set with
exp_val = (w1 p1 + w2 p2 + ... + wn pn) / sum(w1 + w2 + ... wn)
with
sum(w1 + w2 + ... wn) = 23665 (considering all items)
So far so good, but now comes the tricky part. Not all items are desired, that is, they are worth less and / or have a high weighting which dilutes the pool I can draw from.
By "blocking" or removing up to 3 items I can draw from the remaining items only, and by doing so maximizing the expedted value function. The question is: Which items to remove? As the prices vary over time, I have to check the items to remove on a regular basis.
I have started by simply removing the items with the highest weights and the smallest price, but I'm sure this only represents a local optimum and there would be a more optimal strategy.
After checking some websites, it seems that Mixed-integer linear programming (MILP) or in particular BILP (binary ...) can solve my issue, and I found https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.milp.html but wasn't able to make it work, as I'm stuck translating my problem into code. Can anyone help me out?
Here's my solution:
1. Make the results trackable
In cpmodel from ortools you can do so by introducing a class with
from ortools.sat.python import cp_model as cp
class VarArraySolutionPrinter(cp.CpSolverSolutionCallback):
"""Print intermediate solutions."""
def __init__(self, variables):
cp.CpSolverSolutionCallback.__init__(self)
self.__variables = variables
self.__solution_count = 0
def on_solution_callback(self):
self.__solution_count += 1
for v in self.__variables:
print('%s=%i' % (v, self.Value(v)), end=' ')
print()
def solution_count(self):
return self.__solution_count
and than structure your code like this
# ############################################
# content of your objective function goes here
# ############################################
# solve the model
solver = cp.CpSolver()
status = solver.Solve(model)
# https://developers.google.com/optimization/cp/cp_solver?hl=de
# debugging
solution_printer = VarArraySolutionPrinter([objective, xi_wi_pi, xi_wi])
solver.parameters.enumerate_all_solutions = True
# inspect the solution
objective_function_value = solver.ObjectiveValue()
solution_info = solver.SolutionInfo()
status = solver.Solve(model, solution_printer)
note that in
solution_printer = VarArraySolutionPrinter([objective, xi_wi_pi, xi_wi])
you want to add the variable names, i.e. using the third argument (str) when creating a variable:
xi_wi = model.NewIntVar(0, 100, "xi_wi")
2. Create the model
With that out of the way I found that I didn't had to follow Jonis advice because cpmodel from ortool can handel binary variables itself. This code works for me:
from ortools.sat.python import cp_model as cp
# w_default = weighting
# chaos = price
data = {"w_default":{
"0":500,
"1":50,
"2":50,
"3":50,
"4":250,
"5":1000
},
"chaos":{
"0":4,
"1":78,
"2":75,
"3":170,
"4":5,
"5":4
}
}
model = cp.CpModel()
num_items = len(data["w_default"])
# create boolean coefficients
dv_select_items = {i: model.NewBoolVar("item_" + str(i)) for i in data["w_default"]}
# constraint: remove exactly 3 items
model.Add(sum(dv_select_items[i] for i in dv_select_items) == num_items - 3)
# ##### numerator equation #####
constant = 1000
# x_i * w_i * p_i // sum of weightings * prices = 200.000 -> UB 500.000 to give some space?
xi_wi_pi = model.NewIntVar(50000 * constant, 500000 * constant, "xi_wi_pi")
model.Add(xi_wi_pi == sum(
dv_select_items[i] * data["w_default"][i] * data["chaos"][i] * constant for i in dv_select_items))
##### denominator equation #####
# xi_wi // sum of weightings 23665: 20665 with 3 blocked
lb_weights = int(tot_weight * 0.75)
xi_wi = model.NewIntVar(lb_weights, tot_weight, "xi_wi")
model.Add(xi_wi == sum(dv_select_items[i] * data["w_default"][i] for i in dv_select_items))
objective = model.NewIntVar(0, 100 * constant, "objective")
model.AddDivisionEquality(objective, xi_wi_pi, xi_wi)
# set target
model.Maximize(objective)
# solve the model
solver = cp.CpSolver()
status = solver.Solve(model)
# https://developers.google.com/optimization/cp/cp_solver?hl=de
# debugging
solution_printer = VarArraySolutionPrinter([objective, xi_wi_pi, xi_wi])
solver.parameters.enumerate_all_solutions = True
# inspect the solution
objective_function_value = solver.ObjectiveValue()
solution_info = solver.SolutionInfo()
status = solver.Solve(model, solution_printer)
3. Scale the nominator because AddDivisionEquality round to integers
You might be wondering why I have added a constant in the code (it doesn't work without it). This is because
model.AddDivisionEquality(objective, xi_wi_pi, xi_wi)
always rounds the result to an integer value and because the results are in the range of 8.something the objective function always returned 8. However, if you multiply the numerator with 1000, 8.3456 now becomes 8345 and 8.4334 becomes 8433 and thus can be evaluated correctly.
Hope this helps anyone with a similar problem. Also, many thanks to Joni and Barthendu for pointing me in the right direction!