I have a list of 500 elements each with 2 values x (float value from 0-Infinity) and a y (float value from 0-1). I need to find the cheapest (lowest sum(x)) combination of 10 of these elements that achieve an average y value less than a given n. I think this is some kind of knapsack problem but I'm unsure of how to fit solutions online to my exact problem and fear that it may be too complicated.
All I have tried so far is trying a solution from ChatGPT but I believe that its complexity is too high. I'm looking for a solution that hopefully can be completed with a high number of elements <1000.
def find_cheapest_combination(elements, k, target_avg):
def backtrack(start, combo):
if len(combo) == k:
# Calculate the average float value for the current combination
avg_float = sum(element[1] for element in combo) / k
if avg_float <= target_avg:
nonlocal best_combo, lowest_avg
best_combo = combo[:]
lowest_avg = avg_float
return
for i in range(start, len(elements)):
if len(elements) - i < k - len(combo):
# Not enough elements remaining to form a valid combination
break
combo.append(elements[i])
backtrack(i + 1, combo)
combo.pop()
elements.sort(key=lambda x: x[0]) # Sort by price in ascending order
best_combo = []
lowest_avg = float('inf')
backtrack(0, [])
return best_combo
elements = [(x, y) for x, y in items]
k = 10
target_avg = 0.07
cheapest_combination = find_cheapest_combination(elements, k, target_avg)
print(cheapest_combination)
You can try this:
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds
def f(x, y, n, m):
x, y = np.array(x), np.array(y)
A = np.vstack([np.ones_like(x), y])
ub = [n, n * m]
lb = [n, -np.inf]
res = milp(x,
constraints=LinearConstraint(A, ub=ub, lb=lb),
integrality=np.ones_like(x),
bounds=Bounds(lb=0, ub=1)
)
if res.success:
choices = np.nonzero(res.x)[0]
return {"choices": choices,
"x_sum": res.fun,
"y_mean": y[choices].mean()}
This function takes as its arguments arrays x
and y
of x- and y-values, respectively. n
is the number of elements to be selected, and m
is the value that bounds the mean of selected y-values from above. If a solution to the problem exists, the function returns a dictionary of the form
{“choices”: indices of selected elements,
“x_sum”: the sum of x-values of selected elements,
“y_mean”: the mean of y-values of selected elements}
It a solution cannot be found (i.e. no selection of elements gives small enough mean of y-values), the function returns None
.
For example:
f(x=[1, 2, 3, 4], y=[1, 10, 2, 1], n=2, m=2)
gives:
{'choices': array([0, 2]), 'x_sum': 4.0, 'y_mean': 1.5}
Explanation. Let x = [a_1,…,a_k]
, y=[b_1,…,b_k]
. The problem of selecting some elements of x
with the smallest sum is equivalent to finding a minimum of the function a_1*x_1 + … + a_k*x_k
where the value of each variable x_i
can be either 1 (which means that a_i
is selected) or 0. The variables x_i
must satisfy a couple constraints. First, x_1 + … + x_k = n
, since we want to select exactly n
elements. Second, b_1*x_1 + … + b_k*x_k <= n*m
. This condition means that the average of selected b_i
’s does not exceed m
. In this form, the problem becomes an integer linear program. The code uses scipy.optimize.milp
to compute its solution.