I know this problem may have been asked by many. My challenge is that my list is long in length (say 100+) and my target may also be big. The recursive algorithm takes 10 minutes to complete. I need any one subset that makes it closest to the target-sum
My Code
combination_dict = {}
def subset_sum(numbers, target, partial=[]):
s = sum(partial)
if s == target:
print("sum({})={}".format(partial, target))
return
if s >= target:
partial.pop()
combination_dict[sum(partial)] = partial
return
for i in range(len(numbers)):
n = numbers[i]
remaining = numbers[i + 1:]
subset_sum(remaining, target, partial + [n])
if __name__ == "__main__":
lst = [100, 100, 100, 100, 100, 100, 100] #my actual list is really long len=100(max)
tgt = 434
subset_sum(lst, tgt)
closest_sum = max([i for i in combination_dict.keys()])
print("Closest possible sum={} and the combination is ={}".format(closest_sum, combination_dict[closest_sum]))
If my list length is > 20, it is stuck. I understand this has a time complexity of O(2^n)- but can someone please help me get this optimized to get my result in less than 30 seconds. My list can have floats. And so be the target-sum.
You can use integer linear programming where each variable is a binary that corresponds to a number and represents whether that number is included in the result. It solves two problems, the first approaching the target value from below and the second from above and then takes the best solution. The following is an example implementation using PuLP:
import numpy as np
from pulp import LpMaximize, LpMinimize, LpProblem, lpSum, LpVariable
rng = np.random.default_rng(seed=0)
numbers = rng.integers(1, 10**6, size=10**4)
target = int(numbers.mean() * rng.normal(loc=1, scale=0.1))
indices = range(len(numbers))
variables = LpVariable.dicts("Choice", indices, cat="Binary")
leq_prob = LpProblem('leq', LpMaximize)
geq_prob = LpProblem('geq', LpMinimize)
leq_prob += lpSum([variables[i]*numbers[i] for i in indices]) <= target
leq_prob += lpSum([variables[i]*numbers[i] for i in indices])
geq_prob += lpSum([variables[i]*numbers[i] for i in indices]) >= target
geq_prob += lpSum([variables[i]*numbers[i] for i in indices])
leq_prob.solve()
leq_choices = [numbers[i] for i in indices if variables[i].value() == 1]
if sum(leq_choices) == target:
solution = leq_choices
else:
geq_prob.solve()
geq_choices = [numbers[i] for i in indices if variables[i].value() == 1]
solution = (
leq_choices
if target-sum(leq_choices) <= sum(geq_choices)-target
else geq_choices
)
print(f'Solution: {solution}')
print(f'Sum: {sum(solution)} (target: {target})')