pythonalgorithmperformancepython-itertoolscombinatorics

Variations with repetition of r integers in {0...k} that sum to u


Given a set of integers x = {0...k}, I need to find the most efficient algorithm to generate all variations with repetition of r integers x that sum to u.

I thought to

from itertools import product
import numpy as np
import pandas as pd

k = 10
r = 5
u = 12
x = np.arange(0, k+1)

prod = product(x, repeat=r)
df = pd.DataFrame(list(prod))
print(f"VR{x.size, r} = {df.index.size} = (k+1)^r = {(k+1)**r}\n")

df = df[df.sum(axis=1)==u]
print(df)
VR(11, 5) = 161051 = (k+1)^r = 161051

         0  1  2  3   4
32       0  0  0  2  10
42       0  0  0  3   9
52       0  0  0  4   8
62       0  0  0  5   7
72       0  0  0  6   6
...     .. .. .. ..  ..
146652  10  0  2  0   0
147742  10  1  0  0   1
147752  10  1  0  1   0
147862  10  1  1  0   0
149072  10  2  0  0   0

[1795 rows x 5 columns]

But this is highly inefficient because the total number of variations with repetition is VR(k+1, r) = (k+1)^r and generates a giant df.

Any suggestion?


Solution

  • To avoid creating all productions, you could bail out as soon as the sum can no longer be attained by adding remaining values to complete an r-tuple.

    Here is a plain Python function that does that. I didn't compare its execution time with other solutions, but with the example parameters it returns a list within 0.003 seconds when I ran it. Of course, printing those entries will take time too:

    def solve(greatest, count, target):
        collected = [0] * count
        if count <= 0 or count == 1 and target > greatest:
            return
        
        def recur(count, target):
            if count == 1:  # base case
                collected[-1] = target
                yield tuple(collected)
                return
            i = count - 1
            # Clip the range to values that still allow solutions
            for val in range(max(0, target - i * greatest), min(target, greatest) + 1):
                collected[-count] = val
                yield from recur(i, target - val)
                
        yield from recur(count, target)
    
    # example run
    from time import time
    start = time()
    result = list(solve(10, 5, 12))
    print("done", time() - start)
    print(result)