python algorithm math permutation sublist

Finding Distinct Sublists with Target Sums

I am currently working on a task that involves identifying distinct sublists from a given list such that each sublist adds up to one of the specified target numbers.

Below is the Python code I've written to address this problem. The primary approach in this recursive function involves iteratively removing elements from the list and assigning them to a sublist corresponding to one of the target numbers or discarding them. The function terminates either when a solution is found and outputs the result or when it's unable to complete due to an empty list or overshooting in one of the sublists.

def subset_sums(numbers, target_array, total_partial=None, partial_sum=None):
    # If it's the first function call, initialize partial_sum and enumerate numbers to detect duplicates
    if partial_sum is None:
        numbers = [(v, i) for i, v in enumerate(numbers)]
        total_partial = [[] for _ in range(len(target_array))]
        partial_sum = np.zeros(len(target_array))

    # If all sublists have the correct sum, yield the resulting sublists
    if (target_array == partial_sum).all():
        yield total_partial
        return

    # If we don't reach a correct result and have no numbers left, stop the function
    elif not numbers:
        return

    # Get the next number and the remaining list
    n_with_index = numbers[0]
    n = n_with_index[0]
    remaining = numbers[1:]

    # Case 1: Skip the new number and continue with the rest of the numbers
    yield from subset_sums(remaining, target_array, total_partial, partial_sum)

    # Case 2: Use the new number for each possible list
    for j in range(len(target_array)):
        # If using the new number would overshoot in that list, stop
        if (partial_sum[j] + n) > target_array[j]:
            return
        # Otherwise, use the new number and continue with the rest of the numbers
        else:
            next_total_partial = total_partial
            next_total_partial[j] = next_total_partial[j] + [n_with_index]
            next_partial_sum = partial_sum
            next_partial_sum[j] = next_partial_sum[j] + n
            yield from subset_sums(remaining, target_array, next_total_partial, next_partial_sum)

However, I've encountered a persistent flaw in the code that I can't seem to resolve. The problem lies in the fact that the same list element gets appended to different sublists, and the algorithm fails to exclude list elements as expected. I've thoroughly reviewed the code, but I can't pinpoint why this issue persists.

The following snippet shows an example instance:

In [1]: list(subset_sums2([1,3,1,3], np.array([3,5])))
Out[1]: []

However, I expect the output:

Out[1]: [
    [[(3, 1)], [(1, 0), (1, 2), (3, 3)]], # target 3 is the 3 at index 1; target 5 is the sum of all other numbers 
    [[(3, 3)], [(1, 0), (1, 2), (3, 1)]]] # target 3 is the 3 at index 3; target 5 is the sum of all other numbers

Note that the output is (value, index) pairs. Here we have two ways of getting the target numbers of 3 & 5: They're identical except for which given 3 is used to achieve the 3 target vs the 5 target.

I would greatly appreciate any assistance that could help me identify and rectify the problem in my implementation. Thank you in advance for your help :)

Solution

In this section:

    for j in range(len(target_array)):
        # If using the new number would overshoot in that list, stop
        if (partial_sum[j] + n) > target_array[j]:
            return
        # Otherwise, use the new number and continue with the rest of the numbers
        else:
            next_total_partial = total_partial
            next_total_partial[j] = next_total_partial[j] + [n_with_index]
            next_partial_sum = partial_sum
            next_partial_sum[j] = next_partial_sum[j] + n
            yield from subset_sums(remaining, target_array, next_total_partial, next_partial_sum)

When your function is called with the example you provided:

list(subset_sums([1,3,1,3], np.array([3,5])))

Even on the second iteration there's already a problem. On the first iteration, next_total_partial is set to total_partial, and then next_total_partial is updated on the next line. But this modifies both total_partial and next_total_partial, as they have the same value.

So, on the second iteration, you think you reset next_total_partial with:

            next_total_partial = total_partial

But in fact, nothing changes - it still has the same object as a value and you now add the same value ((3, 3) in this case) to both next_total_partial and total_partial.

Perhaps you wanted next_total_partial = total_partial.copy()? The same applies to next_partial_sum = partial_sum.copy() of course.

It would be possible to fix your logic to get it to work, but it's not a very efficient approach and it has a few other problems:

the main issue is what I pointed out: you pass around the same objects where you need copies
you have a recursive function, but you also use it for initialisation - this would be better with a secondary outer function, or an inner function
you're brute-forcing a solution when there's more efficient ways to achieve this

An example of a working solution that fixes some of the issues, but has the same approach you chose:

from copy import deepcopy


def find_sums(target_sums, xs):
    def _all_sums(enumerated_xs, sums, grouping):
        if not enumerated_xs:
            yield grouping
            return

        i, v = enumerated_xs[0]
        for j in range(len(sums)):
            if sums[j] + v <= target_sums[j]:
                new_grouping = deepcopy(grouping)
                new_grouping[j] += [(i, v)]
                yield from _all_sums(enumerated_xs[1:], sums[:j] + [sums[j] + v] + sums[j+1:], new_grouping)

    if sum(target_sums) != sum(xs):
        return

    yield from _all_sums(list(enumerate(xs)), [0]*len(target_sums), [[] for _ in range(len(target_sums))])


print(list(find_sums([3, 5], [1, 3, 1, 3])))

Even closer, and perhaps preferable to you:

from copy import deepcopy


def find_sums(target_sums, xs):
    def _all_sums(enumerated_xs, sums, grouping):
        if not enumerated_xs:
            yield grouping
            return

        i, v = enumerated_xs[0]
        for j in range(len(sums)):
            if sums[j] + v <= target_sums[j]:
                new_grouping = deepcopy(grouping)
                new_grouping[j] += [(i, v)]
                new_sums = sums.copy()
                new_sums[j] += v
                yield from _all_sums(enumerated_xs[1:], new_sums, new_grouping)

    if sum(target_sums) != sum(xs):
        return

    yield from _all_sums(list(enumerate(xs)), [0]*len(target_sums), [[] for _ in range(len(target_sums))])


print(list(find_sums([3, 5], [1, 3, 1, 3])))