pythonperformancenumpypermutationinteger-partition

Unique permutations of fixed length integer partitions where each element has a maximum


This question is similar to a question I had several months ago: Generating a numpy array with all combinations of numbers that sum to less than a given number. In that question, I wanted to generate all numbers that summed to at most a constant, given that each element had a certain maximum.

This time I want to calculate all permutations that sum up to exactly that constant. This can be regarded as calculating the unique permutations of integer partitions where each element has a certain maximum. The end result should be stored in a numpy array.

Using a generator, a one liner achieves what we want:

import numpy as np
from itertools import product
K = 3 
maxRange = np.array([1,3,2]) 

states = np.array([i for i in product(*(range(i+1) for i in maxRange)) if sum(i)==K]) 

giving

array([[0, 1, 2],
       [0, 2, 1],
       [0, 3, 0],
       [1, 0, 2],
       [1, 1, 1],
       [1, 2, 0]])

I'm having quite slow performance when K=20 and maxRange = [20]*6. The number of permutations is limited with 53130, but it already takes 20 seconds. My gut feeling tells me this should takes much less than a second.

Does anyone have a faster solution available? I'm having trouble to modify the solution to my earlier question to account for this, because I don't know how to cut off the permutations for which it is no longer possible to add up to exactly K.

I don't mind solutions that use the @jit operator from numba... as long as they are faster than what I have now!

Thanks in advance.


Solution

  • I've had to think about this pretty long, but I've managed to modify the solution to Generating a numpy array with all combinations of numbers that sum to less than a given number for this problem:

    For the number of partitions, the idea is to calculate the array feasible_range that specifies how much we need at least in total at a certain stage to still reach max_sum. For example, if we want to reach a total of 3 and max_range[0] == 1, then we need to have at least 2 before starting with the final element. This array follows from a cumulative sum:

    feasible_range = np.maximum(max_sum - np.append(np.array([0]),np.cumsum(max_range)[:-1]),0)
    

    Now we can calculate the number of partitions as before, by setting the elements that can never lead to a feasible partition to 0.

    def number_of_partitions(max_range, max_sum):
        M = max_sum + 1
        N = len(max_range) 
        arr = np.zeros(shape=(M,N), dtype = int)    
        feasible_range = max_sum - np.append(np.array([0]),np.cumsum(max_range)[:-1])
        feasible_range = np.maximum(feasible_range,0)     
    
        arr[:,-1] = np.where(np.arange(M) <= min(max_range[-1], max_sum), 1, 0) 
        arr[:feasible_range[-1],-1] = 0
        for i in range(N-2,-1,-1):
            for j in range(max_range[i]+1):
                arr[j:,i] += arr[:M-j,i+1] 
            arr[:feasible_range[i],i]=0  #Set options that will never add up to max_sum at 0.
        return arr.sum(axis = 0),feasible_range
    

    The partition function also has a similar interpretation as before.

    def partition(max_range, max_sum, out = None, n_part = None,feasible_range=None):
        #Gives all possible partitions of the sets 0,...,max_range[i] that sum up to  max_sum. 
        if out is None:
            max_range = np.asarray(max_range, dtype = int).ravel()
            n_part,feasible_range = number_of_partitions(max_range, max_sum)
            out = np.zeros(shape = (n_part[0], max_range.size), dtype = int)
    
        if(max_range.size == 1):
            out[:] = np.arange(feasible_range[0],min(max_range[0],max_sum) + 1, dtype = int).reshape(-1,1)
            return out
    
        #Copy is needed since otherwise we overwrite some values of P.
        P = partition(max_range[1:], max_sum, out=out[:n_part[1],1:], n_part = n_part[1:],feasible_range=feasible_range[1:]).copy()      
    
    
        S = max_sum - P.sum(axis = 1) #The remaining space in the partition 
        offset, sz  = 0, 0
        for i in range(max_range[0]+1):
            #select indices for which there is remaining space
            #do this only if adding i brings us within the feasible_range.
            ind, = np.where(np.logical_and(S-i>=0,S-i <= max_sum-feasible_range[0])) 
            offset, sz = offset + sz, ind.size
            out[offset:offset+sz, 0] = i
            out[offset:offset+sz, 1:] = P[ind]
        return out
    

    For K=20 and maxRange = [20]*6, partition(maxRange,K) takes 13ms compared to 18.5 seconds first.

    I'm not really fond of the part where I have to copy; that can probably be avoided by reversing the ordering. The speed is good enough now, though.