pythonoptimizationmemorycartesian

How to reduce memory consumption when performing a cartesian product?


Given a 2d matrix such as [[a,b,c],[d,e,f]...]], I want to perform a Cartesian product of the matrix so I can determine all the possible combinations.

For this particular constraint, when I am using a 2d matrix with 12 different subsets, it uses more than the 16 megabytes of allotted memory I have. There are three values in each subset, so I would have 312 different combinations.

The cartesian product function that I am using is:

def cartesian_iterative(pools):
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    return result

I would like to know how I could reduce memory consumption without using any external libraries. An example 2d array I would working with is [['G', 'H', 'I'], ['M', 'N', 'O'], ['D', 'E', 'F'], ['D', 'E', 'F'], ['P', 'R', 'S'], ['D', 'E', 'F'], ['M', 'N', 'O'], ['D', 'E', 'F'], ['D', 'E', 'F'], ['M', 'N', 'O'], ['A', 'B', 'C'], ['D', 'E', 'F']]

EDIT: For reference, a link to the problem statement can be found here Problem Statement. Here is the link to the file of possible names Acceptable Names.

The final code:

with open('namenum.in','r') as fin:
    num = str(fin.readline().strip()) #the number being used to determine all combinations

numCount = []
for i in range(len(num)):
    numCount.append(dicti[num[i]]) #creates a 2d array where each number in the initial 'num' has a group of three letters


def cartesian_iterative(pools): #returns the product of a 2d array
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    return result

pos = set() #set of possible names
if len(num) == 12: #only uses more than the allocated memory when the num is 12 digits long.
    '''
    This optimization allows the product to only calculate 2 * 3^6 values, instead of 3**12. This saves a lot of memory
    '''
    rights = cartesian_iterative(numCount[6:])
    for left in cartesian_iterative(numCount[:6]):
        for right in rights:
            a = ''.join(left+right)
            if a in names:
                pos.add(a) #adding name to set
else: #if len(num) < 12, you do not need any other optimizations and can just return normal product 
    for i in cartesian_iterative(numCount):
        a = ''.join(i)
        if a in names:
            pos.add(a)
pos = sorted(pos)


with open('namenum.out','w') as fout: #outputting all possible names
    if len(pos) > 0:
        for i in pos:
            fout.write(i)
            fout.write('\n')
    else:
        fout.write('NONE\n')

Solution

  • You could use that function on left and right half separately. Then you'd only have 2×36 combinations instead of 312. And they're half as long, somewhat even canceling that factor 2.

    for left in cartesian_iterative(pools[:6]):
        for right in cartesian_iterative(pools[6:]):
            print(left + right)
    

    Output:

    ['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'A', 'D']
    ['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'A', 'E']
    ['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'A', 'F']
    ['G', 'M', 'D', 'D', 'P', 'D', 'M', 'D', 'D', 'M', 'B', 'D']
    ...
    

    To be faster, compute the right combinations only once:

    rights = cartesian_iterative(pools[6:])
    for left in cartesian_iterative(pools[:6]):
        for right in rights:
            print(left + right)