algorithmperformancemathor-toolsconstraint-programming

Fast code to determine if any two subsets of columns have the same sum


For a given n and m I iterate over all n by m partial circulant matrices with entries that are either 0 or 1. I want to find if there is a matrix such that there are no two subsets of the columns that give the same sum. Here when we add columns we just do it elementwise. My current code uses constraint programming via ortools. However it is not as fast I would like. For n = 7 and m = 12 it takes over 3 minutes and for n = 10, m = 18 it doesn't terminate even though there are only 2^18 = 262144 different matrices to consider. Here is my code.

from scipy.linalg import circulant
import numpy as np
import itertools
from ortools.constraint_solver import pywrapcp as cs

n = 7
m = 12

def isdetecting(matrix):
    X = np.array([solver.IntVar(values) for i in range(matrix.shape[1])])
    X1 = X.tolist()
    for row in matrix:
        x = X[row].tolist()
        solver.Add(solver.Sum(x) == 0)
    db = solver.Phase(X1, solver.INT_VAR_DEFAULT, solver.INT_VALUE_DEFAULT)
    solver.NewSearch(db)
    count = 0
    while (solver.NextSolution() and count < 2):
        solution = [x.Value() for x in X1]
        count += 1
    solver.EndSearch()
    if (count < 2):
        return True

values = [-1,0,1]
solver = cs.Solver("scip")

for row in itertools.product([0,1],repeat = m):
    M = np.array(circulant(row)[0:n], dtype=bool)
    if isdetecting(M):
        print M.astype(int)
        break

Can this problem be solved fast enough so that n = 10, m = 18 can be solved?


Solution

  • One problem is that you are declaring the "solver" variable globally and it seems to confuse or-tools to reuse it many times. When moving it inside "isdetecting", then the (7,12) problem is solved much faster, in about 7 seconds (compared to 2:51 minutes for the original model). I haven't checked it for the larger problem, though.

    Also, it might be good idea to test different labelings (instead of solver.INT_VAR_DEFAULT and solver.INT_VALUE_DEFAULT), though binary value tend to be not very sensitive to different labelings. See the code for another labeling.

    def isdetecting(matrix):
       solver = cs.Solver("scip") # <----
       X = np.array([solver.IntVar(values) for i in range(matrix.shape[1])])
       X1 = X.tolist()
       for row in matrix:
           x = X[row].tolist()
           solver.Add(solver.Sum(x) == 0)
       # db = solver.Phase(X1, solver.INT_VAR_DEFAULT, solver.INT_VALUE_DEFAULT)
       db = solver.Phase(X1, solver.CHOOSE_FIRST_UNBOUND, solver.ASSIGN_CENTER_VALUE)    
       solver.NewSearch(db)
       count = 0
       while (solver.NextSolution() and count < 2):
           solution = [x.Value() for x in X1]
           count += 1
       solver.EndSearch()
       if (count < 2):
           print "FOUND"
           return True
    

    Edit: Here are constraints to remove the all-0 solutions as mentioned in the comments. What I know of, it require a separate list. It now takes a little longer (10.4s vs 7s).

    X1Abs = [solver.IntVar(values, 'X1Abs[%i]' % i) for i in range(X1_len)]
    for i in range(X1_len):
        solver.Add(X1Abs[i] == abs(X1[i])) 
    solver.Add(solver.Sum(X1Abs) > 0)