I have a distance matrix between 2 sets of points (sets A & B). Each A point is associated with an equal number of B points (if possible).
I want to minimize the standard deviation of the average distances between each A point and its associated B points.
My objective function is as follows:

import numpy as np
from scipy.optimize import minimize

def objective(x, distances):
    decision_matrix = x.reshape(distances.shape)
    avg_distances = np.sum(distances * decision_matrix, axis=1) / np.sum(decision_matrix, axis=1)
    std_distances = np.std(avg_distances)
    return std_distances
where x is a flattened decision matrix of 0s and 1s: 0 means no association between an A point and a B point, 1 means association.
I would like to implement the following constraints: each entry of x is binary (0 or 1), each B point is associated with exactly one A point, and each A point is associated with the same number of B points (n_b / n_a when it divides evenly).
My purpose is to obtain the A-B pairing that minimizes the objective function under the stated constraints.
Could someone give me hints on how to accomplish that goal with the scipy package? I am having some difficulties implementing the constraints. Thanks.
Here is a partial solution assuming that you are open to adjusting the objective function such that the problem can be expressed as a binary integer LP.
For instance, if you wanted to minimize the sum of distances between the corresponding A-B points:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize, linalg
rng = np.random.default_rng(35982456879456)
n_a = 3
n_b = 9
n_a2b = n_b // n_a  # number of B points per A point
# A_... is an equality constraint matrix
# B_... is the equality constraint RHS
# a has to do with points in group "a"
# b has to do with points in group "b"
a = rng.random((2, n_a, 1))
b = rng.random((2, 1, n_b))
distances = np.sum((a - b)**2, axis=0).ravel()
# This linear system represents the constraint that each
# point in group "a" is assigned to exactly `n_a2b` points
# in group "b"
ones = np.ones(n_b)
A_a_to_n = linalg.block_diag(*[ones] * n_a)
B_a_to_n = np.full(n_a, n_a2b)
# This linear system represents the constraint that each
# point in group "b" is assigned to exactly one point
# in group "a"
ones = np.ones((n_a, 1))
A_b_to_1 = linalg.block_diag(*([ones]*n_b)).reshape(n_b, -1)
B_b_to_1 = np.ones(n_b)
# For the full linear system
A = np.vstack((A_a_to_n, A_b_to_1))
B = np.concatenate((B_a_to_n, B_b_to_1))
# Solve
integrality = np.ones(distances.size)
res = optimize.milp(distances, constraints=(A, B, B),
                    integrality=integrality)
# Plot
colors = ['r', 'g', 'b']
for i_a, i_b in zip(*np.where(res.x.reshape(n_a, -1).round().astype(bool))):
    print(i_a, i_b)
    plt.plot(a[0, i_a, 0], a[1, i_a, 0], colors[i_a] + 'o')
    plt.plot(b[0, 0, i_b], b[1, 0, i_b], colors[i_a] + '.')
Plot of example clusters. The large circles are the "A" points, the small dots are the "B" points, and the colors indicate association between them.
It sounds like you are more interested in minimizing the variance (or standard deviation) of the within-cluster average distances. If you are willing to accept a linear measure of spread (instead of the variance/standard deviation), I think you can fit it into the framework above.
For instance, suppose you want to minimize the discrepancy between the biggest within-cluster average distance and the smallest within-cluster average distance.
You can create three new (floating-point) decision variables and constrain each of them to be equal to one of the within-cluster average distances. Let's call them d1, d2, and d3. (These constraints are linear because each cluster contains a fixed number of B points, so the denominator of each average is a constant.)
Create six new decision variables that represent the pairwise differences between these. For instance: d12 = d2 - d1, d13 = d3 - d1, and d23 = d3 - d2. Also create d21, d31, and d32, the negatives of these.
Create a final new decision variable d_max that represents the maximum of all of these (d12, d21, d13, d31, d23, d32). You can ensure that it is the maximum by constraining it to be greater than or equal to each of them individually.
Set your objective function equal to d_max.
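The steps above can be sketched with scipy.optimize.milp roughly as follows. This is a sketch, not a definitive implementation: the names n_a2b, d, and d_max are my own, and instead of creating the six difference variables explicitly, each pairwise difference is folded directly into an inequality constraint of the form d_i - d_k - d_max <= 0.

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(35982456879456)
n_a, n_b = 3, 9
n_a2b = n_b // n_a  # B points per A point

a = rng.random((2, n_a, 1))
b = rng.random((2, 1, n_b))
dist = np.sum((a - b)**2, axis=0)  # shape (n_a, n_b)

n_x = n_a * n_b         # binary assignment variables x[i, j]
n_vars = n_x + n_a + 1  # plus d_1..d_{n_a} and d_max

# Objective: minimize d_max (the last variable)
c = np.zeros(n_vars)
c[-1] = 1

rows, lbs, ubs = [], [], []

# Each A point is assigned exactly n_a2b B points
for i in range(n_a):
    row = np.zeros(n_vars)
    row[i*n_b:(i + 1)*n_b] = 1
    rows.append(row); lbs.append(n_a2b); ubs.append(n_a2b)

# Each B point is assigned to exactly one A point
for j in range(n_b):
    row = np.zeros(n_vars)
    for i in range(n_a):
        row[i*n_b + j] = 1
    rows.append(row); lbs.append(1); ubs.append(1)

# d_i equals the within-cluster average distance; linear because
# cluster size is fixed: sum_j dist[i, j]*x[i, j] - n_a2b*d_i = 0
for i in range(n_a):
    row = np.zeros(n_vars)
    row[i*n_b:(i + 1)*n_b] = dist[i]
    row[n_x + i] = -n_a2b
    rows.append(row); lbs.append(0); ubs.append(0)

# d_max >= d_i - d_k for every ordered pair (i, k), i != k
for i in range(n_a):
    for k in range(n_a):
        if i != k:
            row = np.zeros(n_vars)
            row[n_x + i] = 1
            row[n_x + k] = -1
            row[-1] = -1
            rows.append(row); lbs.append(-np.inf); ubs.append(0)

constraints = optimize.LinearConstraint(np.array(rows), lbs, ubs)
# Only the assignment variables are integer; d_i and d_max are continuous
integrality = np.concatenate([np.ones(n_x), np.zeros(n_a + 1)])
upper = np.concatenate([np.ones(n_x), np.full(n_a + 1, np.inf)])
res = optimize.milp(c, constraints=constraints, integrality=integrality,
                    bounds=optimize.Bounds(0, upper))

x_sol = res.x[:n_x].reshape(n_a, n_b).round().astype(int)
avg = (dist * x_sol).sum(axis=1) / n_a2b
print(res.success, avg, avg.max() - avg.min())
```

At the optimum, d_max is pushed down onto the largest pairwise difference, so minimizing it minimizes the range of the within-cluster average distances, which is the linear surrogate for the standard deviation discussed above.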